Advanced Mathematical and Computational Tools in Metrology and Testing XI 9813274298, 9789813274297

This volume contains original, refereed contributions by researchers from institutions and laboratories across the world


ADVANCED MATHEMATICAL AND COMPUTATIONAL TOOLS IN METROLOGY AND TESTING XI


Series on Advances in Mathematics for Applied Sciences – Vol. 89

ADVANCED MATHEMATICAL AND COMPUTATIONAL TOOLS IN METROLOGY AND TESTING XI Editors

A B Forbes National Physical Laboratory, UK

N F Zhang National Institute of Standards and Technology, USA

A Chunovkina Institute for Metrology “D I Mendeleyev”, Russia

S Eichstädt Physikalisch-Technische Bundesanstalt, Germany

F Pavese IMEKO TC21, Italy

World Scientific • NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI

Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data Names: Forbes, Alistair B., editor. | Zhang, Nien Fan, editor. | Chunovkina, Anna G., editor. | Eichstädt, Sascha, editor. | Pavese, Franco, editor. Title: Advanced mathematical and computational tools in metrology and testing XI / edited by Alistair B Forbes, Nien Fan Zhang, Anna Chunovkina, Sascha Eichstädt, Franco Pavese. Description: New Jersey : World Scientific, [2018] | Series: Series on advances in mathematics for applied sciences ; volume 89 | Includes bibliographical references and index. Identifiers: LCCN 2018033726 | ISBN 9789813274297 (hardcover : alk. paper) Subjects: LCSH: Metrology. | Statistics. Classification: LCC QC88 .A382 2018 | DDC 389/.1015195--dc23 LC record available at https://lccn.loc.gov/2018033726

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

Copyright © 2019 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

For any available supplementary material, please visit https://www.worldscientific.com/worldscibooks/10.1142/11100#t=suppl

Printed in Singapore

Foreword This volume contains original, refereed contributions from researchers from institutions and laboratories across the world involved in metrology and testing. They were prompted by presentations made at the eleventh edition of the Advanced Mathematical and Computational Tools in Metrology and Testing conference held at the University of Strathclyde, Glasgow, in August 2017, organised by IMEKO TC21, the National Physical Laboratory, UK, and the University of Strathclyde. The aims of the IMEKO Technical Committee 21 “Mathematical Tools for Measurements” (http://www.imeko.org/index.php/tc21homepage) are: 

• To present and promote reliable and effective mathematical and computational tools in metrology and testing.
• To understand better the modelling, statistical and computational requirements in metrology and testing.
• To provide a forum for metrologists, mathematicians, software and IT engineers that will encourage a more effective synthesis of skills, capabilities and resources.
• To promote collaboration in the context of EU and International Programmes, Projects of EURAMET, EMPIR, EA and of other world Regions, and address requirements relating to the CIPM Mutual Recognition Arrangement.
• To support young researchers in metrology, testing and related fields.
• To address industrial and societal requirements.

The conference series is an important activity to support these aims. The themes in this volume reflect the importance of mathematical, statistical and numerical tools and techniques in metrology and testing and their role in addressing the challenges associated with the Metre Convention relating to the demand for measurement standards of ever increasing accuracy, range and diversity, and the need to demonstrate equivalence between national measurement standards. Torino, May 2018

The Editors


Book Editors A B Forbes (National Physical Laboratory, Teddington, UK), [email protected] N F Zhang (National Institute of Standards and Technology, Gaithersburg, MD, USA), [email protected] A Chunovkina (Institute for Metrology “D I Mendeleyev”, St. Petersburg, Russia), [email protected] S Eichstädt (Physikalisch-Technische Bundesanstalt, Germany), [email protected] F Pavese (formerly Istituto Nazionale di Ricerca Metrologica, Torino Italy), [email protected]

฀

For the first ten Volumes see this Series vol. 16 (1994), vol. 40 (1996), vol. 45 (1997), vol. 53 (2000), vol. 57 (2001), vol. 66 (2004), vol. 72 (2006), vol. 78 (2009), vol. 84 (2012) and vol. 86 (2015)

IMEKO TECHNICAL COMMITTEE 21 “MATHEMATICAL TOOLS FOR MEASUREMENTS”.

Chairperson: F Pavese, formerly Istituto Nazionale di Ricerca Metrologica, Torino, http://www.imeko.org/index.php/tc21-homepage, http://www.imeko.org



THIS VOLUME WAS PROMPTED BY THE AMCTM 2017 CONFERENCE CHAIRPERSONS: F Pavese (IMEKO), N F Zhang (NIST), A B Forbes (NPL) CONFERENCE INTERNATIONAL SCIENTIFIC COMMITTEE European Chapter: Emil Bashkanski (ORT Braude College, Israel), Eric Benoit (Université de Savoie, France), Wolfram Bremser (BAM, Germany), Dolores del Campo (CEM, Spain), Abdérafi Charki (University of Angers, France), Anna Chunovkina (VNIIM, Russia), Pasquale Duponte (IMEKO, Unisannio, Italy), Sascha Eichstädt (PTB, Germany), Eduarda Filipe (IPQ, Portugal), Jean-Rémy Filtz (LNE, France) Nicolas Fischer (LNE, France), Rainer Göb (University of Würzburg, Germany), V A Granovski (Electropribor, Russia), Peter Harris (NPL, UK), Jean-Marc Linares (Université Aix-Marseille, France), David Lloyd (University of Surrey, UK), Giovanni Rossi (University of Genoa, Italy), Alison Ramage (University of Strathclyde, UK), Marco Seabra dos Reis (EQ.UC, Portugal), Alexandr Shestakov (University of the South Urals, Russia), Klaus-Dieter Sommer (PTB, Germany), Stephen Uhlig (Quodata, Germany), Grazia Vicario (POLITO, Italy), Victor Witkovski (Savba, Slovakia), Louise Wright (NPL, UK). International Chapter: Seda Aytekin (UME, Turkey), Ahmad Barari (UOIT, Canada), Ahmet Diril (UME, Turkey), Yuning Duan (NIM, China), Ragu Kacker (NIST, USA), Murat Kalemci (UME, Turkey), Aliye Kartal (UME, Turkey), Vladik Kreinovich (University of Texas, USA), Gregory Kyriazis (INMETRO, Brazil), Ignacio Lira (PUCC, Chile), Leonel Lira-Cortes (CENAM, Mexico), Katsuhiro Shirono (AIST-NMIJ, Japan), Tomomichi Suzuki (ISO TC69, Japan), Rod White (Callagan, New Zealand), Robin Willink (New Zealand).

ORGANISED BY IMEKO TC21 http://www.imeko.org/index.php/tc21-homepage, NPL (UK), University of Strathclyde (UK).

Contents Foreword

v

Analysis of key comparisons with two reference standards: Extended random effects meta-analysis O. Bodnar and C. Elster

1

Confirmation of uncertainties declared by KC participants in the presence of an outlier A. G. Chunovkina and A. Stepanov

9

Quantity in metrology and mathematics: A general relation and the problem V. A. Granovskii

20

Bayesian analysis of an errors-in-variables regression problem I. Lira and D. Grientschnig

38

Triangular Bézier surface: From reconstruction to roughness parameter computation L. Pagani and P. J. Scott

48

On the classification into random and systematic effects F. Pavese

58

Measurement models A. Possolo

70

Metrology and mathematics — Survey on a dual pair K. H. Ruhm

85

Fundaments of measurement for computationally-intensive metrology P. J. Scott

119

Study of gear surface texture using Mallat’s scattering transform W. Sun, S. Chrétien, R. Hornby, P. Cooper, R. Frazer and J. Zhang

128

The evaluation of the uncertainty of measurements from an autocorrelated process N. F. Zhang

138



Dynamic measurement errors correction in sliding mode based on a sensor model M. N. Bizyaev and A. S. Volosnikov

153

The Wiener degradation model with random effects in reliability metrology E. S. Chetvertakova and E. V. Chimitova

162

EIV calibration of gas mixture of ethanol in nitrogen S. Ďuriš, Z. Ďurišová, M. Dovica and G. Wimmer

170

Models and algorithms for multi-fidelity data A. B. Forbes

178

Uncertainty calculation in the calibration of an infusion pump using the comparison method A. Furtado, E. Batista, M. C. Ferreira, I. Godinho and P. Lucas

186

Determination of measurement uncertainty by Monte Carlo simulation D. Heißelmann, M. Franke, K. Rost, K. Wendt, T. Kistner and C. Schwehn

192

A generic incremental test data generator for minimax-type fitting in coordinate metrology D. Hutzschenreuter

203

NLLSMH: MCMC software for nonlinear least-squares regression K. Jagan and A. B. Forbes

211

Reduced error separating method for pitch calibration on gears F. Keller, M. Stein and K. Kniel

220

Mathematical and statistical tools for online NMR spectroscopy in chemical processes S. Kern, S. Guhl, K. Meyer, L. Wander, A. Paul, W. Bremser and M. Maiwald

229

A new mathematical model to localize a multi-target modular probe for large-volume metrology applications D. Maisano and L. Mastrogiacomo

235

Soft sensors to measure somatic sensations and emotions of a humanoid robot U. Maniscalco and I. Infantino

241


Bayesian approach to estimation of impulse-radar signal parameters when applied for monitoring of human movements P. Mazurek and R. Z. Morawski

249

Challenging calculations in practical, traceable contact thermometry J. V. Pearce and R. L. Rusby

257

Wald optimal two-sample test for right-censored data P. Philonenko and S. Postovalov

265

Measurement A. Possolo

273

Sensitivity analysis of a wind measurement filtering technique T. Rieutord and L. Rottner

286

The simulation of Coriolis flowmeter tube movements excited by fluid flow and exterior harmonic force V. A. Romanov and V. P. Beskachko

294

Indirect light intensity distribution measurement using image merging I. L. Sayanca, K. Trampert and C. Neumann

307

Towards smart measurement plan using category ontology modelling Q. Qi, P. J. Scott and X. Jiang

315

Analysis of a regional metrology organization key comparison: Preliminary consistency check of the linking-laboratory data with the CIPM key comparison reference value K. Shirono and M. G. Cox

324

Stationary increment random functions as a basic model for the Allan variance T. N. Siraya

332

Modelling a quality assurance standard for emission monitoring in order to assess overall uncertainty T. O. M. Smith

341

Integrating hyper-parameter uncertainties in a multi-fidelity Bayesian model for the estimation of a probability of failure R. Stroh, J. Bect, S. Demeyer, N. Fischer and E. Vazquez

349


Application of ISO 5725 to evaluate measurement precision of distribution within the lung after intratracheal administration J. Takeshita, J. Ono, T. Suzuki, H. Kano, Y. Oshima, Y. Morimoto, H. Takehara, T. Numano, K. Fujita, N. Shinohara, K. Yamamoto, K. Honda, S. Fukushima and M. Gamo

357

Benchmarking rater agreement: Probabilistic versus deterministic approach A. Vanacore and M. S. Pellegrino

365

Regularisation of central-difference method when applied for differentiation of measurement data in fall detection systems J. Wagner and R. Z. Morawski

375

Polynomial estimation of the measurand parameters for samples from non-Gaussian distributions based on higher order statistics Z. L. Warsza and S. V. Zabolotnii

383

EIV calibration model of thermocouples G. Wimmer, S. Ďuriš, R. Palenčár and V. Witkovský

401

Modeling and evaluating the distribution of the output quantity in measurement models with copula dependent input quantities V. Witkovský, G. Wimmer, Z. Ďurišová, S. Ďuriš, R. Palenčár and J. Palenčár

409

Bayesian estimation of a polynomial calibration function associated to a flow meter C. Yardin, S. Amar, N. Fischer, M. Sancandi and M. Keller

417

Dynamic measurement errors correction adaptive to noises of a sensor E. V. Yurasova and A. S. Volosnikov

427

Author index

439

Keyword index

441

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 1–8)

Analysis of key comparisons with two reference standards: Extended random effects meta-analysis O. Bodnar Physikalisch-Technische Bundesanstalt, Berlin, Germany and Mälardalen University, Västerås, Sweden E-mail: [email protected], [email protected] C. Elster Physikalisch-Technische Bundesanstalt, Berlin, Germany We propose a statistical method for analyzing key comparisons with two transfer standards measured in two petals. The new approach is based on an extension of the established random effects model. A full Bayesian analysis based on the reference prior is developed and analytic expressions for the results are derived. One benefit of the suggested approach is that it provides a comprehensive assessment of the laboratory biases in terms of their posterior distributions. Another advantage is that it can easily be applied in practice. The approach is illustrated for the CCM.M-K7 key comparison data. Keywords: extended random effects model; two transfer standards; reference analysis; CCM.M-K7.

1. Introduction Drawing inferences from data, which themselves are the results of analyses, is known as meta-analysis. Meta-analysis has become an important statistical tool in many applications. Examples comprise the combination of results from clinical trials 1,2 , the determination of fundamental constants 3–5 , or the analysis of interlaboratory comparisons 6 . Key comparisons are interlaboratory comparison measurements carried out in order to establish the equivalence between national metrology institutes 7 . It has been proposed to apply random effects models for the analysis of key comparison data and to use the resulting estimates of the laboratory effects as the degrees of equivalence between the participating laboratories 8,9 . In some key comparisons, however, two transfer standards are circulated among participants in two separate petals, preventing the immediate application of the classical effects model. Usually, the pilot 1


laboratory measures at the beginning and at the end of each petal. Some of the laboratories may participate in both petals, and some only in one of them. The goal of this paper is to propose an analysis for data arising from such a scenario, and to provide the degrees of equivalence. We suggest a statistical model which extends the established random effects model to the situation of interest and propose a Bayesian method for its treatment. The suggested approach is applicable for inconsistent data, and it provides an inference of the laboratories biases. In contrast to existing approaches which suggest to link the measurement results obtained from different petals by constructing the differences between the measurements provided by the participating laboratories of each petal and those of the pilot laboratory obtained in the same petal, the new statistical model allows to combine the measurement data in an appealing way. It also avoids problems which occur when differences are calculated, such as the introduction of correlations 10 . For the proposed Bayesian treatment the Berger&Bernardo reference prior 11 and the corresponding posterior are derived for the model parameters and for the realized random effects. The developed method is used to analyze key comparison data available on the CCM.M-K7 key comparison 10 . 2. Objective Bayesian inferences 2.1. Statistical model Let X = (X_1, ..., X_n)^T and Y = (Y_1, ..., Y_m)^T be the two vectors of measurements obtained in two petals. Without loss of generality, we denote the measurements of the pilot laboratory, which participates in both petals, by X_1 and Y_1, respectively. We assume that X and Y follow an extended random effects model defined by

X = \mu_X 1_n + \lambda_X + \varepsilon_X ,   (1)
Y = \mu_Y 1_m + \tilde\lambda_Y + \varepsilon_Y ,   (2)

where 1_k denotes the k-dimensional vector of ones. It is also assumed that the random effects of the pilot laboratory are the same in both petals. The random effects vectors \lambda_X = (\lambda_{1,X}, ..., \lambda_{n,X})^T and \tilde\lambda_Y = (\lambda_{1,X}, \lambda_Y^T)^T with \lambda_Y = (\lambda_{2,Y}, ..., \lambda_{m,Y})^T defined for both petals are


assumed to be independently distributed with

\lambda_X \mid \sigma \sim N_n(0, \sigma^2 I_n) ,   (3)
\lambda_Y \mid \sigma \sim N_{m-1}(0, \sigma^2 I_{m-1}) ,   (4)

where I_k stands for the k-dimensional identity matrix. Furthermore, the model residuals are assumed to be normally distributed, but not necessarily independent, as given by

\begin{pmatrix} \varepsilon_X \\ \varepsilon_Y \end{pmatrix} \sim N_{n+m}(0, V) \quad \text{with} \quad V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix} .   (5)

Summarizing (1)–(5), we obtain the following extended random effects model expressed as

\begin{pmatrix} X \\ Y \end{pmatrix} = \begin{pmatrix} \mu_X 1_n \\ \mu_Y 1_m \end{pmatrix} + L \begin{pmatrix} \lambda_X \\ \lambda_Y \end{pmatrix} + \varepsilon ,   (6)

where L denotes an (n+m) \times (n+m-1) matrix which transforms (\lambda_X^T, \lambda_Y^T)^T into (\lambda_X^T, \tilde\lambda_Y^T)^T and which is given by

L = \begin{pmatrix} I_n & 0_{n \times (m-1)} \\ e_{1,n}^T & 0_{1 \times (m-1)} \\ 0_{(m-1) \times n} & I_{m-1} \end{pmatrix},

where e_{1,n} denotes the first basis vector of \mathbb{R}^n.
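As a purely illustrative aid (not part of the original paper), the block structure of L is easy to assemble numerically; the minimal Python/NumPy sketch below uses made-up dimensions and illustrative names.

```python
import numpy as np

def build_L(n, m):
    """Assemble the (n+m) x (n+m-1) matrix L of Eq. (6): lambda_X is copied
    unchanged, the pilot laboratory's effect lambda_{1,X} is re-used as the
    first effect of petal Y, and lambda_{2,Y}, ..., lambda_{m,Y} follow."""
    L = np.zeros((n + m, n + m - 1))
    L[:n, :n] = np.eye(n)          # first n rows: lambda_X
    L[n, 0] = 1.0                  # pilot effect repeated in petal Y
    L[n + 1:, n:] = np.eye(m - 1)  # remaining petal-Y effects
    return L

print(build_L(3, 4))               # quick visual check for n = 3, m = 4
```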

2.2. Bayesian inference based on the reference prior We present a Bayesian inference for the parameters of model (6), namely for {\mu_X, \mu_Y, \sigma} as well as for the realized random effects \lambda = (\lambda_X^T, \lambda_Y^T)^T. Since the main interest of the paper is in the laboratory biases, we then consider \lambda as the model parameter of interest and {\mu_X, \mu_Y, \sigma} as nuisance parameters. The derivation of the reference prior for the extended random effects model is similar to the derivation of the reference prior for the random effects model given in 9 . The resulting prior depends on \sigma only and is given by

\pi(\sigma) \propto \sigma \sqrt{ \mathrm{tr}\left[ \left( (I_{n+m-1} + \sigma^2 L^T V^{-1} L)^{-1} L^T V^{-1} L \right)^2 \right] } .   (7)

Utilizing (7), the conditional reference posterior for \lambda is obtained as

\lambda \mid \sigma \sim N_{n+m-1}\left( \mu_{\lambda|\sigma}, V_{\lambda|\sigma} \right)   (8)

with

\mu_{\lambda|\sigma} = \left( L^T R L + \tfrac{1}{\sigma^2} I_{n+m-1} \right)^{-1} L^T R \begin{pmatrix} X \\ Y \end{pmatrix}, \qquad
V_{\lambda|\sigma} = \left( L^T R L + \tfrac{1}{\sigma^2} I_{n+m-1} \right)^{-1},

and

\pi(\sigma \mid X, Y) \propto \frac{\pi(\sigma)}{\sqrt{\det\left(\sigma^2 L^T V^{-1} L + I_{n+m-1}\right)} \, \sqrt{\det\left(K^T (V + \sigma^2 L L^T)^{-1} K\right)}}
\times \exp\left( -\frac{1}{2} \left( \begin{pmatrix} X \\ Y \end{pmatrix}^{T} R \begin{pmatrix} X \\ Y \end{pmatrix} - \mu_{\lambda|\sigma}^T V_{\lambda|\sigma}^{-1} \mu_{\lambda|\sigma} \right) \right)   (9)

where R = V^{-1} - V^{-1} K (K^T V^{-1} K)^{-1} K^T V^{-1} and

K = \begin{pmatrix} 1_n & 0_n \\ 0_m & 1_m \end{pmatrix} .

Using the properties of the multivariate normal distribution, we get the marginal posteriors for each random effect separately as

\lambda_i \mid \sigma \sim N\left( e_i^T \mu_{\lambda|\sigma}, \; e_i^T V_{\lambda|\sigma} e_i \right),

where e_i is the i-th basis vector in \mathbb{R}^{n+m-1} and the marginal posterior of \sigma is given in (9).
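The one-dimensional numerical integration over \sigma mentioned in the Conclusion can be sketched as follows; this is an illustration only, written against the formulas as reconstructed above, with invented data and function names, and in practice the unnormalised weights would be handled on a log scale to avoid underflow.

```python
import numpy as np

def posterior_summaries(x, y, V, sigma_grid):
    """Evaluate the (unnormalised) posterior (9) of sigma on a grid and average
    the conditional moments (8) of the realised effects lambda over it."""
    n, m = len(x), len(y)
    z = np.concatenate([x, y])
    L = np.zeros((n + m, n + m - 1))
    L[:n, :n] = np.eye(n); L[n, 0] = 1.0; L[n + 1:, n:] = np.eye(m - 1)
    K = np.zeros((n + m, 2)); K[:n, 0] = 1.0; K[n:, 1] = 1.0
    Vinv = np.linalg.inv(V)
    R = Vinv - Vinv @ K @ np.linalg.inv(K.T @ Vinv @ K) @ K.T @ Vinv
    M = L.T @ Vinv @ L
    I = np.eye(n + m - 1)
    weights, means, seconds = [], [], []
    for s in sigma_grid:
        A = L.T @ R @ L + I / s**2                       # inverse of V_{lambda|sigma}
        Vc = np.linalg.inv(A)
        mu = Vc @ (L.T @ R @ z)                          # mu_{lambda|sigma}
        B = np.linalg.solve(I + s**2 * M, M)
        prior = s * np.sqrt(np.trace(B @ B))             # reference prior (7)
        det1 = np.linalg.det(s**2 * M + I)
        det2 = np.linalg.det(K.T @ np.linalg.inv(V + s**2 * L @ L.T) @ K)
        w = prior / np.sqrt(det1 * det2) * np.exp(-0.5 * (z @ R @ z - mu @ A @ mu))
        weights.append(w); means.append(mu); seconds.append(Vc + np.outer(mu, mu))
    w = np.array(weights); w /= w.sum()
    mean = sum(wi * mi for wi, mi in zip(w, means))
    second = sum(wi * si for wi, si in zip(w, seconds))
    return mean, second - np.outer(mean, mean)           # posterior mean / covariance

# illustrative call: 2 + 3 laboratories, diagonal V (made-up numbers)
x = np.array([0.10, 0.12]); y = np.array([0.11, 0.09, 0.14])
V = np.diag([0.02, 0.03, 0.02, 0.04, 0.03])**2
lam_mean, lam_cov = posterior_summaries(x, y, V, np.linspace(1e-3, 0.2, 200))
```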

3. Application to CCM.M-K7 key comparison data We applied the Bayesian method of Section 2 to the analysis of the CCM.M-K7 key comparison data where measurements of two transfer standards were performed in two petals. The considered transfer standards are the weights of 5 kg, 100 g, 10 g, 5 g.

Fig. 1 (panels a) 5 kg, b) 100 g, c) 10 g, d) 5 g). Measurement data from CCM.M-K7 with two transfer standards measured in two petals. The dashed line in the figure separates the two petals. The pilot laboratory, KRISS, participated in two petals.

Figure 1 shows the data. The pilot laboratory KRISS performed measurements in both petals, while all other national laboratories participated in one petal only. One of the main outputs of the developed Bayesian approach are the joint posterior and the marginal posteriors of the laboratory biases as

Fig. 2 (panels a) 5 kg, b) 100 g, c) 10 g, d) 5 g). Estimated laboratory biases together with 95% probability symmetric credible intervals for the CCM.M-K7 data with two transfer standards measured in both petals.

presented in Section 2.2. Using the analytical expressions for the marginal posteriors, we computed the estimates for laboratory biases as the corresponding posterior means together with the corresponding probabilistically symmetric 95% credible intervals (see, Figure 2). All of the constructed credible intervals cover zero, which indicates that all participating laboratories measure without biases. In Figure 3, the marginal posterior distributions of the heterogeneity parameter σ are shown, together with their mean, taken as the Bayesian estimate of σ, and a probabilistically symmetric 95% credible interval.
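For a single bias \lambda_i the posterior is a mixture of the conditional normals over the \sigma grid, so a probabilistically symmetric credible interval can be read off the mixture CDF; the short sketch below (illustrative names, SciPy assumed available) shows one way to do this.

```python
import numpy as np
from scipy.stats import norm

def symmetric_credible_interval(w, mu_i, sd_i, prob=0.95):
    """w: normalised grid weights of sigma; mu_i, sd_i: conditional posterior
    mean and standard deviation of one bias lambda_i at each grid point."""
    mean = np.sum(w * mu_i)
    grid = np.linspace(np.min(mu_i - 6 * sd_i), np.max(mu_i + 6 * sd_i), 4000)
    cdf = np.array([np.sum(w * norm.cdf(g, mu_i, sd_i)) for g in grid])
    lo = np.interp((1 - prob) / 2, cdf, grid)   # probabilistically symmetric limits
    hi = np.interp((1 + prob) / 2, cdf, grid)
    return mean, (lo, hi)
```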

Fig. 3 (panels a) 5 kg, b) 100 g, c) 10 g, d) 5 g). Marginal posterior for the heterogeneity parameter σ for the CCM.M-K7 data from Figure 1. The red lines show the posterior mean and the limits of the probabilistically symmetric 95% credible interval.

4. Conclusion The determination of the laboratory biases is one of the most important tasks in key comparisons. Recently, a random effect model has been applied successfully for this purpose 8,9 . We extend that approach to the case of two transfer standards which are measured in two petals. The new approach is based on the Bayesian reference analysis and infers laboratory biases by their joint and marginal posteriors. The proposed method utilizes an analytical solution and requires only a one-dimensional numerical integration. Therefore, the proposed approach can easily be used in practice. References 1. A. J. Sutton and J. Higgins, Recent developments in meta-analysis, Statistics in Medicine 27, 625 (2008).


2. O. Bodnar, A. Link, B. Arendacká, A. Possolo and C. Elster, Bayesian estimation in random effects meta-analysis using a non-informative prior, Statistics in Medicine 36, 378 (2017).
3. P. J. Mohr, B. N. Taylor and D. B. Newell, CODATA recommended values of the fundamental physical constants: 2010, Reviews of Modern Physics 84, 1527 (2012).
4. O. Bodnar, A. Link and C. Elster, Objective Bayesian inference for a generalized marginal random effects model, Bayesian Analysis 11, 25 (2016).
5. O. Bodnar, C. Elster, J. Fischer, A. Possolo and B. Toman, Evaluation of uncertainty in the adjustment of fundamental constants, Metrologia 53, S46 (2016).
6. C. Elster and B. Toman, Analysis of key comparison data: critical assessment of elements of current practice with suggested improvements, Metrologia 50, p. 549 (2013).
7. Bureau International des Poids et Mesures, Mutual Recognition of National Measurement Standards and of Calibration and Measurement Certificates issued by National Metrology Institutes (CIPM, revision 2003).
8. A. L. Rukhin and A. Possolo, Laplace random effects models for interlaboratory studies, Computational Statistics & Data Analysis 55, 1815 (2011).
9. O. Bodnar and C. Elster, Assessment of vague and noninformative priors for Bayesian estimation of the realized random effects in random-effects meta-analysis, AStA Advances in Statistical Analysis (2017, to appear).
10. S. Lee, M. Borys, P. Abbott, L. O. Becerra, A. A. Eltawil, W. Jian, A. Malengo, N. Medina, V. Snegov, C. Wüthrich et al., The final report for CCM.M-K7: key comparison of 5 kg, 100 g, 10 g, 5 g and 500 mg stainless steel mass standards, Metrologia 54, p. 07001 (2017).
11. J. Berger and J. M. Bernardo, On the development of reference priors, in Bayesian Statistics, eds. J. M. Bernardo, J. Berger, A. P. Dawid and A. F. M. Smith (Oxford: University Press, 1992).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 9–19)

Confirmation of uncertainties declared by KC participants in the presence of an outlier A. Chunovkina† and A. Stepanov D.I. Mendeleyev Institute for Metrology (VNIIM), 19, Moskovsky pr., 190005, St. Petersburg, Russian Federation †E-mail: [email protected] The aims of the work are: to characterize quantitatively conventional procedures for the uncertainties confirmation; to describe the influence of an outlier on confirmation of uncertainties reported by the KC participants; to consider a reasonable limitation of the measurement uncertainties ratio.

1. Introduction A lot of key comparisons of national measurement standards have been arranged since Mutual recognition of national measurement standards and of calibration and measurement certificates issued by national metrology institutes (CIPM MRA) was signed in 1999 1 . Key comparison (KC) is a special type of interlaboratory comparison, which, on the one hand, results in establishing degree of equivalence (DoE) of measurement standards and, on the other hand, serves as a tool for assessment of measurement and calibrations capabilities provided by national metrology institutes (NMIs). Such assessment is arranged as a confirmation of measurement uncertainties declared by KC participants where conventional statistical criteria of measurement data consistency are applied. A lot of publications address the issue of studying KC data consistency as well as items concerning evaluation of a key comparison reference value (KCRV) and evaluation of a DoE of measurement standards 2–10 . In this paper conventional procedures for the KC data consistency based on a chi-square test and En scores are considered. The En scores are treated as a normalized deviation from KCRV as well as normalized pair-wise deviation of measurement results reported by participants. The conventional procedure for KC data evaluation consists of three steps and every step implies checking the consistency of the measurement data. At the first step the data consistency is checked using a chi-square test with the aim to validate 9


the choice of a weighted mean as an estimate for key comparison reference value. At the second step the degree of equivalence of measurement standards is evaluated and En scores are used for confirming of measurement uncertainties reported by each KC participant. Finally, at the third step, pair-wise degrees of equivalence of measurement standards are evaluated and corresponding En scores are used to confirm the measurement uncertainties. So there is a hierarchical chain of statistical tests used for checking the data consistency and consequently for confirming the declared uncertainties. In this paper probabilities of passing the above tests are investigated as a quantitative measure of properties of the whole procedure. The influence of an outlier (which is a result of unresolved systematic bias in the context of this paper) on to the confirmation of declared uncertainties is discussed. Finally some practical recommendations for KC arrangement concerning the choice of weights of measurement data are formulated. The paper is arranged as follows. In Sec. 2 the conditional probabilities of data consistency test passing are analyzed. In Sec. 3 an outlier model is introduced and the outlier influence on confirmation of declared uncertainties is analyzed. 2. Evaluation of conditional probabilities in testing KC data consistency 2.1. Conventional procedure for KC data evaluation KC comprises two tasks: evaluation of the KCRV and confirmation the uncertainties reported. MRA does not directly state the task of confirmation of the reported uncertainties, but the results of KC are used for acceptance (or not) of CMC, which are presented as quantities and the associated uncertainties. So finally we need some objective foundation or tests for confirmation of declared uncertainties. These tests are usually based on analysis of deviations of the measurement results from the KCRV. The KCRV is calculated using measurement results (measured values and the associated uncertainties) obtained by participants. Therefore, any measurement result influences on KCRV determination; that results in a correlation between the measurement results and KCRV (in case of a weighted-mean KCRV the correlation coefficient is equal to a square root of the corresponding weight). This issue was discussed in many publications 11–15 . As it is mentioned above, the conventional procedure for the KC data


evaluation consists of three steps:
• Checking the consistency of the data {x_i, u_i}, obtained by KC participants, using a chi-square test:

\chi^2_{obs} = \sum_{i=1}^{n} \frac{(x_i - \bar x_w)^2}{u_i^2} \le \chi^2_{0.95}(n-1),

where \bar x_w is a weighted mean of the data, and u_w is the corresponding uncertainty:

\bar x_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}, \qquad w_i = u_i^{-2}, \qquad u_w = u(\bar x_w) = \left( \sum_{i=1}^{n} w_i \right)^{-1/2}.

If the test is passed then the KCRV x_{ref} is established equal to \bar x_w with the associated uncertainty u_w: x_{ref} = \bar x_w, u_{ref} = u_w.
• Calculating DoEs d_i = x_i - x_{ref}, corresponding extended uncertainties U_{0.95}(d_i), and confirming the declared uncertainties using E_n scores:

E_n^{(i)} = \frac{|x_i - x_{ref}|}{\sqrt{u_i^2 - u_{ref}^2}} \le 2.

• Calculating the pair-wise DoEs which are also used for confirmation of declared uncertainties:

E_n^{(i,j)} = \frac{|x_i - x_j|}{\sqrt{u_i^2 + u_j^2}} \le 2.

2.2. Coverage factor examination At first, let us check if the factor value 2 used in the above criteria is a suitable choice. The E_n^{(i)} score is intended to check the consistency of measurement results of a single i-th KC participant, but in practice it is often verified for all the participants. Therefore, the probability that the criterion is met for all the participants is obviously lower than 0.95. Consider this probability:

P(E_n^{(i)}(K)) = P\left\{ |x_i - x_{ref}| \le K \sqrt{u_i^2 - u_{ref}^2}, \ 1 \le i \le n \right\},

as well as a conditional probability that the criterion E_n is met for all the KC participants, when the chi-square test is passed:

P^{cond}(E_n^{(i)}(K)) = P(E_n^{(i)}(K) \mid \chi^2_{obs} \le \chi^2_{0.95}(n-1)).
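A direct implementation of the three-step procedure of Section 2.1 is straightforward; the sketch below is an illustration only (the function name and the trial data are invented), assuming NumPy/SciPy are available.

```python
import numpy as np
from scipy.stats import chi2

def kc_consistency(x, u, alpha=0.05, k=2.0):
    """Weighted-mean KCRV, chi-squared consistency test and E_n scores
    (with respect to the KCRV and pair-wise), as in Section 2.1."""
    x, u = np.asarray(x, float), np.asarray(u, float)
    w = u**-2
    x_ref = np.sum(w * x) / np.sum(w)
    u_ref = np.sum(w)**-0.5
    chi2_obs = np.sum((x - x_ref)**2 / u**2)
    chi2_pass = chi2_obs <= chi2.ppf(1 - alpha, len(x) - 1)
    en = np.abs(x - x_ref) / np.sqrt(u**2 - u_ref**2)
    en_pair = np.abs(x[:, None] - x[None, :]) / np.sqrt(u[:, None]**2 + u[None, :]**2)
    return chi2_pass, en <= k, en_pair <= k

# invented example data
print(kc_consistency([0.10, -0.05, 0.30, 0.00], [0.10, 0.12, 0.15, 0.20]))
```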


The plots of the probabilities obtained under the assumption of X_i distribution normality are given below (a Monte Carlo technique was used to obtain the data) — see Fig. 1. Values K_{0.95}, K_{0.95}^{cond} of K corresponding to the probability level 0.95 are presented in the Table 1 (see also 12 — the similar K_{0.95} values were calculated there).

Fig. 1. P(K) dependencies, consistency with x_{ref}.

Table 1. K_{0.95} and K_{0.95}^{cond} values.

    n     K_{0.95}    K_{0.95}^{cond}
    5     2.55        2.33
   10     2.80        2.64
   15     2.92        2.80
   20     3.01        2.91

Consider also a probability of pair-wise consistency as a function of the factor K:

P(E_n^{(i,j)}(K)) = P\left\{ |x_i - x_j| \le K \sqrt{u_i^2 + u_j^2}, \ 1 \le i < j \le n \right\},

and a conditional probability of pair-wise consistency for all the participants, when both chi-square and E_n criteria are met (the factor value K = 2 is used for E_n here, as it is a common practice case):

P^{cond}(E_n^{(i,j)}(K)) = P(E_n^{(i,j)}(K) \mid \chi^2_{obs} \le \chi^2_{0.95}(n-1) \ \text{and} \ E_n^{(i)}(2) \ \text{is passed}).

The plots of these probabilities obtained under the assumption of X_i distribution normality are given at Fig. 2; and the values are presented in the Table 2.

Fig. 2. P(K) dependencies, pair-wise consistency.

Table 2. K_{0.95} and K_{0.95}^{cond} values.

    n     K_{0.95}    K_{0.95}^{cond}
    5     2.73        2.35
   10     3.16        2.92
   15     3.39        3.19
   20     3.54        3.37

The analysis of the above conditional probability P^{cond}(E_n^{(i,j)}) shows that the consistency with the reference value for all the participants does not guarantee pairwise consistency (and the probabilities of pair-wise consistency are lower as compared with the consistency probabilities with the KCRV). So the indication of pair-wise DoEs for the KC seems reasonable. It should be noted also that the values K_{0.95}, K_{0.95}^{cond} for given n vary widely enough (the values for the conditional probabilities are lower, as expected).
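The Monte Carlo computation behind Tables 1 and 2 can be sketched as follows (normal unbiased laboratories, illustrative names and sample size); scanning K and reading off the 0.95 level yields values of the kind tabulated above.

```python
import numpy as np
from scipy.stats import chi2

def k_factor_probabilities(u, K_grid, n_mc=50_000, rng=np.random.default_rng(1)):
    """Estimate P(En(i)(K) holds for all i) and the same probability conditional
    on the chi-squared test being passed, for unbiased normal laboratories."""
    u = np.asarray(u, float); n = len(u)
    w = u**-2
    x = rng.normal(0.0, u, size=(n_mc, n))
    x_ref = (x * w).sum(axis=1) / w.sum()
    u_ref = w.sum()**-0.5
    chi2_obs = ((x - x_ref[:, None])**2 / u**2).sum(axis=1)
    passed = chi2_obs <= chi2.ppf(0.95, n - 1)
    y = np.abs(x - x_ref[:, None]) / np.sqrt(u**2 - u_ref**2)
    rows = [((y <= K).all(axis=1).mean(), (y <= K).all(axis=1)[passed].mean())
            for K in K_grid]
    return np.array(rows)            # columns: P(K), P_cond(K)

# e.g. ten laboratories with equal uncertainties
print(k_factor_probabilities(np.ones(10), K_grid=[2.0, 2.6, 2.8]))
```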


3. Quantitative expression of an outlier influence to confirmation of declared uncertainties 3.1. Outlier model Now let us turn to the question of the effect of a single outlier onto the consistency of the KC results. Consider a single additive bias for the i_0-th laboratory, which is proportional to u_{i_0}:

E(X_i) = \mu, \quad Var(X_i) = u_i^2, \quad i \ne i_0, \qquad E(X_{i_0}) = \mu + \lambda u_{i_0}, \quad Var(X_{i_0}) = u_{i_0}^2 .   (1)

The following notes can be made about the distributions of the following statistics in presence of the outlier (assuming the normality of the X_i distributions and the equality of all the weights with the exception of w_{i_0}):
• Statistic \chi^2_{obs} has a non-central \chi^2(n-1) distribution with parameter \lambda^2 (1 - w_{i_0});
• Statistic Y_i = |x_i - x_{ref}| / \sqrt{u_i^2 - u_{ref}^2} has a normal distribution with

E(Y_i) = \begin{cases} \lambda \sqrt{1 - w_{i_0}}, & i = i_0, \\ -\lambda \sqrt{w_i w_{i_0} / (1 - w_i)}, & i \ne i_0, \end{cases} \qquad Var(Y_i) = 1;

• Statistic Y_{i,j} = |x_i - x_j| / \sqrt{u_i^2 + u_j^2} has a normal distribution with

E(Y_{i,j}) = \begin{cases} \lambda \left(1 + w_{i_0}/w_i\right)^{-1/2}, & j = i_0, \\ 0, & j \ne i_0, \end{cases} \qquad Var(Y_{i,j}) = 1.

3.2. Limitation on the ratio of measurement uncertainties In case of equal weights w_i for i \ne i_0 it is easy to check (taking into account that \sum_{j=1}^{n} w_j = 1) that the expectation of the statistic Y_i as a function of the outlier weight w_{i_0} is non-monotonic (decreases at first, and then increases). Denote by

w^*_{i_0} = \arg\min_{w_{i_0}} P\{Y_i \le 2\}, \qquad u_j = u_k \ \text{for} \ j, k \ne i_0,

the minimum point for the above probability P\{Y_i \le 2\}. Values of w^*_{i_0} for some n are presented in the Table 3 as well as the corresponding weight and uncertainty ratios w_{i_0}/w_i and u_i/u_{i_0}.
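The minimum point w^*_{i_0} can be located by simulation; the sketch below (illustrative parameters and names, not the computation used for Table 3) estimates P\{Y_i \le 2\} for one non-outlier laboratory over a grid of outlier weights and takes the grid minimum.

```python
import numpy as np

def prob_Yi_below_2(n, lam, w0, n_mc=50_000, rng=np.random.default_rng(7)):
    """P{Y_i <= 2} for one non-outlier laboratory, with equal non-outlier
    weights, outlier weight w0 and bias parameter lam (lab 0 is the outlier)."""
    w = np.full(n, (1.0 - w0) / (n - 1)); w[0] = w0
    u = 1.0 / np.sqrt(w)                 # w_i = u_i^-2 with the weights summing to 1
    x = rng.normal(0.0, u, size=(n_mc, n))
    x[:, 0] += lam * u[0]                # additive bias lambda * u_{i0}
    x_ref = x @ w
    y1 = np.abs(x[:, 1] - x_ref) / np.sqrt(u[1]**2 - 1.0)   # u_ref = 1 in these units
    return (y1 <= 2).mean()

grid = np.linspace(0.05, 0.90, 18)
p = [prob_Yi_below_2(n=10, lam=3.0, w0=v) for v in grid]
w_star = grid[int(np.argmin(p))]         # location of the minimum; depends on n, lambda
```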

Table 3. E(Y_i) minimum points and corresponding weights ratios.

    n     w^*_{i_0}    w_{i_0}/w_i    u_i/u_{i_0}
    4     0.38          1.8           1.34
    5     0.42          2.9           1.70
    6     0.44          3.9           1.98
    7     0.45          4.9           2.22
   10     0.47          8.0           2.82
   15     0.48         12.9           3.59
   20     0.48         17.5           4.19
    ∞     0.50           ∞              ∞

So starting from w^*_{i_0}, the increase of the outlier weight (determining its influence on the KCRV) enhances the false consistency of the measurement data with the reference value for the other participants. To avoid this anomalous situation, the corresponding weight ratio w_{i_0}/w_i (depending on n) could be considered as a reasonable limitation on the weights for the KC participants, so this information can be used at a preliminary stage of the comparison design. 3.3. Monte Carlo simulation For the further study of the outlier impact a Monte Carlo simulation technique was used. Again, introduce the weights ratio

r = \frac{\max_i w_i}{\min_i w_i} = \frac{w_{i_0}}{\max_{i \ne i_0} w_i}, \qquad w_{i_0} \ge w_i,

and consider the following quantitative characteristics (probabilities) to evaluate the outlier influence depending on the weights ratio r:

P_1 = P\{E_n^{(i)} \le 2 \ \text{for all} \ 1 \le i \le n\}, \qquad P_2 = P\{E_n^{(i)} \le 2 \ \text{for all} \ i \ne i_0\}, \qquad P_3 = P\{E_n^{(i_0)} > 2\},

i.e. P1 is a probability for all the KC participants to pass the En criteria, P2 is the same probability for all the participants excepting i0 (the outlier), and P3 is a probability for the outlier data to fail the En score check.
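A compact Monte Carlo sketch of these characteristics (laboratory 0 taken as the outlier, equal non-outlier weights, illustrative names and sample size) is given below; scanning r produces curves of the type shown in Figs. 3 and 4.

```python
import numpy as np

def outlier_probabilities(n, lam, r, n_mc=50_000, rng=np.random.default_rng(3)):
    """Estimate P1, P2, P3 of Section 3.3 for a weights ratio r = w_{i0}/w_i."""
    w = np.ones(n); w[0] = r; w /= w.sum()
    u = 1.0 / np.sqrt(w)                       # units chosen so that u_ref = 1
    x = rng.normal(0.0, u, size=(n_mc, n))
    x[:, 0] += lam * u[0]                      # the outlier's additive bias
    x_ref = x @ w
    en = np.abs(x - x_ref[:, None]) / np.sqrt(u**2 - 1.0)
    ok = en <= 2
    return (ok.all(axis=1).mean(),             # P1: all participants pass
            ok[:, 1:].all(axis=1).mean(),      # P2: all except the outlier pass
            (~ok[:, 0]).mean())                # P3: the outlier fails

for r in (1, 2, 4, 8):                         # e.g. n = 10, lambda = 3 (cf. Fig. 3)
    print(r, outlier_probabilities(10, 3.0, r))
```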


The Pi plots for n = 10 and λ = 3, 5 (assuming equal weights for i 6= i0 and normal or uniform distributions for Xi ) are presented below as an example (see Figs. 3, 4). It should be noted that P2 here again demonstrates a nonmonotonic behavior (as well as E(Yi ) from the previous clause), so the minimum point r∗ for it may again be considered as a natural limitation onto the weights ratio r (r ≤ r∗ ) to avoid an anomalous increase of P2 . Note that the minimum point for P2 remains practically unchanged for any given n, while the probabilities Pi change significantly when varying the parameter λ.

Fig. 3. Example, n = 10, λ = 3.

Fig. 4. Example, n = 10, λ = 5.


It should be also stressed that the behaviour of Pi (and, consequently, the r∗ value) strongly depends on the assumption of uncertainty ui , i 6= i0 or the weights proportion. In the above example we considered the equal uncertainties (weights) for the non-outlier participants. On the Fig. 5 the P2 dependencies are presented (n = 10, λ = 3) when the non-outlier uncertainties • are equal, • form an arithmetic progression, • form a geometric progression. In the latter two cases the nonmonotonic character of P2 is not so clearly noticeable (in the case of the geometric progression P2 may be considered monotonic), and the r∗ point shifts towards larger weights ratios (making it more difficult to establish the restrictions on them).

Fig. 5. P_2 dependency on the weights proportion.

Let us suppose that the chi-square test was performed (and passed) before applying the E_n criteria, i.e. consider the following conditional probabilities for E_n^{(i)}:

P_1 \mid \chi^2 = P\{E_n^{(i)} \le 2 \ \text{for all} \ 1 \le i \le n \mid \chi^2_{obs} \le \chi^2_{0.95}(n-1)\};
P_2 \mid \chi^2 = P\{E_n^{(i)} \le 2 \ \text{for all} \ i \ne i_0 \mid \chi^2_{obs} \le \chi^2_{0.95}(n-1)\};
P_3 \mid \chi^2 = P\{E_n^{(i_0)} > 2 \mid \chi^2_{obs} \le \chi^2_{0.95}(n-1)\}.


An example for n = 10, λ = 5 and equal non-outlier weights is presented on the Fig. 6.

Fig. 6. Example, n = 10, λ = 5, conditional probabilities.

Note that P_2 is increased significantly in comparison with the unconditional case, as well as the value of the recommended upper bound r^* for the weights ratio (i.e. the preliminary chi-square test affects the weight ratio limitation noticeably). It could be also mentioned that the probability of the pair-wise consistency for all the NMIs (as well as the conditional one) behaves in a manner quite similar to P_1.

4. Conclusion Three conventional criteria for checking the declared uncertainties were discussed. Probabilities of criteria passing were considered (unconditional and conditional). The importance of pair-wise DoEs calculation (especially in the case of inconsistent data) was confirmed on the model examples. Influence of an outlier on confirmation of declared uncertainties was quantitatively expressed. The probability of confirmation of measurement uncertainties by all the KC participants (with the exception of the outlier) depends on a range of uncertainties to the greatest extent. Reasonable limitation on the measurement uncertainties ratio was discussed.


References
1. CIPM MRA. Bureau International des Poids et Mesures, 1999.
2. Cox M.G. The evaluation of key comparison data: An introduction. Metrologia, 2002, 39, pp. 587-588.
3. Cox M.G. The evaluation of key comparison data. Metrologia, 2002, 39, pp. 589-595.
4. Cox M.G. The evaluation of key comparison data: determining the largest consistent subset. Metrologia, 2007, 44, pp. 187-200.
5. Lira I. Combining inconsistent data from interlaboratory comparisons. Metrologia, 2007, 44, pp. 415-421.
6. Elster C., Toman B. Analysis of key comparisons: estimating laboratories biases by a fixed effects model using Bayesian model averaging. Metrologia, 2010, 47, pp. 113-119.
7. Chunovkina A.G., Elster C., Lira I., Woeger W. Analysis of key comparison data and laboratory biases. Metrologia, 2008, 45, pp. 211-216.
8. Toman B., Possolo A. Laboratory effects models for interlaboratory comparisons. Accreditation and Quality Assurance, 2009, 14, pp. 553-563.
9. Elster C., Toman B. Analysis of key comparison data: Critical assessment of elements of current practice with suggested improvements. Metrologia, 2013, 50, p. 549.
10. Steele A.G., Wood B.M., Douglas R.J. Exclusive statistics: simple treatment of the unavoidable correlations from key comparison reference values. Metrologia, 2001, 38, pp. 483-488.
11. Beissner K. On a measure of consistency in comparison measurements. Metrologia, 2002, 39, pp. 59-63.
12. Beissner K. On a measure of consistency in comparison measurements: II. Using effective degree of freedom. Metrologia, 2003, 40, pp. 31-35.
13. Willink R. Statistical determination of a comparison reference value using hidden errors. Metrologia, 2002, 39, pp. 343-354.
14. Hari K. Iyer, Wang C.M., Vecchia D.F. Consistency tests for key comparison data. Metrologia, 2004, 41, pp. 223-230.
15. Hari K. Iyer, Wang C.M. Detection of influential observations in the determination of weighted-mean KCRV. Metrologia, 2005, 42, pp. 262-265.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 20–37)

Quantity in metrology and mathematics: A general relation and the problem*

V. A. Granovskii Concern CSRI Elektropribor, JSC, 30, Malaya Posadskaya Str., 197046, Saint Petersburg, Russia E-mail: [email protected] The quantity notions in mathematics and metrology and their relation and interaction are considered. The quantity in mathematics belong to the modelling field and is an ideal object while the quantity in metrology has an experimental character and so is an uncertain object. Every metrological model object including measurement aim, measurand, metrological characteristic of a measuring instrument, are expressed by mathematical quantities. When the model object is evaluated by using measurement, so the experimental quantity is obtained, that is metrological quantity. Because of the principal uncertain character of metrological quantities, measuring data and relating metrological quantities have to be processed using firstly non-classic mathematical means but of special type, taking into account the above-mentioned character. There are approximate linear equations theory, interval arithmetic, fuzzy set, and so on. There is not wide use of these means. The reasons are traditions, and absence of data structure analysis, and special place of stochastic tools. The latter is conditioned by some peculiarities of probabilistic-stochastic models. Main, and wide spread, mistakes or faults in use of these models are discussed. Indirect measurement is considered as the field of most complicated interaction of experimental and mathematical quantities. Keywords: Quantity; Mathematics; Metrology; Model; Experimental; Accuracy; Critical ill-conditioning; Approximate equation; Interval variable; Fuzzy variable.

* This work was supported by the Russian Foundation for Basic Research (project 16-08-00082).


1. Introduction Data processing in measurement is carried out by mathematical rules because data are, in the long run, quantities. Quantity is defined in mathematics as [1] An entity that has a numerical value or magnitude or as [2] An object which is fully characterized by one number (scalar) or finite aggregate of numbers (scalars, vector, tensor). At the same time, quantity serves as initial concept of metrology, and it is defined as follows [3]: Quantity is property of a phenomenon, body, or substance, where the property has a magnitude that can be expressed as a number and a reference. In other words, quantity in metrology which is always obtained by an experiment, that is an experimental quantity (EQ), is expressed by a mathematical quantity (MQ). A work, in the measurement domain, with mathematical objects has the fundamental feature which is that they are mentioned as inaccurate (uncertain) in principle. It is unlike classical (general) mathematics where the object under study is considered to be adequate to itself, to which the accuracy concept is not applied. Data processing is the stage of measurement on which EQs and MQs are used together. More correctly, the EQ is expressed by the MQ. Accordingly, the problem arises of correct mathematical operations taking into account the EQ features. The report is focused upon the analysis of the problem. An attempt is made to trace which mathematical tools modification is corollary of the use of metrological principle of an EQ uncertainty†, and how metrological problems should be solved by means of modified mathematical tools. The operation scheme under criticism is shown in the Figure 1.

† The notion “uncertainty” is used in the paper, first of all, as the general notion characterizing knowledge, and in some cases — as the competitive term to “error” and the base of the now popular conception of a measurement accuracy evaluation (which author is not follower of).


Fig. 1. Usual scheme of metrology and mathematics interaction in the course of data processing.

There are two separate domains — metrology and mathematics. The metrological problem is formulated (of course, in metrology) and re-formulated as mathematical one. Then operations are transported in mathematics where proper tools are chosen, then mathematical transformations are realised, and the result is transported back in metrology. More correct scheme is proposed at the end of this paper. 2. The features of mathematical and experimental quantities Mathematics arrives at last at objects having a single meaning even when it studies not well-defined (not hard determined) models. For instance, in mathematical statistics, any statistic is, on the one hand, a probabilistic (nondeterministic) index, on the other hand, it is defined by probabilistic indexes which are deterministic values — MQs. Classical mathematics used for calculations (including calculation in the course of data processing) includes arithmetic and its generalisations, first of all, algebras — linear, vector, and tensor ones. Accordingly, mathematics deals with numbers and sets of numbers. A number is treated as one-valued identifiable object. In other words, a number is considered as an object which is identical


only to itself. Notion “accuracy” can not be applied to number: “inaccurate number” is euphemism. In this regard number is an ideal object. Metrology deals with real object properties and attributes which are characterized by measurable quantities. Moreover, quantity is determined in physics as measurable property of the object [4]. The EQ is modelled by the MQ composed of a non-denominate number, and an unit which is the uniquely identifiable specification. In other words, on the modeling level, nominated numbers, which are one-valued identifiable objects as well, are added to single numbers and sets of numbers. But any metrological action, even theoretical one, compels to come from a MQ to an EQ. The latter is interpreted as principally not having a single meaning. For instance, the electrical current strength unit — ampere — is defined as follows [5] “the ampere is that constant current which, if maintained in two straight parallel conductors of infinite length, of negligible circular cross section, and placed 1 metre apart in vacuum, would produce between these conductors a force equal to 2 × 10−7 newton per metre of length”. It is clear that attempt to reproduce ampere according to its definition will be affected by the need to recede from the definition in some points. Moreover, the definition itself contains a direction on the uncertain quantity “negligible circular cross section”. Therefore the definition of ampere is principally ambiguous one, or inaccurate one — if the definition is considered as specification for the realisation of the unit. Any object in measurement is principally not accurate except for indications as usual (mathematical) numbers. But indications themselves have no metrological sense. In [6] it is rightly commenting: «… rather a long chain of hypothetic-deductive constructions lead to instrument's readings as elements of empirical knowledge, and only these constructions impart a sense to the readings». Any EQ is an aggregate of number field (for the scalar EQ — simply a number) and two model domains “surrounding” it: one of them is “around” the measurand and another one is “around” the measurement unit. Therefore, a model area exists around the MQ. A rear projection of a model domain onto number one begets the number field (an error or an uncertainty (UC)) around the


MQ, that is an estimate of the measurand. Thus an EQ is an “open area” of MQs. A notion “open area” is treated as MQ totalities whose pairwise differences are unlimited. Really set diameter of MQs forming the area depends on quantities measurement accuracy whose quantities are characterized by means of MQs, and the accuracy can be different. Besides, if the above-mentioned measurements are accompanied by random errors, so the latter are often modelled with unlimited probability distributions, for instance, with the normal distribution. Therefore, in that special case of regular errors, an open area of a MQ — its uncertainty area — can be modeled with an unlimited non-centered probability distribution. Generally, any MQ as a model of the EQ is conditioned by specific measurement problem, so the MQ must have specific features. Besides all MQs must have common properties which following one is mostly important from. Because of all EQs are results of measurement and thus comparison results, so EQs have to be compared. Therefore MQs modelling EQs have to be elements of normalized space where notion of distance between any two elements is introduced. It permits, in principle, to define and realise the measurement scale required because the distance notion (in the MQ space) is the model of the difference notion (in the EQ domain). The MQ and the EQ are different, as well, regarding influence of temporal factor, that is, physical time. The MQ is static even if we deal with function of argument t. “Static” is treated as regular: even if quantity depends on t, the dependence function f is stable at physical time. In opportunity, the EQ includes essentially time as influencing factor. The EQ “lives” at time, and that is the reason of its changeability. MQ and EQ features have to be taken into account for their transformations. 3. Transformations of experimental quantities Mathematical transformations, in metrology, are EQ transformations. As a rule, every transformation is formed by two “accurate” transformations — transformation of MQs (estimates) and transformation of MQs (errors). Let consider, for instance, a work model of a strapdown inertial navigation system (SNS) [7]. The model is represented as an aggregate of the accurate equation (the SNS correct work equation):

\ddot{\bar R} = \bar a + \bar g^* ,   (1)

where \bar R — a radius-vector of the moving object mass centre in the terrestrial reference system; \bar a — an apparent acceleration vector; \bar g^* — a vector of terrestrial gravity (force of gravity), which is reduced to the form

\ddot{\bar R} + 2\bar\Omega \times \dot{\bar R} + \bar\Omega \times (\bar\Omega \times \bar R) = \bar a + \bar g ,   (2)

where \bar\Omega — a vector of rotation angular velocity of the navigation co-ordinate system relative to the inertial system (the Earth rotation angular velocity); \bar g — an acceleration of gravity vector, and the SNS error equation:

\delta\ddot{\bar R} + 2\bar\Omega \times \delta\dot{\bar R} + \bar\Omega \times (\bar\Omega \times \delta\bar R) = \delta\bar a + \delta\bar g ,   (3)

which results from (2) by transition to the variations \delta\bar R, \delta\bar a, \delta\bar g. Such a way of getting equations for errors on the base of variances in exact equations is prevailing. Then, the exact equations are used for calculation of an aimed result (exact or nominal), and error equations are used purposely. The approach is valid if the influence factors (sources of errors) affecting a final result can be represented by means of linear operators. In the general case, the above-mentioned approach is methodologically incorrect. It is shown [8] that if the linear algebraic equation with n unknowns

\tilde A \tilde x = \tilde b ,   (4)

where \tilde A = [\tilde a_{ij}]; \tilde x, \tilde b — approximate values, is confronted with the exact equation

A^* x^* = b^* ,   (5)

whose coefficients are known with uncertainties (terminology of [8]) \Delta_{ij}, \delta_i, accordingly:

a^*_{ij} \in [\tilde a_{ij} - \Delta_{ij}, \tilde a_{ij} + \Delta_{ij}], \qquad b^*_i \in [\tilde b_i - \delta_i, \tilde b_i + \delta_i],   (6)

then the solution of the equation (4), where

x^* = \tilde x + \Delta x ,   (7)

is not unreasonable (makes sense) only in the case that the matrix A^* in (5) does not have singularity when the coefficients change within the limits (6). Otherwise the system (5) is named a critically ill-conditioned one. In other words, if (5) is written as

(\tilde A + \Delta A)(\tilde x + \Delta x) = \tilde b + \Delta b ,   (8)

where for the matrix \Delta A = [\Delta a_{ij}] we set |\Delta a_{ij}| \le \Delta |\tilde a_{ij}|, |\Delta b_i| \le \Delta |\tilde b_i|, and introduce the coefficient uncertainty \Delta, then we get that the necessary and sufficient condition for the absence of critical ill-conditioning is that the matrix \tilde A + \Delta A does not have singularity when the coefficients change within the limits (6), that is

\det(\tilde A + \Delta A) \ne 0   (9)

whenever |\Delta a_{ij}| \le \Delta |\tilde a_{ij}|, i, j = 1, 2, \dots, n. The ill-conditioning factor

\sigma_\Delta = \Delta \, \frac{\sum_i \sum_j |\tilde a_{ij} \tilde A_{ij}|}{|\det \tilde A|}   (10)

(\tilde A_{ij} being the cofactors of \tilde A) is introduced for practical calculations, and using it the clause (9) for the absence of critical ill-conditioning becomes

\sigma_\Delta < 1 .   (11)
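A rough numerical probe of critical ill-conditioning, in the spirit of (9)-(11) but not reproducing the exact factor of [8], can combine a first-order cofactor bound with random sampling of admissible coefficient matrices; the sketch below uses NumPy and an invented near-singular example.

```python
import numpy as np

def illconditioning_probe(A, Delta, n_trials=20_000, rng=np.random.default_rng(0)):
    """Given coefficient bounds |delta a_ij| <= Delta_ij, return (i) a first-order
    factor sum(Delta*|cofactors|)/|det A| and (ii) a flag telling whether sampled
    admissible matrices produced determinants of different sign, which implies a
    singular matrix somewhere inside the bounds."""
    A, Delta = np.asarray(A, float), np.asarray(Delta, float)
    detA = np.linalg.det(A)
    cof = np.linalg.inv(A).T * detA                  # matrix of cofactors of A
    factor = np.sum(Delta * np.abs(cof)) / abs(detA)
    signs = {np.sign(np.linalg.det(A + rng.uniform(-1, 1, A.shape) * Delta))
             for _ in range(n_trials)}
    return factor, (len(signs) > 1 or 0.0 in signs)

A = np.array([[1.00, 0.99], [1.01, 1.00]])           # nearly singular example
print(illconditioning_probe(A, Delta=0.02 * np.abs(A)))
```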

The critical ill-conditioning problem of an algebraic system can be treated as part of a general ill-posed problem [9], for instance, when solving the linear operator equation that represents the work of a measuring instrument in dynamic mode:

y = A x ,   (12)

where x, y are accordingly the input and output signals of a measuring instrument having a converse function which is represented by the operator B. Both inverse problems relating to (12) — determination of the converse function

B = A^{-1}   (13)

and of the recovered input signal

\hat x = B y   (14)

— are ill-posed problems [10]. Detailed elaboration of (12) to the Fredholm and/or Volterra equation and its sampling leads [11] from (12) to (4). The methodological principle of separating experimental data, firstly obtained by measurement, into exact data and their uncertainties (errors) is subjected to criticism from a more general viewpoint in [12], where it is shown that the principle of uniqueness of the solution of the modelling problem for data is valid only if the data are exact and full. Otherwise the uncertainty principle is valid: imperfect data do not bring to a single model.
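For the discretized form of (12) a standard (though not the only) remedy for the ill-posedness of (13)-(14) is Tikhonov regularization; the sketch below, with an assumed first-order impulse response and an invented noise level, is purely illustrative.

```python
import numpy as np

def tikhonov_deconvolution(y, h, alpha):
    """Recover the input x from y = A x (discrete convolution with impulse
    response h) by minimising ||A x - y||^2 + alpha ||x||^2."""
    n = len(y)
    A = np.array([[h[i - j] if 0 <= i - j < len(h) else 0.0 for j in range(n)]
                  for i in range(n)])               # lower-triangular convolution matrix
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)

rng = np.random.default_rng(5)
t = np.arange(100)
h = 0.1 * np.exp(-0.1 * t)                          # assumed sensor impulse response
x_true = (t > 20).astype(float)                     # step input
y = np.convolve(x_true, h)[:100] + rng.normal(0, 0.01, 100)
x_hat = tikhonov_deconvolution(y, h, alpha=1e-2)    # regularised estimate of the input
```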


4. Mathematical tools An idea of inexact equations, that is equations of not exact values, finds a natural development in interval analysis [13], although, formally, an idea of a value (number) replacement by an interval also originates in the field of computer calculations [14, 15]. It is important to note that the basic problem of the interval analysis solving an algebraic equation, which is the determination of the unknown limits, fully matches the measurement problem and the formal measurement data processing task. Actually, in the framework of error conception, the limits (deterministic or probabilistic) of the measurand are determined as the measurement result. Development of the interval analysis methodology for the stochastic representation of uncertainties \Delta_{ij}, \delta_i for coefficients and second member in (5) as confidence intervals [8] allows to formulate the following condition of absence of critical ill-conditioning:

\sigma_{st} < 1 ,   (15)

where the statistical ill-conditioning factor \sigma_{st} is defined, analogously to (10), through the confidence limits of the coefficients, the cofactors and the determinants D^* = \det A^*, D = \det \tilde A of the matrices involved (16)–(17). As a result, the following estimate of the standard deviation of the required unknowns is obtained:

s_{\tilde x_j}, \quad j = 1, \dots, n .   (18)
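A Monte Carlo counterpart of this stochastic treatment (an illustration only, not the analytic factor of [8]) samples the coefficients and the right-hand side within their standard uncertainties and reads the dispersion of each unknown from the sample.

```python
import numpy as np

def solution_spread(A, b, uA, ub, n_mc=10_000, rng=np.random.default_rng(2)):
    """Sample A and b within their standard uncertainties (normal model) and
    estimate the mean and standard deviation of each unknown x_j."""
    sols = [np.linalg.solve(A + rng.normal(0.0, uA), b + rng.normal(0.0, ub))
            for _ in range(n_mc)]
    sols = np.array(sols)
    return sols.mean(axis=0), sols.std(axis=0, ddof=1)

A = np.array([[2.0, 1.0], [1.0, 3.0]]); b = np.array([1.0, 2.0])   # invented system
mean_x, s_x = solution_spread(A, b, uA=0.02 * np.abs(A), ub=0.01 * np.abs(b))
```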

Then, we come to unknown value confidence limits that meet, as well, the measurement problem when random errors are considerable. Generalising, it can be stated that interval analysis method meets measurement problem that is formulated in both the limits of error and uncertainty conceptions, because the latter intends to get quasi-stochastic interval for the measurand. Kuperman’s and Moore’s ideas were the first steps in understanding that any model representation is restricted. It was evident [16-18] that an uncertainty (UC) of any MQ representing some EQ by the scheme:

EQ = MQ ± UC   (19)

has to be treated more extendedly and must include not only (a) model knowledge but (b) components that defy modeling, namely, (b1) ignorance, that is semantic uncertainty of initial notions, and (b2) spontaneous, i.e. irregular changes of the above-mentioned components. So, the next stage of idea development of not fully determined values — fuzzy variables — analysis is fuzzy set notion [19, 20]. It appears as the “crossing point” of the stochastic interval analysis [8], and the criticism of probability-stochastic methods [12, 21], and the uncertainty extended interpretation. Actually, the interval as a tool of binary logic permits constructing models but only generalized ones, and obtaining estimates but only rough ones, that not fully meet practical demands. On the other hand, if some (more often — normal) distribution is arrogated to the quantity, doubts are cast upon reliability of the estimates obtained. From the above point of view, it is more practical to introduce, on an interval, a membership function, or generalized characteristic function, which do not lean on the notion of a parent population, and do cover all uncertainty sources. Of course, a rational basing for the membership function used is necessary for fuzzy set (fuzzy logic) tool application. So transferring measurement inaccuracy principle into mathematics leads to alteration of mathematical tools required for solving mathematical problems that serve as model ones concerning the application to technical problems. It should be recognized the fact that interval analysis and fuzzy variable methods do not break through in metrology in spite of separate attempts [22-25]. The reasons are the following. First, new tools are not traditional ones. Second, the tools are not developed enough for practical use. Third, the tools do not promise more than they can give. All above-mentioned reasons prove to be especially important if the situation with probabilistic-stochastic model is taken into account. 5. Statistical data peculiarity Actually, probability-stochastic model was the first answer on challenge connected with the problem of not-well determined quantities. Statistical data arise in practice as experimental estimates of a quantity under study. In that sense, data are realistic and objective. But to be processed they have to be


accompanied by a probabilistic model of the quantity, that is, a probability distribution law. Often this happens without a firm rational basis. The reason is that, once a probability distribution law has been introduced, mathematical tools become available for solving nearly all problems of data processing. Of course, this is only an illusion, but the temptation is strong. Often the only reason for the choice is the availability and simplicity of the corresponding mathematical tools for data processing. The classical example is the extremely widespread use of the normal distribution. Generally speaking, "conjecturing" the distribution law is the fundamental defect of many cases of practical use of statistics [21]. The next important defect is the absence of a full and clear idea of the nature of a statistic, especially a sample statistic. Any statistic is an EQ, but in terms of probability and measurement concurrently. As a measurement result, the statistic has to be characterized by accuracy parameters. As a probabilistic object, the statistic generates an unlimited number of statistics of different kinds. For instance, if the variance D(x) of a quantity x is under evaluation, the sample variance is used as an estimate. The latter is characterized, in particular, by a variance of its own, whose estimate is again a sample statistic, which in its turn is characterized by a further variance, and so on. Unlike recent practice, taking these "embedded" statistics into account leads to a conception of more accurate processing of data comprising random errors. Usually, apart from the centrality problem of the estimate, we concentrate on the dispersion. As a rule, the calculation of an estimate is not accompanied by an evaluation of its dispersion as a function of the sample size. It was shown long ago [26] that a good estimate of a second moment needs a sample size larger than 50, and of a third or fourth moment many hundreds, and so on.
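To give a feel for the orders of magnitude quoted from [26], the short sketch below (an illustrative addition, assuming normally distributed data, not part of the original argument) compares the simulated spread of the sample variance with the textbook value Var(s²) = 2σ⁴/(n − 1) for several sample sizes.

import numpy as np
rng = np.random.default_rng(1)
sigma = 1.0
for n in (10, 50, 500):
    # simulate many samples of size n and compute the sample variance of each
    s2 = np.array([np.var(rng.normal(0.0, sigma, n), ddof=1) for _ in range(20000)])
    sd_emp = s2.std(ddof=1)                      # empirical "error of the error"
    sd_theory = np.sqrt(2.0 * sigma**4 / (n - 1))  # textbook value for normal data
    print(n, round(sd_emp, 3), round(sd_theory, 3))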


At this point it has to be remarked that, in classical metrology, it was customary to evaluate the dispersion of the standard deviation of the measurement error [27]. So even for a direct measurement, when the result accuracy does not depend on the measurand, the expression of the measurement result proves to be a trinomial one. Developing the idea of using, besides "the first floor" of statistical processing, "the second floor" as well, and of estimating "an error of the error" of a measurand estimate obtained by measurement, would give new useful consequences for the traditional calibration condition, that is, a certain ratio k of the accuracies of the reference standard (RS) and of the measuring instrument under calibration [28]. It was shown that an important (but not necessary) condition for a possible decrease of k is that the RS does not have any significant systematic error. In this case, the necessary and sufficient condition for solving the problem is the presence of estimates of the RS error characteristics whose accuracy is not less than a predetermined level depending on the accuracy ratio of the compared devices. In other words, the RS can be considered not only as the device more accurate than the instrument under calibration, but as the device whose accuracy is known more exactly. This makes it possible to decrease the range of acceptable accuracy ratios.

6. A necessity of data structure analysis

As used here, the general uncertainty conception is useful as a methodological principle of metrology. According to this principle, there are no exact quantities in metrology. Any quantity is obtained by measurement and represented by the measurement result (perhaps mathematically transformed). Therefore it should be considered as an uncertain one and accompanied by an uncertainty (19). At the same time, it is important to acknowledge that the uncertainty, regarded as an evaluation of the measurement result accuracy, covers both regular inaccuracy sources, represented by deterministic and stochastic models, and irregular sources that defy modeling. In the course of data processing we should, in principle, deal with uncertain quantities and perform the required transformations on them. The mathematical tools used for the transformations must be selected on the basis of a data structure analysis using a priori information and information obtained during the measurement process. Data structure analysis means, first, subdivision of the dataset into parts according to their origins, and revealing the source of every part. Then the sources should be studied to determine their peculiarities and their mutual interference. Further, proper mathematical tools must be chosen for every part. It is important to underline that the temptation to apply a single common mathematical tool before the dataset subdivision and the tool choice must be resisted; otherwise, most likely, a probabilistic-stochastic model will be imposed. Of course, after choosing different mathematical tools, the problem of co-ordinating them arises. In the general case this is not a simple problem, but the difficulty of solving it is not a big price to pay for the validity of the data processing results.


The interval, as treated in interval analysis, corresponds to minimum information about the EQ. In practice, the uniform distribution within the interval limits is often attributed to a quantity defined by an interval. This is as incorrect as attributing the normal distribution to a quantity represented by a dispersion parameter, for instance by a formally obtained standard deviation. In fact, interval statistics does not make use of probability distributions. The practice of applying not well-founded stochastic models is excoriated in [12, 21]. This practice is characterized by the absence both of a study of the applicability of the parent population notion and of an examination of the stability of the probability distribution. Moreover, nearly always an ungrounded transfer from an initial probabilistic space to the field of time series is performed. The time series obtained in the measurement process is processed as a sample from a parent population without checking the ergodicity of the process generating the data. Generally, the principle "in the persisting absence of a well-grounded deterministic model, a stochastic model should be used" must be decidedly rejected. Similarly, we should handle the particular principle "in the persisting absence of information on the probability distribution, the normal distribution must be used". When data structure analysis is lacking, the use of one or another EQ model and of the related mathematical tools remains a tribute to fashion or to the author's weakness.

7. An example: Indirect measurements

An indirect measurement serves as a «meeting place» of EQs and MQs and as an example of their interaction. The indirectly measured quantity is introduced (defined) by means of the function

Q = f(X1, X2, …, Xm),   (20)

where every argument Xj is under direct measurement. For a given (i-th) set of argument values xji, the quantity Q is the calculation result

Qi = f(x1i, x2i, …, xmi).   (21)

Strictly speaking, (20) and (21) are correct only if the Xj are MQs. If the Xj are considered as EQs, that is, if uncertainties inhere in them and in their realisations xji, then (20) is incorrect and (21) becomes approximate:

Qi ≈ f(x1i, x2i, …, xmi),   (22)

or, what is equivalent,

Qi = f̃(x1i, x2i, …, xmi),   (23)

where f̃ is an approximate function («strained» relative to f). Nevertheless, Q and Qi are undoubtedly calculation results from (20) and (21). In the applied field where the quantity Q is used, it can be additionally transformed as required, so that a number of calculation results arises, that is, a number of MQs: Q, Q(1), Q(2), … On the other hand, the arguments are measurands. Every measurand is defined as a parameter of the measurement object model, characterized by the so-called "threshold uncertainty" [29]. Generally speaking, the "threshold uncertainty" is a discrepancy between the measurand as a model object and the real object property under measurement. In that way, the arguments are EQs, both as potential measurable quantities and as actual ones. Then Q, as an indirectly measured quantity, is an EQ as well. The result of indirect measurements is determined by their organization. If it is possible to obtain, at the same time, all the argument estimates in the i-th cycle of direct measurements, then (22) has to be used for getting Qi, because the argument estimates are biased. The bias is fixed, but includes regular and irregular parts, bearing in mind its origin (sources). For samples i = 1, …, n, the mean value can be obtained as the Q estimate:

Q̄ = (1/n) Σ_{i=1}^{n} Qi = (1/n) Σ_{i=1}^{n} f(x1i, x2i, …, xmi).   (24)

The above-mentioned reasoning concerns, to a greater extent, the estimation of Q when the peculiarities of the real object (an object under research or test) force the arguments to be measured separately. Then sample estimates of a proper statistic are obtained first, for instance the sample mean values x̄1, x̄2, …, x̄m. As they are considered as EQs, expression (20) can be used for getting the estimate

Q̃ = f(x̄1, x̄2, …, x̄m).   (25)

It is approximate not only because of the statistical character of the argument estimates but, first of all, in view of (22), (23). Of course, the quality of the estimate (25) depends in particular on the character of the function f(·). It is clear that the estimates (24) and (25) are not congruent in the general case.
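A small numerical illustration of this non-congruence is sketched below (an addition; the measurement function and the scatter of the arguments are arbitrary choices, not taken from the text): for a nonlinear f, averaging the per-cycle results as in (24) and evaluating f at the argument means as in (25) give systematically different values, and the gap grows with the scatter of the arguments.

import numpy as np
rng = np.random.default_rng(0)
def f(x1, x2):
    # an arbitrary nonlinear measurement function, e.g. Q = x1 * exp(x2)
    return x1 * np.exp(x2)
n = 1000
x1 = 2.0 + 0.3 * rng.standard_normal(n)   # direct measurements of argument 1
x2 = 0.5 + 0.2 * rng.standard_normal(n)   # direct measurements of argument 2
q_mean_of_f = np.mean(f(x1, x2))          # estimate of type (24)
q_f_of_means = f(x1.mean(), x2.mean())    # estimate of type (25)
print(q_mean_of_f, q_f_of_means)          # not equal when f is nonlinear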


Therefore two intervals are under comparison, within which the measurand has to lie with some probability (or membership function value). Because the measurand is the same in both cases, the intervals must have some common part. The probabilities (membership measures) that the indirectly measured quantity is located in this common zone must not differ significantly. The difference is conditioned by the assumptions accepted for one or the other algorithm. Moreover, the above-mentioned difference can serve as a measure of the validity of those assumptions. In particular, if the function f(·) is nonlinear and the resulting error is calculated by means of the linear part of the total differential, a significant difference between the above-mentioned probabilities is evidence that the linear approximation of the total differential (or, what is the same, the linear approximation of the function f(·)) is unacceptable. Coming back to the accuracy evaluation of EQ estimates (including sample estimates), we should point out the necessity, in the general case, of evaluating the remainder term in the total differential expansion. As shown above, indirect measurement comes with the problem of attributing the quantity Q in (21) either to the computational quantities (the MQs) or to the measured ones (the EQs). When considering only the "calculation" problem, it would be necessary to repudiate the notion of indirect measurement altogether. The traditional "measurement" position requires perceiving and taking into account the features of the mathematical transformations of EQs, in particular the approximate character of expressions (21)–(25).

8. Conclusions

The notions of quantity cooperate and interact in a complicated manner in mathematics and in metrology. An experimental quantity, in metrology, needs modeling, and a mathematical quantity, in mathematics, needs to be used. An experimental quantity in metrology is expressed by a number complex, and the latter corresponds to the mathematical-quantity representative — a number, which is the basic element of classical mathematics. The number structure of an experimental quantity is determined by a priori information and by the information collected in the course of measurement. In metrology, we deal only with experimental quantities. All mathematical quantities modelling experimental quantities, and all mathematical transformations of experimental quantities, have to meet the requirements of the general uncertainty principle, which is fundamental for experimental quantities.


Data processing is a unified part of the measurement process which embraces all operations concerning a priori information and the data obtained in the course of the experimental measurement procedure. Every operation has a metrological content, expressed in particular by transformations of experimental quantities. Dividing data processing into two parts — a metrological one, that is, metrological transformations of experimental quantities, and a mathematical one (mathematical conversions of mathematical quantities) — is unreasonable and unacceptable. The mathematical tools for modelling experimental quantities and their mathematical transformations have to be chosen taking into account all features of the specific measurement situation, including the characteristics of the transformed data. Because of that, data structure analysis is a most important initial phase of data processing. The general metrological principle of measurement inaccuracy, and of uncertainty, has an influence on mathematical processing. It stimulates the development of new, more complicated mathematical quantities particularly directed towards the modeling of experimental quantities. Accordingly, new mathematical tools are created for modeling. Coming back to the scheme in Fig. 1, we can propose a more correct one in Fig. 2.

Fig. 2. Correct scheme of metrology and mathematics interaction in the course of data processing (Metrology: task formulation, experiment, data processing, result acceptance, tools choosing; Mathematics: tools choosing, operations).


The meaning of the scheme is as follows. After the metrological problem is formulated, the boundary between the metrology and mathematics domains is moved down, so that the metrology domain is extended and includes part of mathematics. Further operations are performed in the extended metrological domain. On entering the metrological domain, mathematical quantities lose some mathematical features and acquire some metrological features. This preserves the metrological content of all mathematical transformations.

References

1. The Concise Oxford Dictionary of Mathematics, 5th edition, 2014.
2. Yu. Ya. Kaazik, Vocabulary on Mathematics. Tallinn: Valgus, 1985.
3. International vocabulary of metrology – Basic and general concepts and associated terms (VIM), 3rd edition. JCGM, 2012.
4. Oxford Dictionary of Physics, 7th edition. Oxford University Press, 2015.
5. Resolution 2 of the 9th General Conference on Weights and Measures, 1948.
6. F. M. Kanak, About a place of measurement in the correlation of the empirical and the theoretical (in Russian). In: Epistemological Aspects of Measurement [Trans. 2nd Symp. on Epistemological Problems of Measurement in Physics Teaching]. Kiev: Naukova Dumka, 1968, pp. 79–96.
7. N. T. Kuzovkov, O. S. Salychev, Inertial Navigation and Optimal Filtering (in Russian). Moscow: Mashinostroenie, 1982, 217 pp.
8. I. B. Kuperman, Approximate Linear Algebraic Equations. London: Van Nostrand Reinhold, 1971.
9. B. Hoffmann, Ill-Posedness and Regularization of Inverse Problems – A Review of Mathematical Methods. In: The Inverse Problem, ed. H. Lübbig. Akademie Verlag GmbH, Berlin, 1995.
10. V. A. Granovskii, Models and methods of dynamic measurements: results presented by St. Petersburg metrologists. In: Advanced Mathematical and Computational Tools in Metrology and Testing X (AMCTM X), F. Pavese, W. Bremser, A. G. Chunovkina, N. Fischer, A. B. Forbes (Eds.), vol. 10, p. 86, World Scientific, Singapore, 2015.
11. M. T. Heath, The numerical solution of ill-conditioned systems of linear equations. Oak Ridge National Laboratory, Oak Ridge, Tennessee, 1974.


12. R. E. Kalman, Identification of noisy systems. Russian Mathematical Surveys, 1985, 40(4): 25.
13. R. E. Moore, R. B. Kearfott, M. J. Cloud, Introduction to Interval Analysis. SIAM, Philadelphia, 2009.
14. R. E. Moore, Interval Arithmetic and Automatic Error Analysis in Digital Computing. Ph.D. dissertation, Department of Mathematics, Stanford University, Stanford, CA, 1962.
15. I. B. Kuperman, Approximate linear algebraic equations and rounding error estimation. Ph.D. thesis, Dept. of Applied Math., Univ. of Witwatersrand, Johannesburg, 1967.
16. L. A. Zadeh, Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans. Syst. Man Cybernet., 1973, 3: 28–44.
17. A. S. Narinyani, An underdeterminedness in a system of knowledge presentation and processing (in Russian). Izvestiya AN SSSR, Tekhnicheskaya Kibernetika, 1986, No. 5.
18. E. Hyvonen, Constraint reasoning based on interval arithmetic: the tolerance propagation approach. Artificial Intelligence, 1992, vol. 58.
19. W. A. Lodwick, K. D. Jamison, Interval-valued probability in the analysis of problems containing a mixture of possibilistic, probabilistic and interval uncertainty. Fuzzy Sets Syst., 2008, 159: 2845–2858.
20. L. A. Zadeh, Fuzzy sets. Information and Control, 1965, 8(3): 338–353.
21. The bounds of applicability (Probabilistic-statistical methods and their facilities) (in Russian). Moscow: Znanie, 1977.
22. V. Kreinovich, Interval computations and interval-related statistical techniques. In: F. Pavese et al. (eds.), Advanced Mathematical and Computational Tools in Metrology and Testing, World Scientific, Singapore, 2015, pp. 38–49.
23. G. N. Solopchenko, L. K. Reznik, W. C. Johnson, Fuzzy intervals as a basis for measurement theory. In: Fuzzy Information Processing Society Biannual Conference; Industrial Fuzzy Control and Intelligent Systems Conference; and the NASA Joint Technology Workshop on Neural Networks and Fuzzy Logic, 1994.
24. G. N. Solopchenko, K. K. Semenov, V. Kreinovich, L. Reznik, Measurement's result and its error as fuzzy variables: background and perspectives. Key Engineering Materials, 2010, Vol. 437, pp. 487–491.
25. K. K. Semenov, G. N. Solopchenko, V. Kreinovich, Fuzzy intervals as foundation of metrological support for computations with inaccurate data. In: Franco Pavese, Wolfram Bremser, Anna Chunovkina, Nicolas Fisher, and Alistair B. Forbes (eds.), Advanced Mathematical and Computational Tools in Metrology and Testing (AMCTM X), World Scientific, Singapore, 2015, pp. 340–349.


26. W. Feller, An Introduction to Probability Theory and Its Applications, Vol. 1, 2nd ed. New York: Wiley & Sons, 1957.
27. M. F. Malikov, Basics of Metrology. Moscow: Kommerpribor, 1949.
28. V. A. Granovskii, M. D. Kudryavtsev, Error ratio of a measuring instrument under calibration and the reference standard: conditions and possibilities of decrease. In: Advances in Intelligent Systems and Computing, vol. 543, Recent Advances in Systems, Control and Information, eds. Roman Szewczyk, Małgorzata Kaliczyńska, Springer, Warsaw, 2016.
29. V. A. Granovskii, Systemic Metrology: Metrological Systems and Metrology of Systems. St. Petersburg: CSRI Elektropribor, 1999.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 38–47)

Bayesian analysis of an errors-in-variables regression problem

I. Lira
Pontificia Universidad Católica de Chile, Vicuña Mackenna 4860, Santiago, Chile
E-mail: [email protected]

D. Grientschnig
Metrolytik, Kernstockstr. 4, 8600 Bruck an der Mur, Austria
E-mail: [email protected]

Little guidance is available to metrologists for the formulation and Bayesian analysis of errors-in-variables regression problems. In this paper, we present a simple such problem, which consists in the determination of a natural convection heat transfer coefficient. The independent variable is time and the dependent variable is temperature; both of them are assumed to be subject to independently and identically distributed measurement errors.

Keywords: Errors-in-variables; Regression; Bayesian analysis; Convection.

1. Introduction In a typical regression problem with two variables, it is desired to infer the ⊺ values of some or all of the parameters θ = (θ1 , . . . , θp ) of the curve η = fθ (ξ),

(1)

where ξ is a covariate, stimulus or independent variable related to a target, response or dependent variable η. These two variables are measured simultaneously at several states of some system, such that pairs of values (xi , yi ) are observed with corresponding additive errors (δi , ǫi ), i.e. xi = ξi + δi yi = fθ (ξi ) + ǫi

i = 1, . . . , n.

(2)

Sometimes — in standard regression — the errors δi can be ignored and the values xi are taken as the true values ξi of the covariate. But if this assumption is inappropriate, both types of error are to be taken into account. Equation (2) then defines an ‘errors-in-variables’ (EIV) regression model.


EIV models have been analyzed by frequentist and Bayesian means. The latter approach can be traced to the works of Lindley and El-Sayyad [1], Florens et al. [2] and Reilly and Patino-Leal [3], among others. More recent contributions along similar lines are those by Bolfarine and Rodrigues [4], Carrasco et al. [5] and Fang et al. [6]. However, applications of the Bayesian framework to EIV regression problems in metrology seem to be wanting. Below, we analyze in this way a simple EIV model and illustrate the resulting equations by means of an example involving a heat transfer experiment.

2. A simple EIV model

Consider an EIV model of the type (2), in which the errors are assumed to be realizations of independently and identically distributed random variables. These are modeled as N(0, σx²) for all the δi's and N(0, σy²) for all the ǫi's. According to Bayes' theorem, the joint posterior for the regression parameters θ, the true values ξ = (ξ1, …, ξn)ᵀ of the covariate and the standard deviations σx and σy, is proportional to the product of the prior and the likelihood,

p(θ, ξ, σx, σy | x, y) ∝ p(θ, ξ, σx, σy) × L(θ, ξ, σx, σy; x, y),   (3)

where x = (x1, …, xn)ᵀ and y = (y1, …, yn)ᵀ. The prior p(θ, ξ, σx, σy) can be expressed as the product of p(θ, ξ) and p(σx)p(σy) since, before using the model, knowledge about the vector (θᵀ, ξᵀ)ᵀ cannot be expected to provide any information about the standard deviations σx and σy, and vice versa. In turn, the likelihood L(θ, ξ, σx, σy; x, y) can be decomposed into the product of the likelihood for the stimuli and that for the responses. Under the aforementioned assumption of independent Gaussian distributions, these likelihoods are

L(ξ, σx; x) ∝ ∏_{i=1}^{n} (1/σx) exp[ −(ξi − xi)² / (2σx²) ]   (4)

and

L(θ, ξ, σy; y) ∝ ∏_{i=1}^{n} (1/σy) exp[ −(fθ(ξi) − yi)² / (2σy²) ],   (5)

so (3) becomes

p(θ, ξ, σx, σy | x, y) ∝ p(θ, ξ) [ p(σx) p(σy) / (σxⁿ σyⁿ) ] exp[ −Sx/(2σx²) − Sy/(2σy²) ],   (6)


where

Sx = Σ_{i=1}^{n} (ξi − xi)²   and   Sy = Σ_{i=1}^{n} (fθ(ξi) − yi)².   (7)

If θ and ξ can be taken a priori as being independent, the prior p(θ, ξ) becomes p(θ)p(ξ). With respect to the standard deviations, they both contribute to the posterior through expressions of the form

g(σ, S) = [ p(σ) / σⁿ ] exp[ −S/(2σ²) ].   (8)

Two special cases can be distinguished. In the first, there is no prior information at all about σ, so the usual non-informative prior p(σ) ∝ σ⁻¹ for a scale parameter may be used. Integration of (8) over σ then yields

g(S) ∝ S^(−n/2).   (9)

In the second special case, σ is known exactly. Then, the prior p(σ) becomes a delta function and so

g(S | σ) ∝ exp[ −S/(2σ²) ].   (10)

In an intermediate case, a state-of-knowledge distribution for the standard deviation is available. Such an informative prior for σ might be

p(σ) ∝ σ⁻ᵃ exp( −b/σ² ),   (11)

where a must be greater than 1 and b must be positive¹. Inserting (11) into (8) and integrating the result over σ yields

g(S | a, b) ∝ (S + 2b)^(−(a+n−1)/2),   (12)

which reduces to (9) if a → 1 and b → 0. The greater the value of a, the more informative is the prior (11). If a best estimate σ̃ is at hand, after having chosen a value for a, it would be appropriate to set b = a σ̃²/2, because then the mode of p(σ) would coincide with σ̃.

¹ The most commonly used informative prior for a variance parameter is the inverse gamma distribution with shape parameter α and scale parameter β [7]. It can be readily shown that the distribution for the corresponding standard deviation takes the form (11), where a = 1 + 2α and b = β.
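The statement about the mode can be checked in a couple of lines (a sketch added here with arbitrary numbers): setting the derivative of ln p(σ) = −a ln σ − b/σ² + const to zero gives the mode at σ = √(2b/a), which equals σ̃ when b = a σ̃²/2.

import numpy as np
a, sigma_tilde = 4.0, 2.0
b = a * sigma_tilde**2 / 2.0
sigma = np.linspace(0.1, 10.0, 100000)
log_prior = -a * np.log(sigma) - b / sigma**2   # log of (11), up to a constant
print(sigma[np.argmax(log_prior)])              # numerical mode
print(np.sqrt(2.0 * b / a))                     # analytic mode = sigma_tilde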


3. Application: A heat transfer experiment Because of the complexity of the laws that govern convective phenomena, its teaching in elementary engineering courses is usually reduced to explaining some basic concepts of dimensional analysis and their relevance to the formulation of commonly used heat transfer correlations. Fortunately, some very simple experiments can be devised to facilitate students’ comprehension of these matters. One such experiment, designed to measure a convective heat transfer coefficient, is described next. A 5-mm deep hole was drilled into a polished 1-inch diameter stainless steel bearing ball weighing 67 g. A thermocouple was inserted in the hole and secured in place with a high temperature epoxy resin. The ball was submerged in boiling water and kept there until its temperature had stabilized. After its removal from the bath, the sphere was dried, placed on a wooden surface and left to cool in still air. Since the emissivity of polished stainless steel is very low, radiation can be neglected. Similarly, since the thermal conductivity of wood is low and there was little contact area with the ball, heat conducted to the table can also be neglected. Therefore, cooling by natural convection only may be assumed. 3.1. Theory Newton’s law of cooling states that the convective heat loss from a solid body immersed in a fluid is proportional to its surface area and to the difference in temperature between the surface of the body and the fluid. The proportionality constant is the heat transfer coefficient, denoted by h. Because heat comes from the body’s interior by conduction, its temperature is not uniform during the cooling process. However, spatial variability can be neglected if the Biot number Bi = hV /(kA) turns out to be much smaller than one, where V and A are the volume and surface area of the body, respectively, and k is its thermal conductivity. Conventionally, the upper limit for this so-called ‘lumped capacitance model’ to be valid is taken as being Bi = 0.1. In our case, for a stainless steel sphere of diameter D = 25.4 mm and conductivity k = 15 W/(m K), we obtain h limit = 354 W/(m2 K), which is much higher than the usual values of the heat transfer coefficient for gases in free convection. Therefore, it can reasonably be assumed that the temperature T of the bearing ball remains uniform during cooling. Since heat loss causes a decrease in the internal energy mc T of the sphere — where m = 67 g is its mass and c = 510 J/(kg K) is its specific


heat — the cooling process is governed by

hA(T − Ta) = −mc dT/dt,   (13)

where Ta is the air temperature. The solution of this differential equation is

T = Ta + (T0 − Ta) exp(−t/τ),   (14)

where T0 is the temperature of the sphere at t = 0 and

τ = mc/(hA)   (15)

is the characteristic cooling time. Equation (14) is of the form (1) with {ξ, η} = {t, T} and {θ1, θ2, θ3} = {Ta, T0, τ}. The parameter θ3 is the one of interest, because the coefficient h can be obtained from it. Nuisance parameters are the ambient temperature θ1 and the starting temperature θ2.

3.2. Data

All measurements were carried out by a group of students. They selected t = 0 to be the time when the thermocouple indicated 60 °C. The bearing ball temperature was recorded manually afterwards every two minutes for a total of twenty minutes. The air temperature was also recorded at the same times using another thermocouple. Time was measured with the chronometer in one of the students' mobile phones. Results appear in Table 1 and the data points are plotted in Fig. 1 (a), together with the adjusted curve obtained as described next.

Fig. 1. (a) Data points and fitted curve. (b) The prior for θ3.

Table 1. Raw data.

Time          Ball temp.     Air temp.
x = t/min     y = T/°C       Ta/°C
 0            60.00          18.08
 2            57.62          18.56
 4            52.15          19.04
 6            49.50          19.08
 8            47.28          18.82
10            45.67          18.15
12            42.91          17.35
14            41.57          16.98
16            37.58          17.03
18            37.64          17.22
20            35.84          17.82
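For readers wishing to reproduce the adjusted curve of Fig. 1 (a), a minimal non-linear least-squares fit of Eq. (14) to these data is sketched below (an added illustration, not the authors' code; θ1 is fixed at 18 °C as in the comparison of Sec. 3.5, and SciPy's curve_fit is assumed to be available). The exact figures depend on the fitting details.

import numpy as np
from scipy.optimize import curve_fit
t = 60.0 * np.arange(0, 21, 2)                      # measurement times, in seconds
T = np.array([60.00, 57.62, 52.15, 49.50, 47.28, 45.67,
              42.91, 41.57, 37.58, 37.64, 35.84])   # ball temperatures, deg C
Ta = 18.0                                           # ambient temperature, taken as known
def model(t, T0, tau):
    return Ta + (T0 - Ta) * np.exp(-t / tau)        # Eq. (14)
popt, pcov = curve_fit(model, t, T, p0=[60.0, 1912.0])
T0_hat, tau_hat = popt
print(T0_hat, tau_hat, np.sqrt(np.diag(pcov)))      # tau comes out near the 1376 s quoted in Sec. 3.5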

3.3. Prior knowledge about the parameters

The measurements of the ambient temperature are not included in the likelihood, so they serve to establish the prior for θ1 as p(θ1) ~ N(T̄a, σa²), where T̄a = 18 °C is the arithmetic mean of the measurements. For σa we chose 0.5 °C, which is of the order of their standard deviation. Before taking any temperature readings from the thermocouple inserted into the sphere, the only thing that may be safely said about the starting temperature is that it has to be between the ambient temperature and the boiling point of water. Thus, taking p(θ2) as a rectangular prior ranging between these two temperatures is a reasonable choice. To establish a prior for the characteristic cooling time θ3, it is natural to use Churchill's correlation for free convection over a sphere [8]:

Nu = 2 + 0.589 Ra^(1/4) / [1 + (0.469/Pr)^(9/16)]^(4/9),   (16)

where Nu = hD/k ∗ is the Nusselt number, Ra = gβ(T − Ta )D3 /(να) is the Rayleigh number and Pr = ν/α is the Prandtl number. In these numbers, g is the acceleration due to gravity and k ∗ , β, ν and α are properties of the fluid, which in our case is air: k ∗ is its thermal conductivity, β is its thermal expansion coefficient, ν is its kinematic viscosity and α is its thermal diffusivity. All these properties depend on temperature, so they should be evaluated at the film temperature Tf = (Ts + Ta )/2, where Ts is the surface temperature. Since neither Ts nor Ta are constants, taking them as equal to the respective averages during the experiment is quite reasonable. This gives Tf = 32 ◦ C. From the values of k ∗ , β, ν and α of


air at this temperature we get Nu = 8.4 and an estimated heat transfer coefficient h̃ = 8.8 W/(m² K). With m = 67 g and c ≈ 510 J/(kg K) for stainless steel, we get θ̃3 = 1912 s for the estimated characteristic time. Of course, heat transfer correlations such as (16) above are not exact. Also, neither the specific heat of the bearing ball's material nor its surface area are known accurately. So, since h cannot be negative, it seems reasonable to take the prior p(θ3) of the form (11). An ad hoc value a = 4 was chosen, which, together with b = a θ̃3²/2, gave the distribution shown in Fig. 1 (b).
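The prior estimate θ̃3 follows directly from Eq. (15); the short sketch below (an added illustration using only the values quoted in the text) reproduces its order of magnitude.

import numpy as np
m = 0.067          # kg, mass of the bearing ball
c = 510.0          # J/(kg K), specific heat of stainless steel
D = 0.0254         # m, sphere diameter
h = 8.8            # W/(m^2 K), estimate from Churchill's correlation
A = np.pi * D**2                   # surface area of the sphere
tau_prior = m * c / (h * A)        # Eq. (15)
print(tau_prior)                   # roughly 1.9e3 s, consistent with the 1912 s quoted above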

3.4. Prior knowledge about the standard deviations The errors ǫ in the measured temperatures of the sphere arise primarily from the fact that Eq. (14) is based on conditions that are known not to be strictly valid (called ‘model imperfections’). For one, the temperature Ta does not remain constant. Neither is the actual heat transfer coefficient a constant, because to a certain extent it depends on the difference T − Ta at any given time. Moreover, its value is influenced by air currents in the room. Also, despite fulfilment of the Biot number criterion, a slight temperature inhomogeneity within the sphere does exist, so the surface temperature is not exactly equal to the one indicated by the thermocouple (which in turn cannot be taken to be exact). Finally, recall that we have ignored heat escaping by radiation and conduction, which in fact does occur. Granted, the effects of some of these influences are systematic, thus affecting all responses alike and so causing the errors to be correlated. However, for simplicity they will be assumed to be purely random and uncorrelated, so we shall model all errors ǫ as N 0, σy2 . Since there is no prior knowledge about σy , we shall set p (σy ) ∝ σy−1 . The measured values of time, x, should not be taken as being equal to the true values ξ, because the students may have read the temperatures slightly before or after the set times. These possible time differences are precisely the errors δ, which can also be assumed to be random and so, to be modeled as N 0, σx2 . Since there is hardly any prior knowledge about the true times ξ, the improper prior p (ξ) ∝ 1 applies. However, to determine the standard deviation σx , it is possible to carry out an evaluation of the students’ reaction times. We shall consider two scenarios. In the first, we suppose that such an evaluation was executed and that it produced 2 s as the precisely known value of σx . In the second scenario, we take 2 s as being only a best estimate of σx and adopt (11) as the functional form of


p(σx). With p(θ) = p(θ1)p(θ2)p(θ3), the respective posteriors are

p(θ, ξ | x, y, σx) ∝ p(θ) exp[ −Sx/(2σx²) ] Sy^(−n/2)   (17)

and

p(θ, ξ | x, y, σ̃x, a) ∝ p(θ) (Sx + a σ̃x²)^(−(a+n−1)/2) Sy^(−n/2).   (18)
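The authors do not specify their MCMC implementation; as an added illustration only, a minimal random-walk Metropolis sketch for the posterior (18) is given below. Simplifying assumptions are made explicit in the code: a Gaussian prior on θ1, a rectangular prior on θ2 over (18 °C, 100 °C), the prior (11) for θ3 with a = 4 and σ̃ = 1912 s, flat priors on the ξi, and a = 4, σ̃x = 2 s in (18). Mixing of such a plain sampler in 14 dimensions is modest, so the numbers are indicative only.

import numpy as np
rng = np.random.default_rng(0)
x = 60.0 * np.arange(0, 21, 2)                      # measured times, s
y = np.array([60.00, 57.62, 52.15, 49.50, 47.28, 45.67,
              42.91, 41.57, 37.58, 37.64, 35.84])   # measured temperatures, deg C
n = len(x)
a, sx_tilde = 4.0, 2.0                              # prior (11) parameters for sigma_x
a3, t3_tilde = 4.0, 1912.0                          # prior (11) parameters for theta3
b3 = a3 * t3_tilde**2 / 2.0
def log_post(p):
    th1, th2, th3 = p[:3]
    xi = p[3:]
    if not (18.0 < th2 < 100.0) or th3 <= 0.0:      # rectangular prior on theta2
        return -np.inf
    Sx = np.sum((xi - x) ** 2)
    Sy = np.sum((th1 + (th2 - th1) * np.exp(-xi / th3) - y) ** 2)
    lp = -0.5 * ((th1 - 18.0) / 0.5) ** 2           # Gaussian prior on theta1
    lp += -a3 * np.log(th3) - b3 / th3**2           # prior (11) on theta3
    lp += -(a + n - 1) / 2.0 * np.log(Sx + a * sx_tilde**2)
    lp += -(n / 2.0) * np.log(Sy)
    return lp
p = np.concatenate(([18.0, 60.0, 1900.0], x))
step = np.concatenate(([0.2, 0.3, 30.0], np.full(n, 1.0)))
samples, lp = [], log_post(p)
for i in range(200000):
    q = p + step * rng.standard_normal(p.size)
    lq = log_post(q)
    if np.log(rng.random()) < lq - lp:
        p, lp = q, lq
    if i >= 20000 and i % 10 == 0:
        samples.append(p.copy())
theta3 = np.array(samples)[:, 2]
print(theta3.mean(), theta3.std())                  # compare with ~1390 s and ~60 s reported below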

3.5. Results We used Markov chain Monte Carlo (MCMC) [9] to integrate the posteriors (17) and (18), with chains consisting of 106 steps and discarding their initial 105 steps. In the second scenario we chose again the ad hoc value a = 4. The results of the two scenarios did not differ significantly. As expected, the means of the first two parameters were very close to 18 ◦ C for θ1 and 60 ◦ C for θ2 , with corresponding standard deviations of about 0.5 ◦ C and 0.6 ◦ C. The means of the true values ξ were almost equal to the measured times x. The only noticeable difference caused by the choice of the prior for σx was in the standard deviations of these values. They were all very close to 2 s when determining the marginal distributions for the ξi ’s from (17), but about 4 s, if these marginals were calculated from (18). The mean and standard deviation of the posterior marginal for the characteristic time θ3 were found to be about 1 390 s and 60 s in both scenarios, respectively, with small variations from run to run. The histogram for this parameter is shown in Fig. 2 (a). In order to find a numerical approximation for the distribution of the heat transfer coefficient h, we proceeded to apply the Monte Carlo method in Supplement 1 to the GUM [10]. To this end, we used the stored values of θ3 and assumed normal distributions for the mass of the sphere, for its specific heat and for its diameter, with standard deviations 0.25 g, 5 J/(kg K) and 0.2 mm, respectively. Fig. 2 (b) shows the histogram for this quantity. Its mean is about 12.2 W m−2 K−1 and its standard deviation is 0.6 W m−2 K−1 . The continuous curve over this second histogram is a Gaussian distribution with the same mean and standard uncertainty; it was drawn for comparison purposes. Our estimate for the convection coefficient confirms that heat transfer correlations such as (16) are not exact expressions. The fact that the computed value of h is almost 40 % higher than the one predicted by Churchill’s correlation should probably be attributed to minor air currents in the lab. It is interesting to compare these results with those of a nonlinear leastsquares fit of (14) to the data. We did this by ignoring the errors in the


time measurements, assuming θ1 = 18 ◦ C to be precisely known and taking starting values 60 ◦ C for θ2 and 1 912 s for θ3 (this second starting value originates from the estimation in subsection 3.3). The results were 1 376 s for the characteristic time and 12.3 W/(m2 K) for the heat transfer coefficient, with an associated standard uncertainty of 47 s for the former. Since these values concur quite well with the respective figures obtained with the Bayesian analysis, we conclude that the uncertainties of the covariates are not really significant in this problem. The least-squares adjusted curve is visually indistinguishable from that in Fig. 1 (a).

Fig. 2. Histograms for the characteristic time (a) and for the convection coefficient (b).

4. Conclusions Regression problems arise frequently in metrology, and in some instances they have to be analyzed taking into consideration the uncertainties in both coordinates. Then, errors-in-variables (EIV) model techniques have to be used. The Bayesian approach is a convenient alternative to least-squares methods, because it results in a posterior distribution regardless of whether the problem is linear or nonlinear. Moreover, the Bayesian approach allows taking prior knowledge into account and, in contrast to least-squares, does not require a minimization criterion that — not being part of the original regression model — is somewhat arbitrarily imposed. In this paper, we performed a Bayesian analysis of a simple EIV regression problem. The problem was to obtain the distribution for the convective heat transfer coefficient h over a metallic sphere cooling in fairly still air. For demonstration purposes, the errors in the measurement of temperature and time were assumed to be independently and identically distributed. Strictly, this assumption is only an approximation that should be revised if


the analysis were intended to provide a meticulous value of the uncertainty associated with the resulting estimate for h. Nevertheless, the results given above do produce an order of magnitude of such an uncertainty and are in general more reliable than the point estimate and standard uncertainty of h obtained by ignoring all measurement errors of the covariate, let alone the value that follows from the use of a convection correlation. However, in the current instance, the differences with the respective results of least-squares fitting were very small. References [1] D. V. Lindley and G. M. El-Sayyad, The Bayesian estimation of a linear functional relationship, J. Roy. Stat. Soc. B Met. 30, 190 (1968). [2] J.-P. Florens, M. Mouchart and J.-F. Richard, Bayesian inference in error-in-variables models, J. Multivariate Anal. 4, 419 (1974). [3] P. M. Reilly and H. Patino-Leal, A Bayesian study of the error-invariables model, Technometrics 23, 221 (1981). [4] H. Bolfarine and J. Rodrigues, Bayesian inference for an extended simple regression measurement error model using skewed priors, Bayesian Anal. 2, 349 (2007). [5] J. M. F. Carrasco, S. L. P. Ferrari and R. B. Arellano-Valle, Errorsin-variables beta regression models, J. Appl. Stat. 41, 1530 (2014). [6] X. Fang, B. Li, H. Alkhatib, W. Zeng and Y. Yao, Bayesian inference for the errors-in-variables model, Stud. Geophys. Geod. 61, 35 (2017). [7] A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari and D. B. Rubin, Bayesian Data Analysis, third edn. (CRC Press, Boca Raton, Florida, 2013). [8] S. W. Churchill, Free convection around immersed bodies, in Heat Exchanger Design Handbook, ed. E. Schl¨ under (Hemisphere Publishing, New York, NY, 1983), section 2.5.7. [9] C. P. Robert and G. Casella, Monte Carlo Statistical Methods (Springer Texts in Statistics) (Springer-Verlag New York, Inc., Secaucus, NJ, 2005). [10] BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, Evaluation of measurement data – Supplement 1 to the ‘Guide to the Expression of Uncertainty in Measurement’ – Propagation of distributions using a Monte Carlo method (Joint Committee for Guides in Metrology, JCGM 101, 2008).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 48–57)

Triangular Bézier surface: From reconstruction to roughness parameter computation

L. Pagani* and P. J. Scott
EPSRC Future Metrology Hub, Centre for Precision Technologies (CPT), School of Computing and Engineering, University of Huddersfield, Huddersfield, HD1 3DH, UK
*E-mail: [email protected]

Texture parameter computation on freeform surfaces is becoming an important topic in surface characterization. Triangular meshes are used to describe a measured freeform surface; since this reconstruction method linearly interpolates three points, triangular Bézier patches are used to better approximate the surface. A method to compute areal texture parameters is proposed. Based on numerical experiments, the degree of the function approximating the surface and the number of quadrature points required to compute parameter values are investigated.

Keywords: Surface texture, Surface reconstruction, Triangular Bézier.

1. Introduction

Modern manufacturing processes allow products with freeform shapes to be created. A method to represent these complex surfaces is a triangular mesh, which is able to handle the whole measured surface. The areal texture parameters described in ISO 25178-2 7 are designed for a measured surface defined by a height map. A set of parameters defined on a general freeform surface was proposed in Pagani et al. 9. Alternative reconstruction methods, able to approximate a surface with higher degree polynomials, are based on Bézier and spline fitting 8. B-spline fitting techniques need a parametrisation of the triangular mesh. Since both the parametrisation and the computation of the B-spline control points involve the solution of large systems of linear equations, the whole process is usually time consuming. The triangular Bézier surface 6 is a flexible reconstruction approach able to reconstruct triangular meshes of arbitrary genus. It is a face-based method, so for each face of the mesh there is a triangular Bézier patch. Since the number of patches is high, it should be possible to approximate the integral defining the areal parameters with few quadrature points. In this paper the


number of quadrature points needed to compute the areal surface parameters is investigated. The paper is structured as follows: in Sec. 2 the triangular Bézier surface is introduced and two reconstruction procedures are presented, in Sec. 3 a general scheme to compute the texture parameters of a parametric surface is introduced and a quadrature rule on a triangular surface is discussed, and in Sec. 4 a test case is presented.

2. Triangular Bézier patches

A triangular Bézier patch is defined as a parametric polynomial surface in R³ whose parameter space is defined on a triangular domain T 6. Let T be a triangle in R³ with vertices v0, v1, v2 and x = x(u) ∈ T; a point on the Bézier surface is computed as

r(u) = Σ_{|i|=n} b_i B_i^n(u),   b_i ∈ R³,   (1)

where i = (i, j, k)ᵀ, |i| = i + j + k, u = (u, v, w)ᵀ = (u0, u1, u2)ᵀ is the vector of the barycentric coordinates, the b_i are the vectors of control points and

B_i^n(u) = [n!/(i! j! k!)] uⁱ vʲ wᵏ,   i + j + k = n,

is the Bernstein polynomial of degree n. The point evaluation of a Bézier patch is performed through the de Casteljau algorithm, see Farin 6. In order to approximate the triangular surface, each triangle is approximated by a Bézier patch. Two interpolation methods, with patches of degree two and five, will be briefly described. A degree two surface approximates each face with a degree two polynomial; the surface is C⁰ (the function is continuous, but its derivative is not) at the vertices and along the edges. A degree five surface allows a C¹ reconstruction (it is continuous up to the first order derivative).

2.1. Degree two interpolation

A degree two Bézier patch is described as

r(u) = Σ_{|i|=2} b_i B_i²(u).   (2)

For each patch there are six control points to estimate: one for each vertex and one in the middle of each edge. The control points at the vertices correspond to those values. In order to compute the middle edge control points, the directional derivatives at the vertices must be evaluated. These quantities can be estimated from the normals at the mesh vertices, which can be computed as the weighted average of the incident face normals 2,

n_i = Σ_{t∈N_i} α_t n_t / Σ_{t∈N_i} α_t,

where N_i represents the face neighbourhood of the i-th vertex (all the faces which contain the i-th vertex), α_t is the incident angle (the angle between the two edges sharing the vertex) and n_t is the normal of the t-th face. Since the directional derivatives are orthogonal to the normal, they can be estimated by rotating the unit normal vector,

r_{d_ji}(u) = R_ji n_i,   (3)

where d_ij = v_i − v_j and R_ji is a π/2 rotation matrix of the vector n_i towards (r(u_j) − r(u_i)).
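As a small illustration of the Bernstein form (1)–(2) (an added sketch; it evaluates the polynomial directly rather than via the de Casteljau recursion mentioned above, and the control points are arbitrary), a degree-two patch can be evaluated at given barycentric coordinates as follows.

import numpy as np
from math import factorial
def bernstein(i, j, k, u, v, w):
    n = i + j + k
    return factorial(n) / (factorial(i) * factorial(j) * factorial(k)) * u**i * v**j * w**k
def eval_patch(ctrl, u, v, w):
    # ctrl maps multi-indices (i, j, k) with i + j + k = n to 3D control points b_i
    p = np.zeros(3)
    for (i, j, k), b in ctrl.items():
        p += np.asarray(b, float) * bernstein(i, j, k, u, v, w)
    return p
# degree-two patch: three corner and three mid-edge control points (arbitrary values)
ctrl = {(2, 0, 0): (0, 0, 0), (0, 2, 0): (1, 0, 0), (0, 0, 2): (0, 1, 0),
        (1, 1, 0): (0.5, 0.0, 0.2), (1, 0, 1): (0.0, 0.5, 0.2), (0, 1, 1): (0.5, 0.5, 0.2)}
print(eval_patch(ctrl, 1/3, 1/3, 1/3))   # point at the barycentre of the parameter triangle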

2.2. Degree five interpolation

A quintic Bézier patch is described as

r(u) = Σ_{|i|=5} b_i B_i⁵(u).   (4)

In order to estimate the twenty-one control points, the middle edge derivatives and the second order directional derivatives must be estimated. Similarly to the previous case, the middle edge normal can be approximated as the average of the normals of the two adjacent faces. The directional derivative can thus be computed by rotating the normal in the direction of the vector connecting the middle of the edge portion to the opposite vertex. To compute the second order directional derivatives, the surface is locally approximated with a second degree polynomial 3. For each vertex v_i, a height function is used to approximate a smooth surface for each of the coordinates (x, y and z). The surface used is the paraboloid

f(u, v) = (1/2)(α u² + 2 β u v + γ v²).   (5)

After estimating the coefficients, the Hessians are

H_i = [[α_i, β_i], [β_i, γ_i]],

and finally the second order directional derivatives are

r_{d_ij d_kl}(u) = (d_ijᵀ H_x d_kl, d_ijᵀ H_y d_kl, d_ijᵀ H_z d_kl)ᵀ.   (6)


2.3. Reconstruction results

The analysed mesh is a portion of a lattice structure built with an additive manufacturing machine; it was measured with a Nikon XT H 225 microfocus CT. Mesh reconstruction was performed by an adaptive algorithm implemented in CGAL 1,10. The mesh was manually segmented to extract a shape with a cylindrical form. Fig. 1 shows the mesh (grey) along with the estimated form surface (blue).

Figure 1: Analysed meshes, 140 971 faces (values in mm).

The mesh was approximated with surfaces of degree two and five with, respectively, C⁰ and C¹ continuity at the edges. Fig. 2 shows a magnification of the lattice mesh with the reconstruction through a triangular mesh and the two Bézier surfaces. It is possible to observe that the meshes approximated with the Bézier surfaces better approximate regions with high curvature; these portions of the surface appear "smoother". The code was written in C++ and all the computation times were recorded using a machine with 6 threads on a Linux operating system. The mean of ten reconstruction times corresponds to 0.11 s and 0.39 s, with standard deviations of 0.001 s and 0.002 s, respectively, for the degree two and the degree five surface.

3. Areal texture parameters computation

The method implemented by the authors in Pagani et al. 9 is used to compute the areal parameters, and an extension to the functional parameters computation is proposed. It is assumed that a manufactured part can be described with a regular surface Σ ⊂ R³ as r(u, v) = (x(u, v), y(u, v), z(u, v))ᵀ, with u = (u, v) ∈ U ⊂ R². U is called the parameter space. In order to compute the roughness parameters

Figure 2: Magnification of the reconstruction of the lattice mesh: (a) triangular mesh, (b) triangular Bézier degree 2, (c) triangular Bézier degree 5.

the surface must, at least, be decomposed as

r(u, v) = r_form(u, v) + r_sl(u, v),   (7)

where Σ_form: r_form(u, v) represents the form (F operator) and Σ_sl: r_sl(u, v) the scale-limited surface. The form surface (r_form) is usually computed using a total least squares method (orthogonal regression). The scale-limited surface (r_sl) is then computed as the difference between the measured surface and the estimated form surface. The parameters are computed as the integral of a scalar field on a surface 5. For example the absolute value of the height is computed as

Sa = (1/A) ∫∫_Σform | r_sl(u, v) · n_form(u, v) | dσ_form,   (8)

where A is the area of the form surface (A = ∫_Σform dσ_form) and

dσ_form = ‖ r_form,u(u, v) × r_form,v(u, v) ‖ du dv,

where r_form,•(u, v) is the partial derivative in the • direction. The other height parameters can be computed with similar formulae. The root mean square gradient is computed as

Sdq = sqrt{ (1/A) ∫∫_Σform ‖ J_form G_form⁻¹ ∇_u (r_sl · n_form) ‖² dσ_form },   (9)

where J_form is the Jacobian matrix of r_form(u, v), G_form = J_formᵀ J_form and

∇_u (r_sl(u, v) · n_form(u, v)) = ( r_sl,u · n_form + r_sl · n_form,u ,  r_sl,v · n_form + r_sl · n_form,v )ᵀ,

where n_form(u, v) is the normal vector of the form surface. The developed interfacial area ratio is computed as

Sdr = (1/A) ∫∫_U ( ‖ r_u(u, v) × r_v(u, v) ‖ − ‖ r_form,u(u, v) × r_form,v(u, v) ‖ ) du dv,   (10)

where U is the domain of the parameters' space. A method to approximate the density of the Abbott–Firestone (AF) curve is now presented. The measured surface may present some re-entrant features, so the AF curve is not monotonically non-decreasing. For this reason the Sxx parameters defined in ISO 25178-2 cannot be defined, while the Vxx parameters can be computed. In this paper the authors propose to use a method inspired by the definition of the Lebesgue integral to compute the AF curve. The form surface is first translated along its normal by a value equal to Sv; the volume below the surface can therefore be computed as

V = ∫∫_Σform k(u, v) ( r_sl(u, v) · n_form(u, v) − h_min ) dσ′_form,   (11)

where k(u, v) is a function equal to −1 if the point (u, v) belongs to a re-entrant feature and 1 otherwise, h(u, v) = r_sl(u, v) · n_form(u, v), h_min = min_U h(u, v), and dσ′_form is the infinitesimal area element of r′_form(u, v) = r_form(u, v) − h_min n_form(u, v). Since a numeric quadrature rule is used, it is possible to proportionally divide the value of each quadrature point along the height. To compute the density curve through a histogram, a number of bins has to be set arbitrarily; the portion of volume for each bin can then be computed. The height range was divided into 500 bins. Let n be the total number of bins, b_i, i = 1, …, n, the portion of the volume in each bin, h − h_min the value of the quadrature point and Δb = Sz/n the range of each bin; the number of completely filled bins is

n_b = ⌊(h − h_min)/Δb⌋,

so the value to add to the (n_b + 1)-th bin is

b_{n_b+1} = (h − h_min − n_b · Δb)/Δb,

while b_i = 1 for all i = 1, …, n_b; Sz is the range of h(u, v). Summing the contributions of each quadrature point it is possible to compute the density of the volume as a function of the height. To compute the AF curve the area is computed by dividing the volume previously computed by Δb. It is not possible to compute the parameters Vvx according to ISO 25178-2 because the bearing curve is not a bijective function. In this paper the parameters are computed using the percentage of the height. Let fV(h*) be the density distribution of the volume as a function of the percentage of the height; a possible definition of Vm(p), with 0 ≤ p ≤ 1, could be

Vm(p) = (Sz/Amax) ∫_p^1 fV(h*) dh*,   (12)

while

Vv(p) = (Sz/Amax) ∫_0^p [ fV,max(h*) − fV(h*) ] dh*,   (13)

where Amax is the maximum section area,

h* = ( r_sl(u, v) · n_form(u, v) − h_min ) / Sz   and   fV,max(h*) = max_{h*} fV(h*).

The volume related parameters could be computed as

Vmp = Vm(p) − Vm(1),   Vmc = Vm(q) − Vm(p),
Vvc = Vv(q) − Vv(p),   Vvv = Vv(p) − Vv(0).

3.1. Numerical integration

Since the surface is described by a union of Bézier patches, the residuals are computed for each patch as the difference between the reconstructed and the form surfaces. Parameters' computation can be performed through a per-face integration 4. Quadrature points are located inside the domain of the triangle patch, the simplex. Differential quantities on the edges and at the vertices do not need to be computed, so it is possible to compute the integration on a C⁰ surface without approximating those quantities. Numerical quadrature can be computed as

Sx = gx( Σ_{i=1}^{np} Σ_{j=1}^{nquad} wj fi(uj, vj, wj) · ‖ r^i_form,u(uj, vj, wj) × r^i_form,v(uj, vj, wj) ‖ ),   (14)

where Sx is a parameter, i represents the i-th Bézier patch, wj is the j-th quadrature weight, fi(uj, vj, wj) is the function to integrate on the i-th Bézier patch at the j-th quadrature point and gx(•) is the function to apply after the integral has been performed. The quadrature rules presented in Xiao et al. 11 are used to compute the quadrature points and weights. Parameter computation with the triangular mesh method is approximated with the linear finite element method 9.
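To make the structure of (14) concrete, the sketch below is an added illustration under simplifying assumptions: it uses a standard three-point symmetric rule on the simplex rather than the rules of Xiao et al. 11, flat faces, and linearly interpolated residual heights, and it accumulates the per-face contributions for an Sa-type parameter.

import numpy as np
# a 3-point symmetric quadrature rule on the simplex, exact for degree-2 polynomials
QP = np.array([[2/3, 1/6, 1/6], [1/6, 2/3, 1/6], [1/6, 1/6, 2/3]])  # barycentric points
QW = np.array([1/3, 1/3, 1/3])                                      # weights (sum to 1)
def sa_per_face(vertices, heights):
    # vertices: (3, 3) array of form-surface corner points of one flat face;
    # heights: residuals h = r_sl . n_form at the corners, interpolated linearly inside
    v0, v1, v2 = vertices
    cross = np.cross(v1 - v0, v2 - v0)          # |r_u x r_v| is constant on a flat face
    face_area = 0.5 * np.linalg.norm(cross)
    num = sum(w * abs(p @ heights) for p, w in zip(QP, QW)) * face_area
    return num, face_area
# toy mesh: two faces of a unit square in the plane z = 0, with residual heights
faces = [np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]], float),
         np.array([[1, 0, 0], [1, 1, 0], [0, 1, 0]], float)]
h = [np.array([0.1, -0.2, 0.05]), np.array([-0.2, 0.15, 0.05])]
num = area = 0.0
for f, hf in zip(faces, h):
    n_i, a_i = sa_per_face(f, hf)
    num += n_i
    area += a_i
print(num / area)    # Sa of the toy surface: integral of |h| divided by the total area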


4. Numerical results Parameters described in Sec. 3 are evaluated setting different degrees of the integrand function (i.e. defining the number of quadrature points). The integrand is a function of degree equal to twice the degree of the form surface minus one plus the degree of the function applied to the residuals. For example to compute Sq if the surface is approximated by a triangular B´ezier surface of degree 2, since the integrand is a function of degree 6, 16 quadrature points are needed. Degrees from one to twenty are evaluated. For each value ten replications are performed. Fig. 3 shows the values of the computed areal parameters, while in Fig. 4 are shown the computational times. In each graphic there is an horizontal black line with the performance of the approximation with the linear method.

Figure 3: Roughness parameters of the lattice mesh.


The convergence of the parameters' estimation is achieved with few points: three quadrature points are enough to compute the texture parameters. The barycentric quadrature (one quadrature point in the middle of the triangle) assures a good estimation with the degree two Bézier surface. Evaluation times, see Fig. 4, have reasonable values for all the evaluated parameters. The most time consuming quantities to compute are the Sdq parameter and the AF curve, whose cost grows linearly with the integrand degree.

Figure 4: Computation time of the roughness parameters of the lattice mesh.

5. Conclusion In this paper a method to reconstruct a triangular surface for metrology purpose was investigated. It was shown that the reconstruction of mesh with the triangular B´ezier surface is fast, it does not need a parametrisation of the measured mesh and it is possible to handle every triangular mesh. A face-based integration was applied to the reconstructed surface, the number of quadrature points needed to compute the areal texture parameter was investigated. It was shown that the triangular B´ezier of degree

57

two converges with one or two quadrature points, and the estimation of the parameter Sdq is not biased although the surface is C⁰ at the vertices and edges. Both functions of degree two and five estimate similar values of the roughness parameters, so if C¹ continuity is not needed, the simplest model of degree two can be used.

Acknowledgements

The authors gratefully acknowledge the UK's Engineering and Physical Sciences Research Council (EPSRC) funding of the EPSRC Fellowship in Manufacturing: Controlling Variability of Products for Manufacturing (Ref: EP/K037374/1) and the EPSRC Future Metrology Hub (Grant Ref: EP/P006930/1).

References
1. Jean-Daniel Boissonnat and Steve Oudot. Provably good sampling and meshing of surfaces. Graphical Models, (67):405–451, 2005.
2. Mario Botsch, Leif Kobbelt, Mark Pauly, Pierre Alliez, and Bruno Lévy. Polygon Mesh Processing. A K Peters, 2010.
3. Jakob Andreas Bærentzen, Jens Gravesen, François Anton, and Henrik Aanæs. Curvature in Triangle Meshes, pages 143–158. Springer London, London, 2012. ISBN 978-1-4471-4075-7.
4. Fernando de Goes, Mathieu Desbrun, and Yiying Tong. Vector Field Processing on Triangle Meshes. URL http://graphics.pixar.com/library/VectorFieldCourse/.
5. C. Henry Edwards. Advanced Calculus of Several Variables. Mineola, NY, 1994.
6. Gerald Farin. Triangular Bernstein–Bézier patches. Computer Aided Geometric Design, 3(2):83–127, 1986. ISSN 0167-8396.
7. ISO 25178-2, Geometrical product specification (GPS) – Surface texture: Areal – Part 2: Terms, definitions and surface texture parameters. ISO 25178-2, 2012.
8. Seng Poh Lim and Habibollah Haron. Surface reconstruction techniques: a review. Artificial Intelligence Review, 42(1):59–78, 2014. ISSN 1573-7462.
9. Luca Pagani, Qunfen Qi, Xiangqian Jiang, and Paul J. Scott. Towards a new definition of areal surface texture parameters on freeform surface. Measurement, 109:281–291, 2017.
10. Laurent Rineau and Mariette Yvinec. A generic software design for Delaunay refinement meshing. Comput. Geom. Theory Appl., (38):100–110, 2007.
11. Hong Xiao and Zydrunas Gimbutas. A Numerical Algorithm for the Construction of Efficient Quadrature Rules in Two and Higher Dimensions. Comput. Math. Appl., 59(2):663–676, January 2010. ISSN 0898-1221.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 58–69)

On the classification into random and systematic effects F. Pavese Formerly CNR-IMGC and INRiM, Torino, 10139, Italy E-mail: [email protected] Long since in statistics the classification of the errors/effects in “random” and “systematic” is by far the most commonly used: this fact is recognised in the GUM:1995 and this is the choice made in most ISO standards. The aim of this paper is to investigate whether the above distinction, used in the vast majority of statistical treatments, is justified in treating the two types as necessary and different components leading to the overall measurement uncertainty/error. Keywords: Random error; Systematic error; Random effect; Systematic effect; Classification.

1. Introduction

The classification of errors/effects into "random" and "systematic" has long been by far the most commonly used in statistics: this fact is recognised in the GUM:1995, 3.2.1 [1], and it is the choice made in most ISO standards. The aim of this paper is to investigate whether this distinction, used in the vast majority of statistical treatments, is justified in treating the two types as necessary and different components leading to the overall measurement uncertainty/error.
The GUM:1995, introducing the term "correction" (and "correction factor"), indicates it as a mandatory action in the presence of systematic effects/errors. The use of "corrections" has been almost universal since Gauß's times, but it is also the source of many problems, when performed, and of a tough dilemma, 'to correct or not to correct', both of which have generated a very wide literature. This paper intends to show that a discussion of the concept of correction is a key starting point for a full understanding of the meaning of



systematic effect (GUM:1995 compliant [1]) or error (ISO 3534:2006 compliant [2]), both being defined in VIM:2012 [3]*.
Measurement error is defined in VIM:2012, clause 2.16, as "measured quantity value minus a reference quantity value"; in clause 2.19 the random error/effect is defined as the "component of measurement error that in replicate measurements varies in an unpredictable manner", and in clause 2.17 the systematic error/effect is defined as the "component of measurement error that in replicate measurements remains constant or varies in a predictable manner". The first is modelled by a random variable with symbol ε, expectation E(ε) = 0 and variance σ²(ε) ≠ 0. The second is modelled by a random variable, often referred to as "bias", B, with expectation E(B) ≠ 0 (most often) and variance σ²(B) ≠ 0; E(B) ≠ 0 is required by the GUM and most literature to be "corrected", whereas in VIM 3, Note 3 to clause 2.17, it "can be corrected" (emphasis added).

2. Starting concept: The measurand

In measurement science the object of a measurement is called the "measurand". It is defined in VIM:2012 [3], clause 2.3, as "quantity intended to be measured" (but in GUM:1995, B.2.9, it is defined instead according to VIM:1993, clause 2.6, as "particular quantity subject to measurement"). In VIM:2012, NOTE 1 to clause 2.3 specifies that "the specification of a measurand requires … description [i.e., a model] of the state of the phenomenon, body, or substance …", and NOTE 3 adds that "the measurement, including the measuring system and the conditions under which the measurement is carried out, might change the phenomenon, body, or substance such that the quantity being measured may differ from the measurand as defined. In this case, adequate correction is necessary" (emphasis added: notice the difference to Note 3 of the above clause 2.17). The term "correction" in VIM:2012, clause 2.53, is "compensation for an estimated systematic effect", while the term "bias", defined in clause 2.18, is "estimate of a systematic measurement error". Thus, systematic error/effect means that the "quantity being measured" (in a specific case) may not be the intended one.

* Note that GUM:1995 [1] has been based on VIM:1993 [4].


2.1. The measurand prescriptive model †

The concept of measurand should be shared by the relevant Community, because the same measurand is supposed to be the object of replicated measures that must be comparable, i.e., it should be recognised as a quantity having a currently recognisable meaning for the community. In the language of the philosophers of science, this means that it should be projected into a "social framework" [5]. In the scientific frame this also means that the measurand model must be of the "prescriptive" type, meaning "giving directions or injunctions" [6], which does not always mean "physical model". The design of an experiment (DoE) must start from this initial conceptual model of the measurand, "socially shared", not from building up the descriptive model of the measuring system, which is specific to each measurement arrangement.

2.1.1. An example of prescriptive measurand model: Thermodynamic temperature

The current model of thermodynamic temperature, T, is indicated in Eq. (1) for the ideal gas:

T = pV / (R n) ,    (1)

where p = pressure, V = volume, n = amount of gas, and R = kB NA is the gas constant, with kB = Boltzmann constant and NA = Avogadro constant (upright characters are used because kB and NA will be stipulated exact from 2018). Therefore, the quantities prescribed to be measured are p, V and n. Figure 1 shows a usual graphic representation of the model.

Figure 1. Input-output representation of the model in Eq. (1). The black-box indicates the analytical relationship f(Xi) between the input variables Xi and the output variable Y = T.
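As a minimal numerical illustration of how the prescriptive model of Eq. (1) maps the input quantities p, V and n to a value of the measurand T, the following R sketch propagates their standard uncertainties to T with the usual GUM law of propagation for a product/quotient model. All numerical values are purely hypothetical placeholders, not data from any real experiment.

```r
## Hypothetical input estimates and standard uncertainties (illustrative only)
R_gas <- 8.314462618          # J mol^-1 K^-1, stipulated exact
p <- 101325; u_p <- 5         # Pa
V <- 1e-3;   u_V <- 2e-7      # m^3
n <- 0.04;   u_n <- 4e-5      # mol

T_hat <- p * V / (R_gas * n)  # measurand estimate from the prescriptive model

## GUM law of propagation: for a product/quotient model the relative variances add
u_T <- T_hat * sqrt((u_p / p)^2 + (u_V / V)^2 + (u_n / n)^2)
c(T = T_hat, u = u_T)
```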

† Note that here the model is that of the measurand, not that of the measurement, which will be introduced in a subsequent step, and called “descriptive model”.


In the GUM:1995, these quantities are called "input quantities", while in VIM:2012, clause 2.50, an input quantity (in a measurement model) is a "quantity that must be measured, or a quantity, the value of which can be otherwise obtained, in order to calculate a measured quantity value of a measurand".

2.1.2. A generic prescriptive measurand model

Instead of the graphic model of Fig. 1, in the following the cause-effect diagram (or Ishikawa diagram [7]) will be used. It does not require an explicit formulation of the functional relationships, but indicates the cause-to-effect flow of information on which the relationships are based.

Figure 2. Cause-effect diagram of a generic prescriptive model of the measurand. The subordinate lines show (with arrows added for clarity) the direction of the flow of the input-quantity influence on the output quantity Y, irrespective of the direction of their slope. There are input quantities directly influencing Y (single subscript), others subordinated to other influence quantities (double subscript); further levels are possible, according to the complexity of f(Xi). For the meaning of the superscript  see text.

In that type of representation, Fig. 2 shows a generic model for the prescriptive model, where the quantities are linked by implicit functional relationships schematically indicated. The superscript  (borrowed from chemical physics, where it means "ideal state") indicates that the variables in the model have the meaning required by the prescription, i.e., that they are "socially shared"; it also avoids confusing that model with the measurement model, which is instead specific to any single arrangement. Y indicates the "intended quantity", the measurand. All X are assigned by the measurement an expected value and are affected by a variance (except when stipulated), resulting in an expected value and variance of the measurand Y.

3. Implementation: The specific measuring system

Each measuring system must also be modelled, based on the specific solutions that are chosen in order to implement the prescription.


3.1. The measuring system descriptive model

The previous conceptual model, being independent of any specific experimental implementation, is clearly a highly idealised one. It does not even allow appreciating the experimental difficulties and compromises (which are graduated depending on the target uncertainty). They arise from three categories of sources indicated in the VIM:2012: 1) the phenomenon, body, or substance; 2) the measuring system; 3) the conditions under which the measurement is carried out. A corresponding model must describe the measurement conditions (often called a "physical", "experimental" or "observation" model; here it does not necessarily correspond to any of them).

3.1.1. An example of descriptive model for thermodynamic temperature: Constant volume gas thermometry (CVGT)

A descriptive model for thermodynamic temperature, T, whose prescriptive model is indicated in Eq. (1) for the ideal gas, must take into account all the influence quantities of the chosen actual implementation. A typical material setup is shown in Fig. 3 [8].

Figure 3. Typical setup of the mechanical part of a CVGT. The gas, whose pressure p is that at T = Tbulb in the CVGT bulb of volume V, also fills the above capillary/connecting-tube up to the differential pressure diaphragm and (closed) by-pass valve, generally at room temperature Troom ≠ Tbulb, where the pressure-measuring manometer is connected to the other side of the differential gauge. The rest of the wording in the figure is irrelevant here. [8]


The influence quantities can be grouped according to the above three categories (detailed description of the symbols is irrelevant here, see [8]).


(1) Influence of the physical parameters (see [8] for the meaning of the quantities)
Substance (ideal gas) non-idealities:
(i) For a low-density gas: pV = nRT [1 + B(T)(n/V) + C(T)(n/V)2 + ... ], where B(T), C(T) are the virial coefficients.
(ii) Non-perfect purity: different non-ideality of impurities.
Amount of "active" substance, nbulb ("active" means only the gas in the isothermal bulb where T is measured):
(i) Dead-volume correction, see later: nbulb = n – ndeadvol
(ii) Adsorption of gas on the bulb: ln(V/V0) = ln() + vapHm(RT + /RT3)
(iii) Impurities: not included in n
(iv) Condensation of impurities: "increases" V.

(2) Influence of the technical parameters: Thermal apparatus (extended, optimised)
Extensions outside the isothermal bulb Vbulb where T = Tbulb is measured: pressure measuring line (p, n involved)
(i) (p*int + pTM + paer)Vbulb = RTb(nbulb + ncap + nr)(1 + B(nbulb + ncap + nr)/Vbulb), ncap = (/4R)d2p*int(dh/(T/h)); nroom = Vr(p*/RTroom)/(V(p*/RT)) ≈ T2; ndeadvol = ncap + nroom
(ii) Pressure p* is measured at Troom ≠ Tbulb; thermomolecular pTM = (C/pavd2)[(T2/T0)2n+2 – (T1/T0)2n+2]; aerostatic head paer ≈ ((g/m)/R)p*int(dh/T(h))
Non-perfect mechanical rigidity of the bulb (Vbulb):
(i) Pressure difference across bulb walls p = (pint – pext): cylinder V/V0 = (p/E)(d/c + d/c/(4 + c/d)); plate V/V0 = (p/E)(55·10–3 d4/Lp3)
(ii) Cubic expansion coefficient of the bulb material: VCu/V0 = a + bT + cT2
(iii) Temperature gradients in the measuring bulb (T)

(3) Influence of the technical parameters: Instruments
• Pressure: mechanical sensors; mechanical indication (piston-gage), electronic indication; calibrated instruments, critical
• Volume: constant volume gas thermometer: ideally constant, but deviations (as indicated before); acoustic gas thermometer: microwave, critical; dielectric, refractive index gas thermometer: volumetric, not critical
• Amount of substance: indirectly measured quantities
• Temperature (thermometers usually already calibrated): electrical resistance measurements; temperature coefficients of equipment; thermal controls; electronic controls.

(4) Influence of the conditions under which measurements are carried out
• Target accuracy not reached: the conceptual model is directly applicable only for very low requirements (e.g., gas thermometer for the thermostat of a home fridge)
• Inadequacy of the experimental model: insufficient capacity of analysis for increasing accuracy
• Quality of the experimental set-up: instruments specifications do not meet the needs
• Skill of the operators: insufficient training in detecting suspicious conditions; scarce experience in data handling; insufficient capacity of handling the statistical tools of analysis.

Typically, the number of quantities actually involved in the measurement is much larger than those in the initial prescriptive model, and they are physically localized. Therefore, in the example, the generic indication p, V and n is insufficient because it is actually affected by non-uniqueness. The quantities qualifying the (same) measurand should better be specified for the specific measuring system. In the example, T being the temperature in the bulb:
p → pbulb, for the pressure of the gas in the bulb at temperature T;
V → Vbulb,0, for the volume of the bulb as if it were rigid and at a reference temperature T0;
n → nbulb, for the amount of "active" gas within the volume Vbulb,0 of the bulb.
In addition, all involved quantities must be taken into account in the measuring system model.

3.1.2. A generic descriptive model of the specific measurement system

An actual generic model for a specific measuring system is the one depicted in Fig. 4, to be compared with the prescriptive model in Fig. 2.


Figure 4. Generic model for a specific measuring system: see text for the meaning of the symbols.

It must be noted that X = E(X) + 0X (0 = zero mean; absent if X is stipulated) and that the following differences with respect to Fig. 2 apply:
• Y is not the measurand;
• the Xi* are the localised Xi;
• the Xij, Xijk are "intricate" functional expressions matching the complexity.
In the gas thermometer example (see [8] for the meaning of the quantities):
• the Xi* are pbulb, Vbulb,0, nbulb for Y ≈ Tbulb;
• an Xij is, e.g., the virial effect [1 + B(T)(n/V) + C(T)(n/V)2], or the bulb non-rigidity effect V/V0 = (p/E)(d/c + d/c/(4 + c/d));
• an Xijk is, e.g., the aerostatic pressure paer, or nroom, the portion of ndeadvol at room temperature.
The functional expressions of the Xij and Xijk contain additional quantities, or the quantities Xi for different locations of the measuring setup. When, as for the virial effect, a quantity in any Xij or Xijk is Y, an iterative method then has to be used to compute Y.
What is being measured is almost never the measurand, i.e., the system is not in the "reference" condition (it does not match the prescriptive state indicated with ). Referring again to the example, the measurement results are those where:
• the measured gas is real: its reference condition is the "ideal" state;
• the bulb is not rigid: its reference condition Vbulb,0 is being rigid and not T dependent;
• the pressure is not that in the bulb: its reference condition is pbulb;
• the amount of gas in the bulb is lower: its reference condition is nbulb,active;
• the temperature of the bulb is not uniform enough, as it should be;
• etc.;
• most instruments are not in their reference conditions, as they should be.
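A minimal sketch of the iterative computation just mentioned: when a correction such as the virial term depends on the very temperature being sought, T can be obtained by fixed-point iteration starting from the ideal-gas value. The function B() and all numerical values below are hypothetical placeholders, not real CVGT data.

```r
R_gas <- 8.314462618                     # J mol^-1 K^-1
B <- function(T) -1.5e-5 + 2e-8 * T      # hypothetical second virial coefficient, m^3 mol^-1

p <- 25000; V <- 1e-3; n <- 0.1          # Pa, m^3, mol (illustrative values)

T_new <- p * V / (R_gas * n)             # ideal-gas starting value
repeat {
  T_old <- T_new
  ## low-density virial model: p V = n R T [1 + B(T) (n/V)]
  T_new <- p * V / (R_gas * n * (1 + B(T_old) * n / V))
  if (abs(T_new - T_old) < 1e-9) break   # stop when the iteration has converged
}
T_new
```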


All quantities should have been assigned a reference condition. In the VIM and the GUM (and in most literature) this requirement is interpreted as shown in Fig. 5.

Figure 5. Generic model for a specific measuring system with bias indication, according to GUM and VIM.

It is like Fig. 4, changed by labelling as "bias", B, caused by systematic effects, all the quantities X exceeding those in the prescriptive model.‡ Note in Fig. 5 that X = E(X) + 0X; B = E(B) + 0B (0 = zero mean; omitted if stipulated), where most often E(B) ≠ 0. On the contrary, as shown in Fig. 6, the X must be localised (*) and must be identified when they are already in their reference condition ().

Figure 6. Same as Fig. 5 with bias indication, but with indication of localization and of reference state.

The figure pictures "bias" as an out-of-reference condition. When a factor: B = (1 + B*); when an addend: B = B*, where B* is an out-of-reference expression, usually having E(B*) ≠ 0.

‡ Incidentally, in the GUM, the B's are called neither "input quantities" nor "bias".


4. Consequences

The B should be called "input quantities" too, affecting the localised quantities of the prescriptive model. All input quantities can be affected by bias, because all may happen to be measured while in an out-of-reference condition. The B are no more "systematic" than the X. They are as important as the X, as far as they "influence" the value of Y by exceeding the target uncertainty; otherwise they would not appear in the model. The B are not "errors", nor responsible for errors. Should it happen that some E(Bi*) = 0, this is not equivalent to omitting it as an influence quantity, since 0Bi* stands.
Out-of-reference means a functional dependence of each specific quantity on further quantities, consisting, in general, of sub-equations, if in analytical form as in the example, or of tables of values, or of differences of values, affected by an uncertainty with Type A/Type B components. One state for each independent variable needs to be taken as the reference, to which a reference state of each dependent variable corresponds, and, consequently, the output quantity Y is in its reference state, Y.
None of the models illustrated so far is a "measurement model" according to the VIM and GUM definitions, which are too generic and have no associated "concept diagram". So far, the model reported in Fig. 6 looks the closest to the prescriptive one, but it is not yet appropriate as the descriptive model of the measurand, Y. In order to transform it into the measurand descriptive model for the specific experiment, it must be modified so as to represent the reference state of each quantity.

Figure 7. Same as Fig. 5 with bias indication, but also with indication of the hierarchical order to follow for normalisation.

The key issue at this last step is that there is a hierarchy among the influence quantities as evidenced in Fig. 7: the operation of computing the value of each quantity at the reference state proceeds from the lower hierarchy level (triple


subscript in the figure) to the top level (single subscript), i.e., the level of the quantities appearing in the localised prescriptive model of Fig. 4.

4.1. The final model for the specific measuring system compatible with the measurand

The model is shown in Fig. 8, where there is no further need to use, for any influence quantity, a symbol other than X. In it, the input quantities are: Xi*, Xij*  Bij*, Xijk*  Bijk* (memo: as a factor B = (1 + B*), as an addend B = B*). Normalisation is not needed only for the quantities that are measured at their respective reference states (). The normalisation operation is today called "correction", i.e., "compensation for systematic effect/error". However, we have shown instead that:
• the rationale is to match the reference condition represented by the measurand (the aim of the specific measurement, but considered as a "socially shared" quantity);
• all quantities in the final model are not the same as those of the initial model.

Figure 8. The final model for the specific measuring system compatible with the measurand.


5. Conclusion

The model in Fig. 8 leads to the correct measurement result, one that can be compared with replicated measurements of the same measurand.
Should one abandon the concept of "true value", then it can be argued that one should also abandon the concepts of "systematic error" and "correction". The "cancellation" of systematic effects that can be detected, e.g. from evidence in comparison exercises, can be another way to avoid correction.

References
1. Guide to the Expression of Uncertainty in Measurement (GUM), 1st edn. International Organization for Standardization, Genève, Switzerland, 1993.
2. Statistics – Vocabulary and Symbols, ISO 3534:2006, 3rd edn., International Organization for Standardization, Genève, Switzerland, 2006.
3. BIPM International Vocabulary of Metrology—Basic and General Concepts and Associated Terms (VIM), 3rd edn., 2008, http://www.bipm.org/en/publications/guides/
4. BIPM International Vocabulary of Metrology—Basic and General Concepts and Associated Terms (VIM), 2nd edn., 1993.
5. N. De Courtenay, F. Grégis, The evaluation of measurement uncertainty and its epistemological ramifications, Studies in History and Philosophy of Science, 2017, online June 24, https://doi.org/10.1016/j.shpsa.2017.05.003
6. The Free Dictionary, http://www.thefreedictionary.com/prescriptiveness
7. http://wikipedia.org
8. F. Pavese, G.F. Molinar Min Beciet, Modern Gas-Based Temperature and Pressure Measurements, Springer, New York, 2nd edn., 2012.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 70–84)

Measurement models

A. Possolo
Statistical Engineering Division, National Institute of Standards and Technology, Gaithersburg, MD 20899-8980, USA
E-mail: [email protected]

There is growing recognition of the need to consider measurement models other than the model used in the Guide to the expression of uncertainty in measurement (gum) [1], which expresses the measurand as the value (output) of a known, real-valued function of several quantitative inputs for which there are estimates and evaluations of associated uncertainty. The need for statistical measurement models, in particular, arises when the measurand appears not as a function of the observable inputs themselves, but as a function of parameters of the probability distribution of the inputs. These models are sometimes referred to as observation equations [2–4]. Close examination reveals that statistical models are already used implicitly in some of the examples discussed in the gum. Since statistical models typically include purely descriptive components involving arbitrary assumptions not dictated by substantive considerations, two questions should be answered before these models are used to estimate the measurand and to evaluate the associated uncertainty. (i) model selection: which model seems to be best among the plurality of models that could reasonably be employed? (ii) model validation: how tenable are the assumptions that enter into the model’s definition? This contribution addresses both issues from a practical viewpoint, offering suggestions for progress in metrological model building, illustrated by examples drawn from several areas of measurement science, and including cases where the measurement data themselves are coaxed to express a preference for the model that should be used to analyze them. Keywords: measurement; model; measurement equation; observation equation; measurement uncertainty; statistical model; mixed effects model; model selection.



1. Introduction

"Just as scientists seek to explain nature, not simply predict it, we see human thought as fundamentally a model-building activity." — Lake et al. (2017) [5]

"Models are vehicles for learning about the world. Significant parts of scientific investigation are carried out on models rather than on reality itself because by studying a model we can discover features of and ascertain facts about the system the model stands for." — Frigg and Hartmann (2017, Models in Science) [6]

Models are interpretive representations of selected attributes of objects, relations, processes, or phenomena, that facilitate studying them. "Models are metaphors" [7], and "every metaphor is the tip of a submerged model" [8]. Since the use of models pervades all fields of science, humanities, and the arts, there are many different kinds of models, from the Standard Model of particle physics, to the models walking the runways during Paris Fashion Week.
Consider a scale model of a bridge whose purpose is to study how the bridge will behave when buffeted by winds. This model does not aim to represent all the properties of its full-scale counterpart, but interprets the relationship between the attributes of the bridge and its response to winds, to focus only on those attributes that determine this response (constitutive materials, structural design, anchoring, etc.). On the one hand, this model of the bridge need not represent how the bridge will age when exposed to seawater, or how it will perform in an earthquake. On the other hand, since the model may be placed in a wind tunnel, where wind speed and direction may be varied over very wide ranges in a controlled manner, the model in fact enables a much more thorough study of the bridge's response to winds than would ever be possible using only observations of bridges that have already been built and of natural winds.
This contribution focuses on models that are useful in reductions of measurement data that ultimately produce an estimate of the measurand, and qualify it with an evaluation of measurement uncertainty. The measurement models that we will discuss, similarly to models that are used in many other areas, do not necessarily capture all aspects of measurement, only those aspects that inform data reductions and uncertainty evaluations.


Section 2 reviews different characteristics of mathematical models, in particular whether they are theoretical or empirical on the one hand, or deterministic, probabilistic, or statistical, on the other hand. Section 3 discusses models for data, and whether these may be chosen simply because they fit the data sufficiently well for the intended purpose, or whether some theory motivates their form. Section 4 presents a contemporary, model-based, wide and inclusive understanding of measurement. Sections 5 and 6 review the conventional measurement models considered in the gum (measurement equations), and statistical models (observation equations). Section 6, in particular, presents an extended discussion of two examples — one concerning the Newtonian Constant of Gravitation, the other the relationship between the vapor pressure of gold and temperature —, including criteria for model selection, and examination of diagnostics that are informative about model adequacy.

2. Mathematical Models

Mathematical models are representations of natural or man-made systems and processes using mathematical entities and relations between them. These models may be classified, from one viewpoint, as (a) theoretical (population growth, Brownian motion), or empirical (regression); from another viewpoint, as (b) deterministic (population growth), probabilistic (Brownian motion), or statistical (regression).
Theoretical mathematical models consist of a mathematical relation or set of relations that represent a particular aspect of a theory: for example, the relationship between the angular displacement θ(t) at time t of a point pendulum supported by a massless, inextensible string of length L, oscillating in vacuum at a location where the acceleration of gravity is g, is described by the differential equation θ̈(t) + (g/L) sin θ(t) = 0 [9].
Empirical mathematical models employ a mathematical relation as an approximate representation of a relationship between observed properties: for example, a spline function that describes how the mean of experimentally measured values of the vapor pressure of gold varies as a function of temperature (Figure 1) [10].
Deterministic model — Population growth. If the size N(t) of a population at time t is such that Ṅ(t) = rN(t)(1 − N(t)/ν), for some growth rate r > 0, and the size of the population is constrained (by availability of resources, say) not to exceed a maximum possible size ν, then N(t) =


ν/(1 + α exp(−rt)), with α = ν/N0 − 1, where N0 is the initial size of the population. While N(t) remains much smaller than ν, the population grows exponentially fast, but growth is increasingly dampened as N(t) approaches ν [11].
Probabilistic model — Brownian motion. The jittery motion visible when a particle suspended in a liquid is examined under a microscope, called Brownian motion [12], is attributable to collisions with the molecules of the liquid [13]. The path followed by the particle may be described by a collection of random variables indexed by time, with independent Gaussian increments, whose values are points in two-dimensional space [14]. This model enables the computation of the probability of general events related to the particle's motion without reference to any empirical observations. For example, that Brownian motion in the plane, started at the center of the unit circle, returns to the unit circle infinitely often with probability 1, and that the time that Brownian motion spends in any given subset of the plane is proportional to the area of the subset.
Statistical model — Regression. The left panel of Figure 1 depicts values of the vapor pressure p of gold measured at several values of the temperature T, and a curve fitted to them, of the form p = α exp(−β/T) [10]. This curve is a non-linear statistical regression model, depicting the mean value of p at each value of T, neglecting the "errors" that scatter the measured points around the curve. The regression curve is based on the assumption that p is a smooth function of T. The model is statistical because it involves random variables with a joint probability distribution that depends on adjustable parameters whose values must be estimated using the empirical data so that the model may become relevant in practice.
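The deterministic and probabilistic model types just described can be sketched in a few lines of R; the numerical values below are illustrative placeholders only, chosen simply to show the logistic solution N(t) and a planar Brownian path simulated as cumulative independent Gaussian increments.

```r
## Deterministic model: logistic growth N(t) = nu / (1 + alpha * exp(-r * t))
nu <- 1000; N0 <- 10; r <- 0.5           # hypothetical carrying capacity, start, rate
alpha <- nu / N0 - 1
t <- seq(0, 30, by = 0.1)
N <- nu / (1 + alpha * exp(-r * t))      # grows almost exponentially, then saturates at nu

## Probabilistic model: planar Brownian motion as sums of independent Gaussian steps
set.seed(1)
dt <- 0.01; nsteps <- 10000
path <- apply(matrix(rnorm(2 * nsteps, sd = sqrt(dt)), ncol = 2), 2, cumsum)
head(path)                               # x and y coordinates of the simulated particle
```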

3. Data Models

The regression model introduced above presents itself naturally once one realizes that log p is approximately a linear function of 1/T (right panel of Figure 1), whose slope and intercept may be estimated in any one of several different ways. This realization may be evinced from the bulging rule applied in conjunction with a trial-and-error exploration of the ladder of powers [15]. However, a generalized additive model (gam), fitted using R function gam defined in package mgcv [16, 17], and involving a thin plate regression spline, achieves essentially the same fit without relying on the linearity "clue"


provided by the plot on the right panel of the same figure. To this extent, both models (lm and gam) are empirical because they have been built without invoking any substantive understanding of the relationship between p and T that may be derived from fundamental physical laws. As it turns out, the Clausius-Clapeyron equation [18, §2.2g] suggests that a model of the form log p = α − β/T , with suitably chosen values of α and β, should achieve a good fit to the experimental data [19, 20], and it does, but this played no role in the production of the models reviewed above.

4. Measurement & Models In a review article in the Stanford Encyclopedia of Philosophy, Tal (2015) [21] explains that “according to model-based accounts, measurement consists of two levels: (i) a concrete process involving interactions between an object of interest, an instrument, and the environment; and (ii) a theoretical and/or statistical model of that process, where ‘model’ denotes an abstract and local representation constructed from simplifying assumptions.” Our focus is upon (ii), where the theoretical or statistical measurement models are mathematical models. The equation α2 = (2R∞ /c)(Ar (87 Rb)/Ar (e))(h/ma (87 Rb)) is an example of a theoretical measurement model that underlies one particular method


of measuring the fine-structure constant α [22], which may be characterized as the ratio of the velocity of the electron in the first circular orbit of the Bohr model of the atom, to the speed of light in vacuum, c. Several of the alternative procedures that Koepke et al. (2017) [23] apply to a set of determinations of the Newtonian constant of gravitation G, to produce a consensus value, involve a statistical measurement model of the form Gj = G + λj + ǫj, fitted to the same measurement results that the Task Group on Fundamental Constants of the Committee on Data for Science and Technology (codata, International Council for Science) used to produce the 2014 recommended value for G [24].
The present discussion of measurement models rests on an inclusive understanding of measurement, as an experimental or computational process that, by comparing the measurand (property intended to be measured) with a standard, produces an estimate of the true value of a property of a material or virtual object, or collection of objects, or of a process, event, or series of events, together with an evaluation of the uncertainty associated with that estimate, and intended for use in support of decision-making [4]. This understanding implies that a comprehensive measurement model should represent: (a) how the measurand is compared with a reference or standard; (b) how instrumental indications and other observations obtained in the course of a physical or virtual experiment are reduced finally to produce an estimate of the measurand; (c) how the uncertainty associated with this estimate is evaluated.
Modeling (a-c) typically involves consideration of multiple, interconnected sub-models. Here we focus on (b-c), as we turn to two common types of measurement models: one recognized explicitly in the gum, already mentioned above in relation with the fine-structure constant; the other already lurking in the gum but not articulated explicitly there, exemplified by the calculation of a consensus value for the Newtonian constant of gravitation.

5. Measurement Equations

The measurement equation for the fine-structure constant, α = [(2R∞/c) (Ar(87Rb)/Ar(e)) (h/ma(87Rb))]½, expresses the measurand as a function of other quantities whose values can be determined independently of α. This is an instance of the only measurement model considered in the gum, which expresses an output quantity (measurand) as a known function of a finite set of input quantities that are informative about the output quantity


but that do not, in turn, depend on the measurand. In this example, as in most others involving a model of this type, the input quantities will have played the role of output quantities in earlier measurements that produced estimates, and uncertainty evaluations, for them. The Rydberg constant R∞ = α2 me c/(2h) [25, Equation (6)], where me denotes the mass of the electron at rest and h denotes the Planck constant, is one of the input quantities above. It appears to violate the prerequisite just mentioned, because here it is expressed as a function of α. However, this is not the relevant measurement model for R∞ when it plays the role of measurand. In fact, its value is derived from measurements of atomic transition frequencies in hydrogen, deuterium, and muonic hydrogen, and involves both theoretical calculations and the determination of a best fit between theory and measurements [24]. Mohr et al. (2016) [24] point out that the relative standard uncertainties of R∞ , and of the relative atomic mass of the electron, Ar (e), are about 6 × 10−12 and 3 × 10−11 , respectively. The relative uncertainty associated with Ar (87 Rb) is 8 × 10−11 approximately [26]. The more challenging measurement is of the ratio h/ma (87 Rb), of the Planck constant to the atomic mass of the 87 Rb atom. For the present purposes, this ratio functions as a single input quantity, whose value Bouchendira et al. (2011) [27] measured with relative standard uncertainty 1.2 × 10−9 . Their measurement of this ratio was based on a determination of the recoil velocity (h/2π)k/ma (87 Rb) of a 87 Rb atom when it absorbs a photon of wavenumber k. The resulting relative uncertainty for α was 7 × 10−10 approximately, while the relative uncertainty associated with the codata 2014 recommended value of α is 2.3 × 10−10 [24].
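Because the measurement equation for α is a product of factors raised to the power ½, and c is exact, the relative standard uncertainties quoted above combine, under the simplifying assumption of uncorrelated inputs, by a root sum of squares scaled by ½. A short sketch of that propagation:

```r
## relative standard uncertainties of the input quantities, as quoted above
ur <- c(Rinf = 6e-12, Ar_e = 3e-11, Ar_Rb = 8e-11, h_over_m = 1.2e-9)

## alpha is a product of powers with exponent 1/2 for each input (c is exact),
## so the relative variances add, each multiplied by (1/2)^2
ur_alpha <- 0.5 * sqrt(sum(ur^2))
ur_alpha
```

The result, about 6 × 10⁻¹⁰, is consistent with the approximately 7 × 10⁻¹⁰ quoted above; the small difference is attributable to rounding and to correlations neglected in this sketch.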

6. Observation Equations

Observation equations are statistical measurement models where the measurand appears as a known function of parameters of the probability distribution that models the dispersion of the measurement data. In particular, these models serve to reconcile different measured values obtained for the same measurand under conditions of repeatability. For example, when several laboratories working independently obtain different


estimates of the Newtonian constant of gravitation (§6.1), or when the vapor pressure of gold is measured repeatedly by the same laboratory under repeatability conditions [28] (§6.2). Statistical models are also well-suited to combine preexisting knowledge about a measurand, with fresh experimental data acquired about it, using Bayesian technology [4].
In most cases, more than one model may reasonably be entertained for measurement data whose relation to the measurand should be expressed in the form of a statistical model. Therefore, the issue of model selection arises naturally when building models for measurement data. The adequacy of any model to data should be examined critically in all cases. We will discuss both issues next, and illustrate the application of relevant criteria for model selection and for model adequacy in the context of two specific examples: one concerning the measurement of the Newtonian constant of gravitation, the other concerning the characterization of how the vapor pressure of gold varies as a function of temperature.

6.1. Newtonian Constant of Gravitation

The value of G recommended by codata in the 2014 adjustment is a weighted average, Ĝ, of the n = 14 estimates of G listed in Table 1, with weights proportional to {1/(κu(Gj))²}, where κ is the smallest multiplicative inflation factor sufficient to achieve normalized residuals of absolute value smaller than 2.

Table 1. Measurement results for the Newtonian gravitational constant used in the codata 2014 adjustment of the fundamental physical constants [24]. Values of G and u(G) are expressed in 10^−11 m^3 kg^−1 s^−2.

experiment   G           u(G)        | experiment   G           u(G)
NIST-82      6.672 48    0.000 43    | HUST-05      6.672 22    0.000 87
TR&D-96      6.6729      0.0005      | UZur-06      6.674 25    0.000 12
LANL-97      6.673 98    0.000 70    | HUST-09      6.673 49    0.000 18
UWash-00     6.674 255   0.000 092   | JILA-10      6.672 34    0.000 14
BIPM-01      6.675 59    0.000 27    | BIPM-14      6.675 54    0.000 16
UWup-02      6.674 22    0.000 98    | LENS-14      6.671 91    0.000 99
MSL-03       6.673 87    0.000 27    | UCI-14       6.674 35    0.000 13
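A minimal sketch of this reduction, using the values of Table 1 (not codata's actual code): because every weight is proportional to 1/(κ u(Gj))², the weighted mean itself does not depend on κ, and the smallest κ yielding normalized residuals below 2 in absolute value can then be computed directly.

```r
## Values and standard uncertainties from Table 1, in 10^-11 m^3 kg^-1 s^-2
G <- c(6.67248, 6.6729, 6.67398, 6.674255, 6.67559, 6.67422, 6.67387,
       6.67222, 6.67425, 6.67349, 6.67234, 6.67554, 6.67191, 6.67435)
u <- c(0.00043, 0.00050, 0.00070, 0.000092, 0.00027, 0.00098, 0.00027,
       0.00087, 0.00012, 0.00018, 0.00014, 0.00016, 0.00099, 0.00013)

## kappa cancels in the weights, so the weighted mean is simply:
G_hat <- sum(G / u^2) / sum(1 / u^2)

## smallest inflation factor making every |(G_j - G_hat)/(kappa*u_j)| < 2
kappa <- max(abs(G - G_hat) / u) / 2
c(G_hat = G_hat, kappa = kappa)   # G_hat about 6.67408; kappa close to the value used by codata
```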

Mohr et al. (2016) [24] used κ = 6.3. A different approach to the estimation of κ suggests that the value chosen by codata is unnecessarily large: κ =


3.78 suffices to reduce Cochran's Q statistic sufficiently for a statistical test of size 0.05 not to reject the hypothesis of homogeneity of the measurement results (once the reported uncertainties have been replaced by inflated uncertainties). This is the conventional chi-squared test with probability 0.05 of incorrectly rejecting the hypothesis of homogeneity [29]. Not only are the aforementioned estimation criteria ad hoc, the very definition that codata has chosen for the normalized residuals, as rj = (Gj − Ĝ)/(κu(Gj)), is questionable: in fact, the denominator arguably should be κu(Gj − Ĝ), because Ĝ is a function of the {Gj}.
The statistical model implicit in codata's reduction of the experimental data is a so-called common mean [30] (or, fixed-effect [31]) model that represents the value measured by experiment j = 1, . . . , n as Gj = G + δj, where the {δj} are realized but non-observable values of independent Gaussian random variables with mean 0 and standard deviations {κu(Gj)}. The adjustable parameters in this model are G and κ. The corresponding maximum likelihood estimates are Ĝ = 6.674 08(24) × 10−11 m3 kg−1 s−2 and κ̂ = 4.8(9).
Both this model, and the additive random effects model already mentioned in §4, Gj = G + λj + ǫj, for j = 1, . . . , n, are readily extensible to accommodate correlations between the {Gj}. Here we assume that the {λj} are like a sample from a Gaussian distribution with mean 0 and standard deviation τ, and the {ǫj} are non-observable outcomes of independent Gaussian random variables all with mean 0 and standard deviations {u(Gj)}.
The three estimation procedures used in [23] to fit this additive model to the measurement results in Table 1 all produce estimates of G that are statistically indistinguishable from one another and from codata's. The largest of the corresponding estimates of τ pertains to a Bayesian procedure, τ̂ = 0.001 02 × 10−11 m3 kg−1 s−2. This suggests effective uncertainties {(τ̂² + u²(Gj))½} whose median is 4.3 times larger than the median of the {u(Gj)}, a value strikingly close to the maximum likelihood estimate of κ in the common mean model that includes a multiplicative inflation factor for the reported uncertainties.
The question of which model is preferable may be easily answered when the alternative models are fitted by the method of maximum likelihood, using model selection criteria like the Akaike Information Criterion (aic) or the Bayesian Information Criterion (bic) [32]. Since the NIST Consensus Builder (consensus.nist.gov) does not currently offer maximum likelihood estimation as an alternative, the maximum likelihood fit was


achieved in this case using a custom, non-linear optimization R code. aic and bic are numerically identical in this case because the two models have the same number of adjustable parameters (G and κ for the common mean model, G and τ for the random effects model): their value is −143 for the common mean model, and −150 for the random effects model. Since the smaller the value of the criterion the better the model, the additive random effects model is clearly to be preferred.

6.2. Vapor Pressure of Gold

An equation proposed by Louis Charles Antoine [33–35], of the form log p = A(B − 1000/(T + C)), fits the data mentioned above for the vapor pressure of gold slightly better than the models already discussed, in particular achieving a smaller residual sum of squares than the linear regression model fitted to the values of log p and 1/T: 1.0815 versus 1.0887. However, neither aic nor bic deems this improvement sufficient to warrant the added complexity of an extra parameter, and both suggest the simpler model as the preferable alternative.
The same data lend themselves to yet another kind of modeling, once one recognizes that the experiment was organized into four different runs, which are like four independent instances of the same experiment, carried out under conditions of repeatability: log pij = (α + aj) − (β + bj)/Tij + εij, where i = 1, . . . , mj denotes the observation made within run j = 1, . . . , n, with n = 4, m1 = 13, and m2 = m3 = m4 = 14. The {aj} and {bj} are modeled as realized values of independent Gaussian random variables all with mean 0, the former with standard deviation σA, the latter σB. The introduction of these effects really means that a different straight line is fitted to the pairs of values {(1/T, log p)} in each run. Therefore, it is not surprising that this model should fit the data so much more closely than the model that ignores the fact that the experiment was designed to comprise four different runs.
The model, fitted by restricted maximum likelihood (reml [36, 37]), may be summarized as log p = α̂ − β̂/T, where α̂ = 19.313 log(kPa) and β̂ = 42 243 K, but it may also be depicted graphically as comprising four different straight lines (left panel of Figure 2) whose intercepts are {α̂ + âj}, and whose slopes are {−(β̂ + b̂j)}.

Fig. 2. Left: random effects model fitted to the values of the vapor pressure p of gold measured at several different values of temperature T [10, Table 2, Lab 9]. Center: residuals from the linear regression model fitted disregarding differences between runs. Right: residuals from the random effects model.

The standard uncertainties associated with the intercept and slope are u(α̂) = 0.266 log(kPa) and u(β̂) = 517 K, computed using the conventional

asymptotic approximation based on the value of the Hessian matrix of the optimization criterion used to fit the model, evaluated at the optimum [36, 38]. The correlation between the estimates is −0.994. The estimates of the standard deviations of the random effects are σ̂A = 0.457 log(kPa) and σ̂B = 938 K, both about 2 % of the estimated intercept and slope, respectively. The standard deviation of the residuals amounts to about 4 % of the standard deviation of the measured values of log p.
The random effects model improves markedly upon the model fitted neglecting the design of the experiment into runs, even if this is achieved at the expense of a more complex model. Both aic and bic indicate very clearly that the random effects model is to be preferred: bic being −48 for the simple linear regression, and −102 for the random effects model.
The Shapiro-Wilk test [39] suggests that both sets of random effects, {aj} and {bj}, are consistent with the assumption that they are like samples from Gaussian distributions. However, the power of the test (that is, the probability of detecting a departure from Gaussian shape), with as few as four observations per set, likely is quite low. Figure 3 is a graphical diagnostic device that, together with the plot of residuals in the rightmost panel of Figure 2, suggests that the model is adequate for these data.
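The fits discussed in this section can be reproduced along the following lines. This is a sketch rather than the author's actual code: it assumes a data frame gold with columns p (vapor pressure, kPa), T (temperature, K) and run (a factor with four levels) holding the measurements of [10, Table 2, Lab 9], and it uses the nlme package [37] for the random effects fit.

```r
library(nlme)

## simple linear regression of log(p) on 1/T, ignoring the run structure
fit_lm  <- lm(log(p) ~ I(1/T), data = gold)

## random effects model: run-specific intercepts and slopes, fitted by REML
fit_lme <- lme(log(p) ~ I(1/T), random = ~ 1 + I(1/T) | run, data = gold)
summary(fit_lme)        # fixed effects, variance components, correlations

## model comparison; the lme fit is refitted by maximum likelihood so that the
## information criteria are comparable with those of the simple regression
fit_ml <- update(fit_lme, method = "ML")
AIC(fit_lm, fit_ml)
BIC(fit_lm, fit_ml)

## Gaussian-shape check for the estimated random effects, as mentioned above
shapiro.test(ranef(fit_lme)[["(Intercept)"]])
shapiro.test(ranef(fit_lme)[["I(1/T)"]])
```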


Fig. 3. Decorated QQ-plot of the residuals from the random effects model fitted to the gold vapor pressure data. The gray band is a 95 % coverage band for residuals computed on the assumption that these residuals are like a sample from a Gaussian distribution.
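One way to construct such a coverage band (a sketch under the stated Gaussian assumption, not necessarily how Figure 3 was produced) is to simulate many Gaussian samples of the same size and spread as the residuals and take pointwise quantiles of their order statistics. The helper function qq_band below is hypothetical.

```r
qq_band <- function(res, nsim = 2000, level = 0.95) {
  n   <- length(res)
  ## simulate Gaussian samples matching the residuals' size and standard deviation
  sim <- replicate(nsim, sort(rnorm(n, mean = 0, sd = sd(res))))
  lo  <- apply(sim, 1, quantile, probs = (1 - level) / 2)
  hi  <- apply(sim, 1, quantile, probs = 1 - (1 - level) / 2)
  q   <- qnorm(ppoints(n))
  plot(q, sort(res), xlab = "Theoretical Quantiles", ylab = "Residuals Quantiles")
  lines(q, lo, col = "gray"); lines(q, hi, col = "gray")
}
```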

7. Conclusions

An inclusive understanding of measurement, prompted by the ever widening scope of metrology, calls for a comparably wide range of models to express the relationship between measurement data and the measurand. The conventional measurement model considered in the gum, and statistical measurement models (or, observation equations), are valuable analytical instruments capable of addressing a very broad range of modeling challenges, by enabling the production of both estimates of the measurand and evaluations of uncertainty to qualify these estimates. Regardless of the kind of model being used, thoughtful consideration of alternatives, rigorous model selection (or, alternatively, suitable model averaging [40], which we did not discuss but also recommend as a potentially useful modeling device), critical evaluation of the adequacy of the model to the data, and assessment of the fitness for purpose of the results that the model delivers, are critical components of reliable measurement.

References
[1] Joint Committee for Guides in Metrology, Evaluation of measurement data — Guide to the expression of uncertainty in measurement (International Bureau of Weights and Measures (BIPM), Sèvres, France, 2008), BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, JCGM 100:2008, GUM 1995 with minor corrections.
[2] A. Possolo and B. Toman, Assessment of measurement uncertainty via observation equations, Metrologia 44, 464 (2007).
[3] A. B. Forbes and J. A. Sousa, The GUM, Bayesian inference and the observation and measurement equations, Measurement 44, 1422 (2011).
[4] A. Possolo and H. K. Iyer, Concepts and tools for the evaluation of measurement uncertainty, Review of Scientific Instruments 88, p. 011301 (2017).
[5] B. M. Lake, T. D. Ullman, J. B. Tenenbaum and S. J. Gershman, Building machines that learn and think like people, Behavioral and Brain Sciences 14, p. e253 (2017).
[6] R. Frigg and S. Hartmann, Models in science, in The Stanford Encyclopedia of Philosophy, ed. E. N. Zalta (The Metaphysics Research Lab, Center for the Study of Language and Information, Stanford University, Stanford, California, 2017) Spring edn.
[7] E. Derman, Metaphors, Models & Theories, Edge.org (November 2010), Introduction by John Brockman.
[8] M. Black, More about metaphor, Dialectica 31, 431 (December 1977).
[9] R. A. Nelson and M. G. Olsson, The pendulum — rich physics from a simple system, American Journal of Physics 54, 112 (1986).
[10] R. C. Paule and J. Mandel, Analysis of Interlaboratory Measurements on the Vapor Pressure of Gold (Certification of Standard Reference Material 745) (National Bureau of Standards, Washington, DC, January 1970), Special Publication 260-19.
[11] P.-F. Verhulst, Recherches mathématiques sur la loi d'accroissement de la population, Nouveaux Mémoires de l'Académie Royale des Sciences et Belles-Lettres de Bruxelles 18, 1 (1845).
[12] R. Brown, A brief account of microscopical observations made on the particles contained in the pollen of plants, London and Edinburgh Philosophical Magazine and Journal of Science 4, 161 (1828).
[13] A. Einstein, On the motion — required by the molecular kinetic theory of heat — of small particles suspended in a stationary liquid, Annalen der Physik 17, 549 (1905).
[14] P. Mörters and Y. Peres, Brownian Motion (Cambridge University Press, Cambridge, UK, 2010).
[15] F. Mosteller and J. W. Tukey, Data Analysis and Regression (Addison-Wesley Publishing Company, Reading, Massachusetts, 1977).
[16] S. N. Wood, Generalized Additive Models: An Introduction with R (Chapman & Hall/CRC, Boca Raton, FL, 2006).
[17] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria (2017).
[18] H. Margenau and G. M. Murphy, The Mathematics of Physics and Chemistry (D. Van Nostrand, New York, NY, 1943).
[19] W. S. Horton, Statistical aspects of second and third law heats, Journal of Research of the National Bureau of Standards 70A, 533 (November–December 1966).


[20] K. Nakajima, Determination of optimal vapor pressure data by the second and third law methods, Mass Spectrometry 5, p. S0055 (2016).
[21] E. Tal, Measurement in science, in The Stanford Encyclopedia of Philosophy, ed. E. N. Zalta (The Metaphysics Research Lab, Center for the Study of Language and Information (CSLI), Stanford University, 2015) Summer edn.
[22] P. Cladé, E. de Mirandes, M. Cadoret, S. Guellati-Khélifa, C. Schwob, F. Nez, L. Julien and F. Biraben, Determination of the fine structure constant based on Bloch oscillations of ultracold atoms in a vertical optical lattice, Physical Review Letters 96, p. 033001 (January 2006).
[23] A. Koepke, T. Lafarge, B. Toman and A. Possolo, NIST Consensus Builder — User's Manual, National Institute of Standards and Technology, Gaithersburg, MD (2017).
[24] P. J. Mohr, D. B. Newell and B. N. Taylor, CODATA recommended values of the fundamental physical constants: 2014, Reviews of Modern Physics 88, p. 035009 (July–September 2016).
[25] P. J. Mohr, B. N. Taylor and D. B. Newell, CODATA recommended values of the fundamental physical constants: 2006, Reviews of Modern Physics 80, 633 (April–June 2008).
[26] C. M. Wang and H. K. Iyer, On non-linear estimation of a measurand, Metrologia 49, 20 (2012).
[27] R. Bouchendira, P. Cladé, S. Guellati-Khélifa, F. Nez and F. Biraben, New determination of the fine structure constant and test of the Quantum Electrodynamics, Physical Review Letters 106, p. 080801 (February 2011).
[28] Joint Committee for Guides in Metrology, International vocabulary of metrology — Basic and general concepts and associated terms (VIM), 3rd edn. (International Bureau of Weights and Measures (BIPM), Sèvres, France, 2012), BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, JCGM 200:2012 (2008 version with minor corrections).
[29] W. G. Cochran, The combination of estimates from different experiments, Biometrics 10, 101 (March 1954).
[30] J. Hartung, G. Knapp and B. K. Sinha, Statistical Meta-Analysis with Applications (John Wiley & Sons, Hoboken, NJ, 2008).
[31] G. Schwarzer, J. R. Carpenter and G. Rücker, Meta-Analysis with R (Springer, New York, 2015).
[32] K. Burnham and D. Anderson, Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd edn. (Springer-Verlag, New York, NY, 2002).
[33] C. Antoine, Tensions des vapeurs: nouvelle relation entre les tensions et les températures, Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences 107, 681 (Juillet–Décembre 1888).
[34] C. Antoine, Calcul des tensions de divers vapeurs, Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences 107, 778 (Juillet–Décembre 1888).
[35] C. Antoine, Tensions de diverses vapeurs, Comptes Rendus Hebdomadaires des Séances de l'Académie des Sciences 107, 836 (Juillet–Décembre 1888).
[36] S. R. Searle, G. Casella and C. E. McCulloch, Variance Components (John Wiley & Sons, Hoboken, NJ, 2006).
[37] J. Pinheiro, D. Bates, S. DebRoy, D. Sarkar and R Core Team, nlme: Linear and Nonlinear Mixed Effects Models (2015). R package version 3.1-120.
[38] L. Wasserman, All of Statistics: A Concise Course in Statistical Inference (Springer Science+Business Media, New York, NY, 2004).
[39] S. S. Shapiro and M. B. Wilk, An analysis of variance test for normality (complete samples), Biometrika 52, 591 (1965).
[40] M. Clyde, Model averaging, in Subjective and Objective Bayesian Statistics: Principles, Models, and Applications, ed. S. J. Press (John Wiley & Sons, Hoboken, NJ, 2003) pp. 320–335, 2nd edn.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 85–118)

Metrology and mathematics — Survey on a dual pair

Karl H. Ruhm
Swiss Federal Institute of Technology (ETH), Institute of Machine Tools and Manufacturing (IWF), Leonhard Strasse 21, LEE L219, Zurich, CH 8092, Switzerland
E-mail: [email protected]

When we do usual Computation in Science, especially in Metrology, we assume that we do Mathematics. This is only partly true in spite of the fact that database handling and data processing are most important indeed. The field of Measurement plus Observation is much more entangled with Mathematics. The following survey “Metrology and Mathematics” will focus on selected ideas, concepts, rules, models, and structures, each of which is of a basic mathematical nature. When doing this, it is not surprising at all that most theoretical and applicational requirements in the field can be reduced to just a few basic logical and mathematical structures. This is also true for complex processes in fields like humanity and society. Signal and System Theory (SST) supports this claim. However, such an endeavour asks for an orderly and consistent definition of quantities and processes and of their mathematical models, called signals and systems. These models represent all kinds of issues, items, and phenomena in the real world. The term model indicates the superordinate means of mathematical description, observation, and representation in Science and Technology. The question arises, what the role of Logic and Mathematics looks like in detail. The term structure will be consistently pivotal. And, as is generally known, structures are best visualised by graphical means, here called Signal Relation Graphs (SRG). Keywords: Metrology, Measurement, Observation, Mathematics, Stochastics, Statistics, Signal and System Theory, Model, Structure, Relation, Signal, System.



Introduction

One should think that Metrology, as Measurement Science and Technology, is somehow comparable to Pure Mathematics and Applied Mathematics. Pure Mathematics follows purely abstract goals. In principle, Measurement Science does so too, but only seldom. Most Pure Sciences, including Measurement Science, are in need of mathematical support, and rely or have to rely on Applied Mathematics: as a fact, Metrology is interested in Applied Mathematics, but very seldom in Pure Mathematics, which fosters top-down frameworks (bird perspective) and allows holistic views. On the contrary, applicational topics in everyday measurement life utilise bottom-up frameworks (frog perspective), allowing "only" highly specialised and thus limited insight. In summary, so-called abstract needs in Measurement Science call for concepts and results of Applied Mathematics. Thus, Pure Mathematics and Measurement Science do not reside on the same level. This is all common feeling, may be considered natural, and is at least comprehensible.
Measurement Science aims at descriptions of past, present and future states. It either regards, unrealistically, the whole, hypothetically infinitely large real world, or, rather realistically, the particular, bounded, tangible small real world. The sensory process together with the observation process may serve as an example of such a restricted real process. It acts as a subsection of Measurement Technology in data acquisition and data processing.
In order to succeed in such a description of real-world measurement and observation processes, we have to "download" abstract concepts, methods, and procedures from superior scientific insights, which are delivered by Measurement Science. They emerge in the form of terms, rules, standards, definitions, analytics, inter- and extrapolations, procedures, best practices, trade-offs, calculations, refinements, corrections, tricks, programs with coding and decoding, and so on. It may be surprising that all these abstract and manifold artefacts are fostered by a few concepts, methods and tools, presented in the next sections. These concepts, methods, and tools are first and foremost of a logical and mathematical character indeed. This is true even if we at times do not have enough quantitative information about the structures and parameters of the processes under investigation. Nevertheless, they stand by in the background. A first informal statement concerning a relation (interaction, dependence, association, correlation, assignment, coherence) between

87

well-defined quantities of interest is at present sufficient for subsequent, determined, quantitative activities: Such a preliminary black box mode proves truly convenient in practise. Of course, we finally have to know, which of the concepts, methods and tools in the wide field of Applied Mathematics may best serve our particular, narrow needs, a challenge to be supported beforehand by Measurement Education. Normally, the metrological community enjoys mathematical concepts only reluctantly, unless they come in form of shortly useful “cooking recipes”. One reason for this aversion of becoming committed in mathematical fields may be lacking overview and skills concerning the superior, holistic, and systematic principles and structures, which Mathematics readily provides to everybody. Admittedly, the principal structure of cause and effect is widely accepted [2, 4], but a deeper engagement in theoretical and practical consequences and possibilities is often missing; another challenge to be met by Measurement Education. Besides, amazing analogies can be detected as to many Philosophers of Science respecting their commitment to logical and mathematical structures. Often, direct consequences result from such reservations. Design and realisation are studied from scratch and invented anew, although solutions are ready for use from generalising background sources. An outstanding example is the important task of error correction, which is once and for all offered by the structure of an inversion procedure of a mathematical model. However, who cares? The following sections describe and recommend some selected mathematical concepts and structures, which primarily refer to measurement and observation procedures. This is not too demanding an endeavour, since most theoretical and applicational requirements in the field may, surprisingly enough, be reduced to about half a dozen main structures according the motto “Keep it Simple”. This is also true for complex processes: Signal and System Theory (SST) supports this claim. In this presentation, the path to this concept will be gradually prepared. Moreover, the main aim will be, to create the impression that such a holistic approach serves even way beyond Natural and Technological Sciences. The disposition of this survey starts with selected mathematical tools to describe quantities and processes in a general way. These two basic terms represent any sort of phenomena and items on the real world side: The term mathematical model, concerning these two terms, stands for the superior means of abstract description, behaviour, and representation in Science and Technology. Above all, the most tantalising omnipresent claims “Our mathematical universe” [10] or “Why our world is mathematical” [11] remains. Such popular statements will be commented shortly in the concluding section.


1 Assumptions and Preconditions

We will consider Measurement and Observation Procedures [9] and will use their common terminology. Fortunately, after slight adaptations of terminology and corresponding statements, the presented concepts prove valid in other fields too. Additionally, we assume that all quantities to be measured and observed are well defined, in spite of the fact that there are always severe discussions about the definitions and meanings of quantities.

The well-known Cause and Effect Principle allows multiple quantities as causes, considered as inputs to the process. They may even occur correlated. We may also recognise multiple quantities as effects, which are considered outputs from the process. According to the properties of the process, they may be correlated too. The general description of a process comes from analytical and / or empirical modelling. The behavioural description of a process is derived from the general description as a set of solutions by acknowledged relationships and rules. The performance of any procedure, for example of a sensory procedure, is judged by means of the more or less erroneous output quantities. All generated information content is uncertain.

It is essential to point out again and again that such considerations are general. They are used as a base for numerous activities: to think, mean, reason, interpret, inquire, declare, guess, examine, test, verify, compromise, justify, predict, investigate, hypothesise, theorise, evaluate, mine, order, decide, confirm, report and so on. Of course, these have their own theories and technologies regarding particular concepts, methods, procedures, and tools, but they are applied right after measurement and observation procedures. They show the same logical and mathematical structures. However, due to our focus on information acquisition and processing by measurement plus observation, such activities are excluded here.

The main goal is to access quantities and deliver information representing these quantities on the one hand, and to unseal interrelations (dependencies), which are each defined by a set of mathematical structures, on the other hand. The term structure will dominate the whole presentation, visualised by Signal Relation Graphs (SRG). We assume that quantities may be described as deterministic / probabilistic and that the relations between these quantities are linear and / or nonlinear, as well as dynamic and / or nondynamic. Moreover, we assume that some models of quantities and processes are available. This strong request causes considerable challenges concerning measurement and observation on the one hand, and concerning empirical modelling from experiments and / or "from first principles" on the other hand. Sadly enough, many disturbing quantities of the environment and the background keep interfering. They have to be considered structurally and fought systematically.

We will show how Mathematics, in the form of models, describes all these procedures of the real world. We assume that there are multivariate signal types (vector notation) with corresponding interrelations (matrix notation). Therefore, we anticipate that the following mathematical structures will mostly deal with vectors and matrices, which are handled by Linear Algebra (LA). We acknowledge the fact that, due to different reasons, models are never able to fully describe quantities and processes; they remain erroneous and uncertain. Mathematics cannot be blamed for this: we always deal with approximations and trade-offs. They have to be optimised gradually, until the models serve our needs sufficiently well. Though the concept of ideal against nonideal is a base in Measurement Science, it is not a direct implication of Pure Mathematics.

We do not consider examples of the vast applicational field of information acquisition. The reader will recognise the analogies between their needs and the concepts provided by Signal and System Theory (SST): some sort of transformation or translation will generate some sort of an individual terminological dictionary. We acknowledge that Mathematics is the master of many sub-sciences like Stochastics and Statistics, Signal and System Theory, Modelling, Optimisation, Regression Theory, Operations Research, and so on. After these preliminaries, we are ready to concentrate on the core task of the role of Mathematics in Measurement plus Observation. The following sections will serve as a survey and not as a tutorial.

2 Metrology — Mathematics in Metrology

It is an easy but delicate task to talk about Mathematics on the one hand and Measurement plus Observation on the other hand. Why? If we consult Philosophers of Science, we get several contradictory ideas about Mathematics and the real world. They provide few ideas concerning the mathematical description of the acquisition of information from the real world and the subsequent processing of the resulting data by Mathematics in an abstract world. This seems strange, since the acquisition of information from the real world (concrete reality), realised by so-called sensor processes, and its transmission to the mathematical world (virtual reality), manifests itself as one of two links between these two worlds. Even less discussed is the second link, leading from the mathematical to the real world, which is realised by so-called actors. We will mainly focus on the relation between Mathematics and Metrology.

What is Metrology good for, and what is Mathematics in Metrology good for? This is seemingly a trivial question: broadly speaking, we first generate information by measurement procedures. Then we process purposeful information by mathematical tools. However, this borderline is blurry. There is another, more challenging aspect: Measurement Technology (measurement processes and procedures) resides on the real world side, whereas Measurement Science (theoretical foundations and principles) resides on the abstract world side. Likewise, mathematical concepts and rules reside on the abstract world side, whereas tools of Applied Mathematics serve as "hardware" on the real world side [1; 2]. Commonalities between Measurement Science and Mathematics are obvious.

In order to make this important concept even more refined and consistent, items in those two worlds ought to have their appropriate, precise terms: in the real world, we speak of processes and their quantities, whereas, in the abstract world, we denote their models as systems and signals respectively [3]. Thus, we make clear which level we address. This seemingly arbitrary and not widely accepted concept is supported by the dominating field of Signal and System Theory (SST), which is deemed the Extended Theory of the Cause and Effect Principle on the one side, and is in fact Applied Mathematics on the other side. Signal and System Theory belongs entirely to the abstract world side. The following sections will concentrate on the abstract, mathematical side, on the logical and / or mathematical models of processes and their quantities, and on the respective mathematical tools. The ideas presented here should serve as a first scenario of the dual pair Measurement Science and Mathematics.

Now, let us proceed to concrete concepts of both sides with their relations and structures. We will start with our primary task of Metrology, the measurement procedure as postulated by Measurement Science. A Sensor Process is actually a real world process like any other process, but of course with its own well-known and well-defined particular task. As soon as we start talking about this process, we generate a model of it, which already resides on the abstract side. Unlike in some other fields, a quantitative, mathematical model is mandatory, no matter how simple it may be. It will consist of a set of more or less complex equations, which describe this real process.


3 Direct Mathematical Structures

3.1 Basic Mathematical Model

One starts with the well-known, general mathematical formulation y(t) = f{u(t)}. It shows the dependence of a time-dependent output signal y(t) on a time-dependent input signal u(t). Such a linear / nonlinear algebraic equation is a typical model of a general process and in particular of a sensor process: y(t) = g·u(t). Here g is the unit-afflicted parameter, normally called the transfer value. For the moment, that is all concerning the general description and even the behavioural description of the scalar sensor process with the measurand u(t) at the input. A graphical tool, the signal relation graph (SRG), visualises this simple mathematical model (Figure 1).

Figure 1. Description of an input-output system.

Note that we do not just talk about common quantities of the real world side, but likewise about purely abstract signals, for example models of characteristic values or functions of real world quantities, which do not have equivalents on the real world side. Signal and System Theory does not differentiate between models of abstract world quantities and models of real world quantities. It is a formidable fact and a welcome assurance that this basic formulation will remain the same for nonlinear, multivariable, dynamic processes, even with external and internal disturbances, errors, and uncertainties. Mathematical notations and structures are just slightly updated from case to case.

In a next, obvious step, we consider the important extension of regarding inner structures and quantities, thus abandoning the previous black box concept: inner quantities are modelled as multivariate state signals x(t). In addition to the ordinary input-output description, we thus get the input-state-output description of a general process, and thus of a sensor process. At the same time, we take the opportunity to decompose the multivariate signals: at the input, we will have the input signals u(t), the disturbance signals v(t), and the reference and condition signals w(t). At the output, we will have the output signals y(t) and the loading signals z(t). Note the succession of the letters u, v, w, x, y, z; easy to memorise (Figure 2)!

Figure 2. Description of an input-state-output system.

3.2 Basic Mathematical Tasks

Question: Which general mathematical issues or operations do we face for a given process model in input-output description? We know that the process model consists of multiple input signals, of multiple output signals, and of the logical and mathematical relations between these signals. These relations are characterised by structures and parameters. According to the cause and effect principle, there are three and only three possible task definitions for this basic configuration, in which always two items are given and one is searched. We best indicate this in the form of a table:

Task 1 (direct structure)
Given: input signals u; system (structure, system parameters p)
Searched: output signals y
Scope of tasks: conversion, mapping, control, experimenting, transformation, simulation, measurement, coding, convolution, error and uncertainty analysis, forecast, prediction

Task 2 (inverse structure)
Given: output signals y; system (structure, system parameters p)
Searched: input signals û
Scope of tasks: inversion, inference, reconstruction, restitution, deconvolution, decoding, reestablishment, diagnosis, estimation, conclusion

Task 3 (combined structure)
Given: input signals u; output signals y
Searched: system (structure, system parameters p)
Scope of tasks: system identification, model building, calibration, diagnosis, test

These three tasks require particular activities in Measurement Science and Technology on analytical and / or empirical bases. A precondition is always the availability of a minimal abstract model, even if it is only qualitative, incomplete, erroneous, and uncertain. The concept is important.
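As a toy illustration of these three tasks, using the simplest scalar model y(t) = g·u(t) from above (all numbers below are assumptions chosen only to make the roles of "given" and "searched" concrete):

```python
# Illustration (assumed numbers): the three basic tasks for the scalar model y = g*u.
import numpy as np

rng = np.random.default_rng(1)
u = np.linspace(0.0, 5.0, 20)           # input signals (assumed known)
g_true = 2.5                            # "true" transfer value, used only to simulate data

# Task 1 (direct structure): input and system given, output searched
y = g_true * u

# Task 2 (inverse structure): output and system given, input searched (reconstruction)
u_hat = y / g_true

# Task 3 (combined structure): input and output given, system parameter searched
# (least-squares estimate of g from noisy observations, i.e. identification / calibration)
y_noisy = y + rng.normal(0.0, 0.05, size=y.shape)
g_hat = np.dot(u, y_noisy) / np.dot(u, u)

print(g_hat)   # close to 2.5
```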


Remark: Frequently, we find the adjective "backward" added to task two, in which the procedure of inference is performed. Such and similar statements are misleading, because we clearly face a forward and not a backward operation in the sense of a series (cascade) connection in a chain. Besides, the generated results, the estimates û of the input signals u, are never identical with the original input signals u to be estimated. So, there is no backward incident.

Up to now, we have considered general models of processes and their quantities. Now, we will concentrate on measurement and observing procedures. All remarks made above will remain valid without exceptions.

3.3 Mathematics: Cause and Effect Concept

Most sciences accept the concept of cause and effect. Causes as well as effects are considered as defined quantities of the real world. If we want and are able to quantitatively describe linear and / or nonlinear relations or dependencies between these quantities, we call the resulting set of equations a mathematical model: y = f{u, p, t}. The signals u are models of the input quantities (causes), the signals y are models of the output quantities (effects), p are models of the parameters of the relating equations, and t is the independent time. Thereby, it is assumed that the quantities and parameters may be time dependent. Often, three independent space coordinates extend this structure, once we have to consider spatial phenomena. Other independent variables like frequency, wavelength, and so on, are common.

Given the relations between defined signals, where is the desired mathematical model of the process of interest? It is remarkable and important that the relationships between the involved signals, with structures and parameters, act as proxy descriptions (models) of real processes. In other words, mathematical signal relations are descriptions and representations of processes. Summing up, we have models of individual or assembled quantities (vectors) and, in addition, we have models of processes in the form of quantity relations. We call mathematical models of quantities signals and mathematical models of processes systems. Thus, we have two different levels, the real world (quantities and processes) and the abstract world (signals and systems).


It is obvious that a chosen mathematical description of quantities and processes is not unique. Quite different mathematical or empirical concepts and methods may serve very particular needs. On the other hand, a single type of model structure may serve innumerable, completely different processes in different fields, which in that case all belong to a special class of processes. Moreover, models are always erroneous and their statements are uncertain. This is a typical situation, which needs optimal trade-offs.

3.4 Mathematics: Description and Behaviour

We look for general descriptions, which include most current features, like multivariate quantity vectors, linear and nonlinear functions, nondynamic and dynamic phenomena, and deterministic and probabilistic issues. In principle, all these characteristics, defined by mathematical means, must apply to any process of interest. Process descriptions may become extremely complex, even for apparently simple situations. On the other hand, due to several reasons, we often are allowed to restrict ourselves to less demanding requests. This means that the mathematical structures will become simpler and simpler, eventually tending to the simplest extreme, the algebraic equation: model reduction. However, the discussion on approximation, representativeness, accuracy, and performance will expand simultaneously.

An ordinary differential equation (ODE) of Nth order serves as a popular example of a describing input-output model of a single-input, single-output (SISO) dynamic process, which is assumed to be linear and time-invariant (LTI):

$$a_N \frac{d^N y(t)}{dt^N} + \dots + a_n \frac{d^n y(t)}{dt^n} + \dots + a_1 \frac{dy(t)}{dt} + a_0\, y(t) = b_0\, u(t) + b_1 \frac{du(t)}{dt} + \dots + b_m \frac{d^m u(t)}{dt^m} + \dots + b_M \frac{d^M u(t)}{dt^M}$$

or, summarising,

$$\sum_{n=0}^{N} a_n \frac{d^n}{dt^n}\, y(t) \;=\; \sum_{m=0}^{M} b_m \frac{d^m}{dt^m}\, u(t)$$


Unfortunately, this type of scalar differential equation keeps being taught in schools: it is not notably handy and, by first-principles modelling, it comes at most as a model of third order anyway. However, a mathematical transformation to an equivalent, generalising set of N differential equations of first order (temporal description), or the transformation to the frequency domain (spectral description), will avoid mathematical difficulties: the State Description (SD) is the helpful tool [5; 7].

If probabilistic quantities are involved, which is often the case, they need adequate handling too. Such quantities cannot enter the normal differential equations directly. However, their characteristic values, like arithmetic mean values, variance values, correlation values, uncertainty values, as well as their characteristic functions, like probability density functions, correlation functions and so on, can in fact be variables in differential equations. They are independent or dependent abstract quantities, can individually be handled like normal quantities, but are caught within their own sets of equations. Finally, their behaviour will superimpose on the behaviour of the respective deterministic quantities at the output of a system.

The structure of a mathematically appropriate description which just shows the relation between the input quantity u(t) and the output quantity y(t) is insufficient in many cases. For example, we cannot assess and analyse relations within a process. Therefore we prefer the input-state-output concept, which accounts for the inner, time-dependent and multivariable state signals x(t) of a dynamic process. Signal and System Theory (SST) provides such a model type, the so-called State Description (SD), which primarily is adequate for a linear, dynamic process model of Nth order. Since it allows multivariate quantities, Linear Algebra (LA) [5; 7] with vectors and matrices is the predominant tool. This universally valid concept is still based on a set of ordinary, scalar differential equations, at large a vector-matrix differential equation. Four parameter matrices, A, B, C, and D, represent the internal relations and the external relations to and from the surroundings. These process parameters (individual transfer values) may respect nonlinear situations too. In this respect, this standard mathematical structure type offers far more possibilities than the ordinary differential equation of Nth order:

$$\dot{x}(t) = A\,x(t) + B\,u(t)$$
$$y(t) = C\,x(t) + D\,u(t)$$

with x the (inner) state and A, B, C, D the system matrices.


According to Linear Algebra (LA), we have an even more compact form:

$$\begin{bmatrix} \dot{x}(t) \\ y(t) \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix} \quad\text{or}\quad \begin{bmatrix} \dot{x}(t) \\ y(t) \end{bmatrix} = P \begin{bmatrix} x(t) \\ u(t) \end{bmatrix}$$

with P, the partitioned parameter matrix, containing all system matrices.

Note the simplicity of the structure between input, state, and output signals and the interrelating parameters. What does it look like in a graphical representation? We get an entirely unusual, but exact and complete mathematical description of a real world process in the following form (Figure 3):

Figure 3. Description of a dynamic, linear, time invariant (LTI), input-state-output system.

Thus, we face a strange situation: this visualisation of the vector-matrix differential equation does not show the usual integration procedures. Yet, we know from Mathematics that we have to solve (integrate) differential equations as soon as we are interested in the transitions of the output signals y(t) under given circumstances. This is the missing operation (Figure 4):

$$x(t) = \int_{t_0}^{t} \dot{x}(\tau)\, d\tau + x(t_0), \qquad x(t_0) = x_0$$

Figure 4. Description of the integrating (solving) system.

In Mathematics, this additional procedure is called solution (integration) of an ordinary differential equation.


For a total solution, we need a set of initial state signals x(0) at time t0 on the one hand, in order to make the expected transitions y(t) of the solution unique. On the other hand, the transitions of the input signals u(t) have to be given too. What does the graphical representation look like after the integration procedures (Figure 5)?

Figure 5. Behaviour of a dynamic, linear, time invariant (LTI), input-state-output system.

Preliminary Result: First, a set of model equations provides the general (representing) description of a process. Second, a set of solution equations regarding the output quantities y(t) provides the behavioural (simulating) description of a process (Figure 6).

Figure 6. General and behavioural description of a dynamic process.

Process description and process behaviour are not of the same category, but they are equivalent because of the connecting integrating transformation. This statement is important insofar as there is the alternate issue, where we have the process behaviour, obtained empirically by experiments for example, and we need the process description. It is obvious that in this case a connecting differentiating transformation, the inverse operation of the integrating transformation, is necessary. This will be revealed later on.

In the time domain, such a behavioural description of process P is given by the analytical solutions of all differential equations for specified input quantities u(t) and for the initial state signal values x(0), in the form of an integral equation [5; 7]:

$$y(t) = C\, e^{A t}\, x(0) + C \int_{0}^{t} e^{A (t-\tau)}\, B\, u(\tau)\, d\tau + D\, u(t)$$
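A minimal numerical sketch of evaluating this behavioural description, assuming an illustrative second-order LTI system (all matrices, signals and values below are assumptions, not taken from the text):

```python
# Sketch: numerical evaluation of the behavioural description of an LTI
# input-state-output system with SciPy; the model is an illustrative assumption.
import numpy as np
from scipy.signal import StateSpace, lsim

A = np.array([[0.0, 1.0],
              [-2.0, -0.8]])        # assumed system matrix
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

system = StateSpace(A, B, C, D)

t = np.linspace(0.0, 10.0, 501)     # time grid
u = np.ones_like(t)                 # step input u(t)
x0 = np.array([0.5, 0.0])           # initial state x(0)

# lsim evaluates y(t) = C e^{At} x(0) + C * integral of e^{A(t-tau)} B u(tau) dtau + D u(t)
t_out, y, x = lsim(system, U=u, T=t, X0=x0)
print(y[-1])                        # value approached by the step response
```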


One might recognise the similarity with the so-called convolution integral. Of course, all four system matrices appear, and the two exponential functions indicate a linear, time-invariant (LTI) system. The output signal vector y(t), as a representative of the total response, is in principle a linear combination of N scalar exponential terms. The first term is the so-called zero-input response, the second term the so-called zero-state response, and the third term the so-called throughput response. Of course, for a simple single-input, single-output (SISO) system, the behavioural description can be written down accordingly; scalar values take the place of vectors and matrices.

For theoretical and practical reasons, particular input signals u(t) are favoured, for example the delta impulse function, the step function and the ramp function, which are related by integrating and differentiating operations respectively. An important player as an impact signal is the harmonic function, leading to the well-known frequency response functions. Not to forget: random functions with varying parameters are necessary at the input of a Monte Carlo Simulating Observer (MCSO). All these particular signals provide corresponding analytical response functions by means of the behavioural description.

Admittedly, though the concept of State Description (SD) seems much more demanding than a usual set of simple but unstructured equations, a minimal familiarisation will reveal major benefits. Yet, since several mathematical similarity transformations in Linear Algebra (LA) may be applied to a basic set of equations, the parameter matrices may change and hide the familiar physical parameters. But the overall equivalence with other versions is always preserved. In other words, the behaviour of the diverse models remains the same.

3.5 Mathematics: Errors

3.5.1 Error Definition

What about errors, which are so important, not just in Metrology? Do we need other structures and rules? The answer is "Not at all"! As indicated already, Signal and System Theory (SST) treats abstract quantities, like errors and uncertainties, just as any other quantities. Since we want to continue working with the previous models of processes, and particularly with measurement processes, we define, as a start, an ideal, nominal measurement process MN as a reference. Here, the simple but extremely important Fundamental Axiom of Metrology (FAM) comes in. It states that the resulting quantities ŷ(t) of the nominal measurement process MN, with its nominal behaviour, have to equal the measurement quantities y(t) of interest at any time and at any location. So far, so good. Fortunately, we are able to describe this request mathematically by the Fundamental Equation of Metrology:

$$\hat{y}(t) = G_N\, y(t) = I\, y(t)$$

GN signifies symbolically a transfer function matrix in the case of the ideal (nominal) system, which consequently has to equal the identity matrix I. We readily readjust this relation so that it includes possible measurement errors ey(t). Thereby we define these errors as the differences (deviations) between the measurands y(t) and the measurement results ŷ(t). For the desirable, yet fictitious, ideal measurement process MN, this customised equation should always equal the zero vector o:

$$\hat{y}(t) - y(t) = e_y(t) = o$$

This definition enables us to visualise the general structure of a nonideal measurement system M, where G signifies symbolically a global transfer function matrix of the nonideal system (Figure 7).

Figure 7. Nonideal, dynamic measurement process M with error quantities ey(t).

3.5.2 Error Types

By means of the previous mathematical structures, we now examine, in a top-down manner, which error structures we have to face. We start with the earlier model, where we decomposed the input quantities as well as the output quantities (Figure 8).

Figure 8. Measurement process M with disturbance signals vM(t) and loading signals zM(t).


This leads to the following set of algebraic equations, which we assign to the nonideal, nondynamic measurement process M of interest:

$$\begin{bmatrix} y_M(t) \\ z_M(t) \end{bmatrix} = \begin{bmatrix} G_{yu} & G_{yv} \\ G_{zu} & G_{zv} \end{bmatrix} \begin{bmatrix} u_M(t) \\ v_M(t) \end{bmatrix} = G \begin{bmatrix} u_M(t) \\ v_M(t) \end{bmatrix}$$

In a second step, we adapt and simplify this set of equations with regard to the ideal, nondynamic measurement process MN: The direct measurement path has to have the transfer matrix Gyu = I and we postulate ( ! ) that all other paths, which could contribute to measurement errors, are blocked:  yM (t)  I 0  uM (t)     z (t)  0 0   v (t)  M   M  

u (t)  GN  M   vM (t)

This delivers the corresponding structure of the idealised measurement process (Figure 9).

Figure 9. Idealised (nominal), nondynamic measurement process MN.

We recognise the nonideal situation if these requirements are not fulfilled. We see the disturbance path and the disturbance-load path, which lead to disturbance errors and to load errors. These paths transmit all external influences vM(t) from outside the measurement process, like changes from defined reference states of temperature, pressure, vibration, radiation and so on. Moreover, we see the measurement path and the measurement-load path, which lead to internal measurement errors and to load errors. Load errors interact by detour via the preceding process and its quantities uM(t) to be measured. These paths are only indicated here. Finally, internal errors arise due to the nonideal measurement path, which we will consider later on.


Thus, we have three and only three error types: internal errors, disturbance errors and loading errors. These errors may be systematic and / or random, multivariate and multidimensional. Temporarily, we disregard the disturbance and loading effects and concentrate on the internal errors, which may come from nonideal transfer characteristics, dynamic effects, modelling errors and so on. In order to gain an idea of the underlying structures, we decompose the nonideal measurement system M into the defined or assumed ideal (nominal) system MN and an empirically modelled error system E. The error system E merges all nonideal relations, gathered in the error transfer function matrix GE. The two parallel paths both act on the output signals y(t), which matches our earlier definition: ŷ(t) = y(t) + ey(t) (Figure 10).

Figure 10. Nondynamic measurement process M with internal error system E.

It should be mentioned that there are three and only three possibilities for such an abstract separation of the error system E from the nominal system MN: series connection, parallel connection and feedback connection. All three are used and are equivalent. They may be chosen according to applicational needs (Figure 11).

Figure 11. Three possibilities of a separation of an error system E.


Considering all effects, the internal nonideal effects eyu(t), the external disturbance effects v(t) on the measurement system and the external back-loading effects z(t) from the measurement system, we get an extended structure of error formation (Figure 12).

Figure 12. Extended, nonideal measurement system with error system E.

Now we are able to extend this model in the direction of dynamical systems with differential equations by means of the already well-known State Description (SD) with its standard inner structures. It combines all effects in one single package (Figure 13):

$$\dot{x}(t) = A\,x(t) + B\,u(t) + E\,v(t)$$
$$y(t) = C\,x(t) + D\,u(t) + F\,v(t)$$
$$z(t) = G\,x(t) + H\,u(t) + J\,v(t)$$

with v(t) the disturbing quantities as a second input vector and z(t) the back-loading quantities as a second output vector.

Again, we prefer the compact version:

$$\begin{bmatrix} \dot{x}(t) \\ y(t) \\ z(t) \end{bmatrix} = \begin{bmatrix} A & B & E \\ C & D & F \\ G & H & J \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \\ v(t) \end{bmatrix} = P \begin{bmatrix} x(t) \\ u(t) \\ v(t) \end{bmatrix}$$


Figure 13. Nonideal dynamic measurement system with error system ME.
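A rough sketch of how the partitioned parameter matrix P above can be assembled and evaluated numerically (all matrices and dimensions below are illustrative assumptions, not values from the text):

```python
# Assembling the partitioned parameter matrix P = [[A, B, E], [C, D, F], [G, H, J]]
# for the extended state description with disturbances v(t) and back-loading z(t).
# All matrices are illustrative assumptions.
import numpy as np

n, p, q = 2, 1, 1                      # numbers of states, inputs, disturbances (assumed)
A = np.array([[0.0, 1.0], [-4.0, -1.2]])
B = np.array([[0.0], [1.0]])
E = np.array([[0.0], [0.3]])           # disturbance input matrix
C = np.array([[1.0, 0.0]])
D = np.zeros((1, p))
F = np.array([[0.1]])                  # direct disturbance feed-through
G = np.array([[0.0, 0.5]])             # back-loading from the state
H = np.zeros((1, p))
J = np.zeros((1, q))

P = np.block([[A, B, E],
              [C, D, F],
              [G, H, J]])

# One evaluation of the compact relation [x'; y; z] = P [x; u; v]
x = np.array([0.2, 0.0])
u = np.array([1.0])
v = np.array([0.05])
xdot_y_z = P @ np.concatenate([x, u, v])
xdot, y, z = xdot_y_z[:n], xdot_y_z[n:n+1], xdot_y_z[n+1:]
print(xdot, y, z)
```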

Note that this concept of three error types has been developed in a top-down approach in one single mathematical structure. Any practical error analysis has to assign all occurring errors to one of these three error types and has to implement them in this basic structure. Without further details, a final structure is presented: The important error topic appears in a remarkable, beautiful, mathematical symmetry. It reveals the systematic and consistent methodology of Signal and System Theory (Figure 14).

Figure 14. Nonideal dynamic measurement system with error system ME.


A factorisation (decomposition) leads in the direction of a standard structure (Figure 15):

Figure 15. Separation of ideal measurement system MN from error system ME.

And here the summarising total error vector e shows up:

$$e = \begin{bmatrix} e_y \\ e_z \end{bmatrix} = G_E \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} G_{Eyu} & G_{Eyv} \\ G_{Ezu} & G_{Ezv} \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}$$

A final condensed form is the so-called Redheffer Star Product Matrix S:

$$\begin{bmatrix} y \\ z \end{bmatrix} = S\{G_N, G_E\} \begin{bmatrix} u \\ v \end{bmatrix}$$

Again: for the ideal measurement process, all four error transfer matrices GE have to be zero.

3.5.3 Uncertainty

Uncertainties represent the degree of lack of knowledge concerning phenomena and facts. Any statement is accompanied by some uncertainty. However, this is seldom declared. Regarding measurement uncertainties u, the same concept applies as for normal signals. But, since uncertainties are considered random signals, their respective characteristic values, like mean values, variance values, correlation values, as well as their characteristic functions, like probability density functions, correlation functions and so on, will become the variables instead. Though the mathematical structures and relations are the same, the signals described by these characteristic values and functions cannot merge with the respective deterministic signals. The twofold declaration of quantities and their uncertainties is the usual practice [6] (Figure 16).


Figure 16. Quantities, errors and their uncertainties.
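As a hedged illustration of how such uncertainty declarations can be propagated through a linear multivariate model (a GUM-style sketch with purely assumed numbers, not a procedure prescribed by the text):

```python
# Propagation of uncertainty through a linear multivariate model y = G u
# (illustrative transfer matrix and covariances only; not data from the text).
import numpy as np

G = np.array([[1.0, 0.2],
              [0.0, 0.9]])                  # assumed transfer matrix
u = np.array([10.0, 5.0])                   # input estimates
U_u = np.diag([0.02**2, 0.05**2])           # input covariance (squared standard uncertainties)

y = G @ u                                   # output estimates
U_y = G @ U_u @ G.T                         # propagated output covariance
u_y = np.sqrt(np.diag(U_y))                 # standard uncertainties of the outputs
print(y, u_y)
```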

4 Inverse Mathematical Structures

Mathematically inverse procedures are undertaken in many fields. They primarily reverse other procedures. This means that mathematical operations, and functions respectively, have to be mathematically invertible. We know of many particular operations with corresponding inverting operations. Talking about mathematical models of real world processes, we also want to use inverse models. Yet, this arouses suspicion, since we know that in real world procedures, reversions are only possible under very restrictive conditions. Within the abstract reality, such restrictions do not seem so rigid. However, for mathematical inversions, additional limits and constraints exist, which prevent unproblematic inverse functions. Therefore, many detours must be made in everyday practice to circumvent the strict constraints of mathematical inversion. Examples are so-called Pseudoinverses and Numerical Approximations.

An inverse function inverts the original function, so that f⁻¹{f(x)} = x. Considering practical applications, especially Metrology, we will use the previous direct relations between signals, provided by Signal and System Theory (SST), wondering what inverse relations between signals could look like. Since we use vector-matrix equations by Linear Algebra (LA), the inversion of matrices will be an important topic. As shown earlier, one of the three task definitions of modelling uses inverting procedures to determine system parameters or system inputs, given system outputs and system structures.

4.1 Inverse Mathematical Sensor Model

An example of utmost importance in Metrology is the inversion of a sensor process model. Normally, a sensor process S delivers quantities yS(t) in the form of voltages, currents, charges, frequencies and so on. They do not particularly catch our interest (Figure 17).


Figure 17. Example: measurand p, sensor output voltage up, and measurement result p̂.

The so-called reconstruction system R is the inverted sensor model S. It delivers the estimate of the measurand (Figure 18).

Figure 18. Sensor process S and reconstruction process.

Such an inferring procedure is commonly called reconstruction. The terms inversion, inference, reconstruction, restitution, deconvolution, decoding, reestablishment, diagnosis, estimation, conclusion are also accepted. We use the statement of the Fundamental Axiom of Metrology (FAM) again, which claims that measurement results ŷ(t) of a measurement process M have to equal the measurands y(t) of interest:

$$\hat{y}(t) = G_M\, y(t) = I\, y(t) \qquad\text{or}\qquad \hat{y}(t) - y(t) = o \qquad\text{or}\qquad e_y(t) = o$$

Thus, the model of the necessary reconstruction process R must be the inverse model of the sensor process S: GM = GR·GS = I, or GR = GS⁻¹. Again, this means that a mathematical model of the sensor process S must be available.
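A minimal sketch of this reconstruction requirement for a static, multivariate sensor model (the matrix GS and the simulated readings below are assumptions for illustration):

```python
# Reconstruction R as the inverse of a static sensor model S: u_hat = G_S^{-1} y_S.
# G_S and the simulated readings are illustrative assumptions.
import numpy as np

G_S = np.array([[2.0, 0.1],
                [0.0, 1.5]])           # assumed sensor transfer matrix (must be invertible)
u_true = np.array([1.2, 0.7])          # measurands (unknown in practice)
y_S = G_S @ u_true                     # raw sensor outputs (e.g. voltages)

G_R = np.linalg.inv(G_S)               # reconstruction model, G_R = G_S^{-1}
u_hat = G_R @ y_S                      # estimates of the measurands
print(u_hat)                           # close to [1.2, 0.7], so G_R G_S = I is satisfied

# If G_S is not square (more outputs than inputs), a least-squares pseudoinverse
# can take the place of the exact inverse: G_R = np.linalg.pinv(G_S).
```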


4.2 Measurement Error Correction by an Inverse Mathematical Sensor Model

We know that sensor processes are erroneous. If modelling of errors is possible, for example modelling of internal errors and / or of disturbance errors, then, consequently, the reconstruction process R is able to execute the error correction automatically. This insight is of enormous importance and immediately suggests the correct strategy for a measurement process M, as presupposed by the Guide to the Expression of Uncertainty in Measurement (GUM) [6]. Similar terms in this field, like scaling, inference, deconvolution, error correction, linearisation, decoupling, inverse dynamics and so on, are just synonyms. Such a reconstruction process is more or less successful, which means that reconstruction errors remain. They depend on the quality of the given sensor model and on the reconstruction procedure itself. No other correction process will do better; only the reconstruction process concept complies with the basic concepts of Metrology.

We saw three separation structures within the nonideal model: the model of the error process E and the model of the ideal sensor process SN can be connected in a series, parallel, and feedback manner (Figure 11). Consequently, there must follow three possibilities for the inversion of these three structures with the corresponding transfer values g. This general situation is indicated symbolically in the following signal relation graph (Figure 19).

Figure 19. Connections of two subprocesses and their respective reverse connections.


The correspondences of the structures due to inversion are obvious:

• The inverse of a series connection remains a series connection.
• The inverse of a parallel connection is a feedback connection.
• The inverse of a feedback connection is a parallel connection.

Note the obvious, nice vertical symmetry line, which is due to mathematical inversion. The following example shows the error correction by inversion for an extreme situation of a disturbing influence on a sensor process, in the specific case of a temperature-disturbed pressure sensor (Figure 20). Again, note the symmetry of the measurement structure.

Figure 20. Correction of a disturbance error in a sensor process S by an inversion procedure in a reconstruction process R.

Another example needs a correction of an internal dynamic sensor error, characterised by a so-called time value T. The modelled dynamic error process E, here in the form of the well-known impulse response function, acts together with the defined nominal sensor process SN in a parallel connection on to the output quantity yS(t). So, the mathematically inverting reconstruction process R must show a feedback structure: The easy inverse of the simple nominal sensor process model SN is put into the forward path and the more complex error process E, fortunately not to be inverted, is put unchanged into the feedback path. This allows a very elegant implementation of the reconstruction process on any processor (Figure 21).


Figure 21. Correction of a dynamic error in a sensor process S by an inversion procedure in a reconstruction process R.
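The following sketch illustrates the underlying idea numerically for an assumed first-order (low-pass) sensor with time value (time constant) T. It uses a simplified direct finite-difference inversion rather than the feedback implementation of Figure 21, and all numbers are illustrative assumptions:

```python
# Dynamic error correction by (approximate) model inversion.
# Assumed sensor model: T * dy/dt + y = u  (first-order low pass with time constant T).
# Reconstruction:       u_hat = y + T * dy/dt, here realised by finite differences.
# All numbers are illustrative assumptions.
import numpy as np

T = 0.5                                   # assumed sensor time constant in seconds
dt = 0.001
t = np.arange(0.0, 5.0, dt)
u = (t > 1.0).astype(float)               # true measurand: a step at t = 1 s

# Simulate the sensor output y(t) with a simple forward-Euler step
y = np.zeros_like(u)
for k in range(1, len(t)):
    y[k] = y[k-1] + dt * (u[k-1] - y[k-1]) / T

# Inverse (differentiating) reconstruction; np.gradient approximates dy/dt
u_hat = y + T * np.gradient(y, dt)

# Note: with measurement noise on y, the differentiation amplifies the noise,
# which is exactly the caveat raised in the remark below.
print(np.max(np.abs(u_hat[2000:] - u[2000:])))   # small reconstruction error after the step
```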

Remark: Sensor processes often show the behaviour of a low-pass filter, which is described by an integrating relation. So, the inverting reconstruction process R will be described by a differentiating relation, which corresponds to the behaviour of a high-pass filter. In fact, this reconstruction process R describes a differentiating procedure, which may not be visible at first sight. By the way, this reconstruction example could equally show the justified inversion procedure of a dynamic sensor system described by the convolution integral. The respective inverse procedure would be the deconvolution procedure. This quite common topic is known under the term Inverse Dynamics. This concept may produce amplified noise in the resulting signals, due to the ever-present differentiating operations. Special attention is therefore recommended!

5 Observational Mathematical Structures

The procedure measurement is not the only source of quantitative perceptions and of the production of data and information in Metrology. We are familiar with other, but strongly related, principles like observing, acquiring, collecting, monitoring, testing, inspecting, calibrating, identifying, surveying, simulating, estimating, diagnosing, predicting, inferring, perceiving, experiencing, and so on. Finally, they all lead to meaningful and reliable metrological results. This section concentrates on the two most notable terms, measurement and observation, which are our main interactions with the real world, and our source of data and information, as well as of understanding and knowledge. The results of both procedures represent the real world. The basic mathematical structures of this joint task within Metrology are the same. However, there are some subtle distinctions.


Measurement processes use sensoric sub-processes to acquire objective, quantitative, and accurate data and information about properly defined quantities. They establish physical interactions within the real world, and are the main components in the so-called measurement domain. These remarks concern sensory activities in other fields too, especially in nature.

Unfortunately, similar definitions concerning the procedures of observation are not established. So again, we rely on Signal and System Theory, which at least delivers an overall definition of a class of observers, one of which is the famous Kalman Filter (KF) [8]. First of all, we claim that observation means any active processing of any kind of data and information and, most importantly, on the basis of given sensory data of any kind. This means that Metrological Measurement is possible without Observation, but Metrological Observation is not possible without Measurement. Or: There is No Observation without Measurement. Since we distinguish between sensory and processing activities, we use the term Measurement plus Observation. Or: An observation process O simply extends the ordinary measurement process M and its capability. This explains the extended term metrological observation (Figure 22).

Figure 22. Measurement plus observation.

Coming back to Mathematics, we assume that the procedure of observation will reveal the same mathematical structure as any other procedure, dealing with signals and systems. This main structure has already been given (Figure 23).

Figure 23. Description of an input-state-output system.

We assume that we have a model of process P. About this process, we may already have some sensory data by means of the measurement process M, and we would like to get some additional data, which we cannot obtain by measurement. The observation process O shall deliver them (Figure 24).


Figure 24. Overall structure of process P, measurement process M and observation process O.

As the former table of the three and only three tasks around a given system shows, we look for a similar classification, stating which mathematical observer structures can deliver process data and under which particular conditions. The conditions are the quantities around the process P. They are at our disposal for further data processing. Due to combinatorics, we get a set of four exclusive standard structures:

1. Simulating Observation Process SO: no sensoric data about the process available
2. Open-Loop Observation Process OLO: only sensoric data about the input quantities u(t) available
3. Reconstructing Observation Process RO: only sensoric data about the output quantities y(t) available
4. Closed-Loop Observation Process CLO: sensoric data about the input quantities u(t) and the output quantities y(t) available

These structures are all in use, but in practice they are not known by this systematic classification [10]. Following the term measurement estimates, we call the results of an observation procedure observation estimates. As usual, the performance of any observation procedure is not ideal, and we must also define, analyse, and declare the observation error eobs(t) and the observation uncertainty uobs(t). The original canonical model of a dynamic system has already been extended to two input quantity vectors and two output quantity vectors. A particular denomination concerning observers makes this model generally suitable for all four observation systems O to be treated. Adaptations are made by simplifications (Figure 25).


Figure 25. Extended inner structure of a dynamic observation process O.

Let's keep in mind that this concept fortunately includes the description and structure of nondynamic systems, of dynamic systems in steady (equilibrium, static) state, and other special cases as well. The following sections will show the four observation processes by means of graphical structures. The corresponding sets of observation equations can be found in the literature [10].

5.1 Simulating Observation Process

In contradiction to the statement made above, claiming that observation without sensoric data is not possible, this special case is not based on any real measurement process M concerning the process P of interest. We are forced to provide appropriate time domain or frequency domain estimates uest(t) (dummy signals) at the input. Such data have to be produced artificially by a Function Generator FG (Figure 26).

Figure 26. Simulating observation process SO.

The simulation results, namely the estimates of the state signals xobs(t) and of the output signals yobs(t), represent the behaviour of the process P of interest.


If the input quantities are multivariate random quantities, the simulating observation process is called a Monte Carlo Simulating Observer (MCSO). The Function Generator FG will deliver individual random quantities, each with defined probabilistic properties, especially with defined probability density functions (pdf). Normally, the process model PM is deterministic. Nevertheless, the simulation results will be random quantities. We need their statistical characteristics (mean values, variance values, correlation functions, probability density functions, spectral power density functions and so on).
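A compact sketch of such a Monte Carlo Simulating Observer for a static, deterministic process model (the model and the probability density functions of the inputs are assumptions chosen only for illustration):

```python
# Monte Carlo Simulating Observer (MCSO) sketch: random inputs with assumed pdfs
# are propagated through a deterministic process model; the statistics of the
# simulated outputs are then evaluated. Model and distributions are illustrative.
import numpy as np

rng = np.random.default_rng(42)

def process_model(u1, u2):
    # assumed deterministic (nonlinear) process model
    return u1 * u2 + 0.1 * u1**2

N = 100_000
u1 = rng.normal(10.0, 0.05, N)        # input 1: normal pdf (assumed)
u2 = rng.uniform(0.95, 1.05, N)       # input 2: rectangular pdf (assumed)

y = process_model(u1, u2)             # simulated output quantities

print(y.mean(), y.std(ddof=1))        # mean value and standard deviation of the output
```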

5.2 Open-Loop Observation Process

Only the input quantity vector u(t) of the process P is acquirable by the measurement process MU. The output quantity vector y(t) and the state quantity vector x(t) are immeasurable and must be observed. The process model PM works parallel to and synchronously with process P. The two sub-vectors of interest, the estimated state quantity vector xobs(t) as well as the estimated output quantity vector yobs(t), again constitute the output quantity vector yOLO(t) (Figure 27).

Figure 27. Open-loop observation process OLO.

5.3 Reconstructing Observation Process

In everyday life, we infer from some information to the interesting causes, and we thus subconsciously apply the inverse cause-and-effect principle. Until now, we have been trying to estimate certain or all state (inner) quantities x(t) and certain or all output quantities y(t) from an input measurement u(t). We now try to estimate certain or all state (inner) quantities x(t) and certain or all input quantities u(t) from an output measurement ŷ(t), by means of inferring procedures. The already known reconstructing observation process RO with the inverse process model PMI accomplishes this procedure (Figure 28).

Figure 28. Reconstructing observation process RO.

One of the most remarkable examples is the reconstructing observation process RO following each sensor process S in a measurement process M, which thus provides the desired global transfer matrix GM = I. And here, severe constraints come in. The first problem is the invertibility of the model [9]. Additionally, we have to respect the so-called observability and controllability, a well-known task in Signal and System Theory [7]. At least, this means that structures or substructures which have to be inverted must have the same number of output quantities as there are input quantities. Consequently, matrices and sub-matrices respectively have to be square. Many reconstruction tasks fail because of the tight conditions of invertibility and observability. Normally the number of state quantities x(t) is higher than the number of output quantities y(t). Of course, there are special constellations, like independent or uncoupled signal paths, which allow direct access to certain state quantities.

5.4 Closed-Loop Observation Process

The most capable observation process, the closed-loop observation process CLO, measures the input quantities u(t) as well as the output quantities y(t) of process P [7; 8; 10]. This is an enormous advantage. We can now compare the measured output quantities ŷ(t) (set point) with the observed output quantities yobs(t). Should deviations eyobs(t) between them appear, which we call output error signals, we would assume that the model PM of the process P is erroneous and should be corrected.


As a measure for such a correction, we would utilise these measurable output error signals eyobs(t), weigh them by the observation controller matrix L (control rule), and use the results of this observation controller within the process model PM. This procedure adds a feedback path to the observation structure, so that the open-loop observation process OLO becomes a closed-loop observation process CLO. System Theory describes this task in detail [7; 8; 10] (Figure 29).

Figure 29. Closed loop observation process CLO.

In this configuration, the main topic is the state reconstruction, the observation of the inner state quantities x(t). Variations of this main structure exist and head for important practical applications. The behaviour and stability of this observation process have to be evaluated. The preconditions are controllability as well as observability. Both criteria depend on the process model PM and can easily be checked and proved. Note that suitable structures of the observation process enable the determination of N inner state quantities x(t) with the help of one single output quantity y(t) only, if observability is guaranteed. This statement is at least surprising and certainly not intuitive. Summing up, the most versatile observer is the closed-loop observation process, which is the prime source for extended structures, conventionally considered as filters. Of course, the procedure is based on correct sensoric results. Countless detailed descriptions and instructions concerning closed-loop observation processes can be found in the literature [8].
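A minimal discrete-time sketch of such a closed-loop observer (a Luenberger-type structure; the system matrices, the observer gain L and the signals are illustrative assumptions, not a worked design):

```python
# Closed-loop (Luenberger-type) observation sketch in discrete time.
# Observer update: x_obs[k+1] = A x_obs[k] + B u[k] + L (y[k] - C x_obs[k]).
# All matrices, the gain L and the signals are illustrative assumptions.
import numpy as np

A = np.array([[1.0, 0.1],
              [-0.2, 0.95]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
L = np.array([[0.4], [0.5]])           # assumed observer gain (not an optimised design)

# Observability check: rank of [C; CA] must equal the number of states
O = np.vstack([C, C @ A])
assert np.linalg.matrix_rank(O) == A.shape[0]

steps = 200
x = np.array([[1.0], [0.0]])           # true (hidden) state
x_obs = np.zeros((2, 1))               # observer starts from a wrong state
u = np.ones((1, 1))                    # constant input

for _ in range(steps):
    y = C @ x                          # measured output
    x_obs = A @ x_obs + B @ u + L @ (y - C @ x_obs)
    x = A @ x + B @ u

print(np.abs(x - x_obs).max())         # the state estimation error has decayed
```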


5.5 Extended Observation Process Structures

There are several extended and combined substructures of these four main observation processes. They are more complex and sophisticated, and serve additional practical needs:

• Least Squares Observation Process LSO (Kalman Observer KO)
• Minimal (Reduced) Order Observation Process MOO
• Unknown Input Observation Process UIO
• Nonlinear Observation Process NLO

Without exception, they require a deeper understanding of theoretical prerequisites. Their design is demanding in most cases [8].

6 Conclusion — Mathematical World?

How do we get to know the real world? As soon as quantities within the real world are properly defined, Mathematics provides several tools to describe the real world, at least to some degree. The most important sub-tools are Process Modelling, Signal and System Theory, Stochastics and Statistics, Optimisation, and Measurement and Observation Theory. We always start with a linear, time-invariant (LTI), multivariable (MIMO), dynamic process. Such a versatile process includes most measurement and observation processes too. Consequently, we need signal vectors and parameter matrices for a multiple input, multiple output system. This fact leads us directly to the seemingly abstract State Description (SD), which provides a feasible, converging, and unifying description of measurement and observation procedures in a top-down perspective.

It is remarkable that all common tasks handled by Mathematics use just one basic model, whose structure is adapted only rudimentarily in order to respect special needs. Thereby, we have met two sorts of mathematical description: first, the general description, which represents the process of interest, normally in the form of differential equations, and second, the behavioural description, which describes the behaviour under particular, defined circumstances. It has been shown too that real world processes are only indirectly described by Mathematics. This is done by using relations between the models of defined deterministic and random quantities.


Structures of system identification (calibration) reside in the range of the four observer processes with known input and output quantities too. They have not been covered here.

It has to be mentioned and rightfully admitted that not all of the seemingly simple mathematical structures mentioned above solve the many tiny side problems at hand. Applicational needs and computational power on the one hand, and disabilities and ignorance on the other hand, can lead to considerable difficulties in particular fields. They force us to call on additional mathematical tools. Mathematical difficulties may also arise from nonlinearities, from the numerical solution of equations, from the inversion of vector-matrix equations, from differentiation and integration by arithmetic means, and from the application of linear and nonlinear least-squares tools, for example. Normally, they request compromises, which come along with errors and uncertainties. However, one of many inconveniences stems from the lack of proper analytical and / or empirical models, that is, from insufficient a-priori knowledge about processes. It is the aim of this paper to show the general, but dedicated, mathematical structures in charge. Minor details prove to be mostly field specific.

Besides, it is doubtful whether the presented facts and structures justify the famous claim that the "world is a mathematical one" [2; 11; 12]. It is a fact that today the real world can be best described by mathematical relations and appropriately represented by abstract models to an amazing completeness indeed. Still, it remains an open-ended procedure. However, are such abstract descriptions and models the real world?

Finally, a famous example: the so-called wave function in quantum physics may collapse during an experiment. This formulation may suggest that a wave function is part of the real world. This, however, is not the case. It is more appropriate to communicate that this particular movement (state) of the observed particle is mathematically described by a probability density function (pdf) wave.

References

[1] Popper, K. R.; Three Worlds; The Tanner Lecture on Human Values, The University of Michigan (1978); http://tannerlectures.utah.edu/_documents/a-to-z/p/popper80.pdf
[2] Cartwright, N.; Why Physical Laws Lie; Clarendon (1983)


[3] Ruhm, K. H.; Process and System - A Dual Definition, Revisited with Consequences in Metrology; IOP Publishing, Journal of Physics: Conference Series 238 (2010) 012037
[4] Hall, D. L.; McMullen, S. A.; Mathematical Techniques in Multisensor Data Fusion; Artech House, Boston (2004, 2nd ed.)
[5] Strang, G.; Differential Equations and Linear Algebra; Wellesley-Cambridge Press (2014)
[6] GUM: Guide to the Expression of Uncertainty in Measurement; BIPM, JCGM 100:2008 (2008)
[7] Brogan, W. L.; Modern Control Theory; Prentice Hall, Englewood Cliffs (1991)
[8] O'Reilly, J.; Observers for Linear Systems; Academic Press, London (1983)
[9] Buchholz, J. J.; v. Grünhagen, W.; Inversion Impossible?; Bremen (2008); http://www.buchholz.hs-bremen.de/inversion/inversion_impossible.pdf
[10] Ruhm, K. H.; Measurement plus Observation – A New Structure in Metrology; J. Measurement, Elsevier; Online 2017, In Press 2018; http://dx.doi.org/10.1016/j.measurement.2017.03.040
[11] Tegmark, M.; Our Mathematical Universe; Random House (2014)
[12] Barrow, J. D.; Perché il mondo è matematico; Rome (1992) (in Italian)

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 119–127)

Fundaments of measurement for computationally-intensive metrology Paul J. Scott CPT, University of Huddersfield, Queensgate, Huddersfield, HD1 3DH, UK E-mail: [email protected] Results are presented from a project between European national laboratories (EUROMET), entitled “Traceability for Computationally-Intensive Metrology” (JRP NEW06 (TraCIM)) to demonstrate metrology software is fit for purpose. The paper begins by presenting a new mathematical model for measurement, based on “measurement as an inverse problem” that has consequences for reconstruction algorithms in many modern complex measuring instruments such as CT scanners and white light interferometers. Since metrology software is part of the measuring procedure the “measurement as an inverse problem” model applies and can be used as a foundation to derive properties and build tests for metrology software. Keywords: Metrology software testing, Metrology as an inverse problem, Adjoint functors.

1. Introduction
There is much literature on testing general software, including using datasets (softgauges) with known properties as input data, for which the output results from a given algorithm are also known to a specified tolerance. The question is: is there anything special about measurement computations that can facilitate testing the correctness and limitations of measurement computations? This is especially true for computationally-intensive metrology, where the software can be very complex indeed and methods are required that can ease the testing of the metrology software. Traceability, measurement standards, and quality systems all demand that computational links are demonstrated to be fit for purpose, but:
• There is no coherent framework for testing metrology software. Software developers and measuring equipment suppliers have to fill these gaps with their own ad hoc approaches.
• For software performing complex computation, adequate testing is difficult without an effective method of knowing if the software is producing accurate results.
• Computational software depends on finite-precision arithmetic, and it is not sufficient to claim that the software is bug-free.
• For difficult computations, approximate solution methods are used but there is no way of demonstrating that such methods provide sufficiently accurate results.

This JRP delivered a new approach to validating computationally-intensive metrology, developed using new mathematics and numerical analysis and exploiting state-of-the-art ICT. A new mathematical model for measurement, based on "measurement as an inverse problem", is presented (Scott and Forbes [1]), which extends the current "representational model of measurement" paradigm. The properties of this new model are described in detail; the model has consequences for reconstruction algorithms, which are now an essential part of many modern complex measuring instruments such as CT scanners and white light interferometers. As the ancient Chinese philosopher Lao Tzu says [2], "anything that is built must rest on a foundation". Since metrology software is part of the measuring process, the "measurement as an inverse problem" model applies and can be used as a foundation to derive properties and build the testing methodologies for metrology software, in particular how to generate suitable test data and how to find appropriate metrics to compare the software under test to either reference software or reference data results.
2. Measurement theory
2.1. State-of-the-art
Measurement is fundamental to obtaining scientific knowledge. For over a hundred years philosophers, physicists, mathematicians, social scientists and others have pursued the definition or analysis of the concept of measurement. The representational theory of measurement has gained wide support among measurement theorists and is the current dominant paradigm [3-6]. The representational theory of measurement considers measurement to consist of the following:
1. A set of objects on which a measurand (quantity to be measured, ISO/IEC Guide 99 [7]) is defined, together with an empirical relational structure specifying the relationships between measurands.
2. A set of numbers (measured values) together with a numerical relational structure.
3. A set of mappings (homomorphisms), called the measurement procedure, from the set of measurands into the numerical one, under which the empirical relational structure of the measurands is preserved in the numerical relational structure.

The representational theory of measurement can be used to define the topological stability of a measurement procedure. Scott [8] considered that when a measuring procedure is topologically stable, a ‘small’ difference in the measured values implies a ‘small’ difference in the measurand. For finite sets there is a one-to-one correspondence between the topologies on the set and the partial pre-orders defined on the set, see Cameron [9], section 3.9. Scott [8] was able to demonstrate (using the one-to-one correspondence between finite topologies and finite partial pre-orders) that if for a measurement procedure the relational structures of the measurand and the measured values are both partial pre-orders and the homomorphisms between them are also increasing mappings, then the measurement procedure is topologically stable. In summary, for a stable measurement, in the representational theory of measurement the measurands are defined on a set of objects with a relational system defined on them which is a partial pre-order; this is mapped via an increasing mapping (structure preserving mapping) onto another set of objects (numerical) also with a relational system which is a partial pre-order; further one particular object is identified in the numerical relational system as being the observable result of the measurement. Due to the stochastic nature of physical measurement, repeated measurements result in different particular objects being identified which can be characterized by a probability distribution over the objects in the numerical relational system. This probability distribution contributes to the resulting measurement uncertainty (ISO/IEC Guide 98-3 [10]). 2.2. Measurement as an inverse theory Modern scientific instruments are becoming more and more complex with the observed measurement being a proxy for the value of the measurand (e.g. sinograms in CT scanners, see Mueller and Siltanen [11], stack of interferograms in white light interferometers see Balasubramanian [12]) with the value of the measurand having to be reconstructed from the proxy observable measurements. This reconstruction is an inverse mapping from the numerical relational system back to the relational system of the measurand, i.e. an inverse problem. The inverse problem model of measurement is an extension of the representational theory of measurement model. An example is CT measurement where the observed values are a sinogram constructed from different projections on an object. The inverse problem is then to use the sinogram to reconstruct an inverse solution that is the inference to the ‘best’ model that directly maps onto the observed data.

122

A general inverse problem is to compute either the input or the system, given the output and the other quantity (see figure 1).


Figure 1. Schematic of the structure of inverse problems.
In 1902 Hadamard [13] stated three criteria for an inverse problem to be well posed:
1. Existence: there exists an input that exactly fits the observed data;
2. Uniqueness: there is a unique input that exactly fits the observed data;
3. Stability: there are small changes in the input when there are small changes in the observed data.
An inverse problem is ill-posed if any one of these conditions is not satisfied. It is assumed in this paper that the measurement system is known. The reconstruction of the measurand from the observable data in a stable way that gives a meaningful, interpretable measurand is the inverse problem of measurement. Measurement is essential to the scientific approach, and stable, meaningful interpretation of measurands is crucial to this endeavour. Thus the inverse problem of measurement is a very fundamental scientific problem. There has been a huge discussion on reconstruction methods within the metrology community. Many approaches are very ad hoc and create artificial features (e.g. "bat wings" on edges for optical surface measuring instruments, Gao et al. [14]). Many cannot give a meaningful scientific interpretation of the reconstructed measurand. This paper addresses this fundamental inverse problem of measurement by exploring the mathematical structure (in terms of category theory) of the problem and giving three properties that are sufficient and necessary for stable and meaningful reconstruction.
Property one: The forward mapping is order-preserving
The forward homomorphism from the measurand Q to the observed data P (forward mapping F: Q→P) must preserve the relational structure of the measurands; i.e. in terms of mapping compositions a > b implies F(a) > F(b).
Property two: Topological stability


As defined above, topologically stable measurement implies that both the relational structures on the measurand and on the observed data are partial pre-orders and that the homomorphism from the measurand to the observed data (forward mapping F) is an increasing function.
Property three: The inverse mapping is an Ockham solution
The inverse homomorphism from the observed data P to the measurand Q (inverse mapping I: P→Q) is defined as:
∀ m ∈ P,  I(m) = min{ n ∈ Q : m ≤ F(n) }.
This is Ockham's Choice for the inverse mapping: being the minimum measurand n (if it exists) that maps onto the observable measurement value m.
Lemma 2.1 The following assertions are equivalent for any pair of mappings I: P → Q, F: Q → P of pre-ordered classes.
i) I is left adjoint to F and F is right adjoint to I;
ii) F is order-preserving and I(m) = min{ n ∈ Q : m ≤ F(n) } holds for all m ∈ P;
iii) I is order-preserving and F(n) = max{ m ∈ P : I(m) ≤ n } holds for all n ∈ Q;
iv) I and F are order-preserving, and m ≤ F(I(m)) and I(F(n)) ≤ n hold for all m ∈ P, n ∈ Q.
Proof: see [15]. Using the above lemma, it can be shown that Properties P1 to P3 imply that the forward and inverse mappings form a Galois correspondence (adjoint functors); that is to say, X ≥ I(A) iff F(X) ≥ A. From Property two, the measurand and the observed data are partial pre-orders and so satisfy the first part of the lemma. Assertion ii) is true from Properties one and three. Thus from Lemma 2.1 assertion i) is true: the forward and inverse mappings are a pair of left and right adjoint functors. For a pre-ordered class, a pair of adjoint functors always takes the form of a Galois correspondence. The Galois correspondence implies that the corresponding measurement is well posed according to Hadamard's three criteria and that a stable and meaningful reconstruction is possible (called the Ockham inverse solution), given by I(A) = min{ X : F(X) ≥ A } (see Dikranjan and Tholen [15]).
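The adjointness in Lemma 2.1 can be made concrete on finite sets. The following Python sketch is an illustrative toy only (the set sizes and the forward mapping F are invented for illustration and are not from the paper): it computes the Ockham inverse I(m) = min{n : F(n) ≥ m} for an order-preserving F between two finite totally ordered sets and checks the Galois-correspondence property I(m) ≤ n iff m ≤ F(n).

```python
# Toy illustration of the Ockham inverse for an order-preserving forward mapping
# between finite ordered sets Q = {0,...,NQ-1} and P = {0,...,NP-1}.
# The specific F used below is hypothetical; any order-preserving map works.

def ockham_inverse(F, Q, P):
    """Return I(m) = min{n in Q : F(n) >= m} for each m in P (None if no such n)."""
    I = {}
    for m in P:
        candidates = [n for n in Q if F(n) >= m]
        I[m] = min(candidates) if candidates else None
    return I

def is_galois_pair(F, I, Q, P):
    """Check the adjunction: I(m) <= n  iff  m <= F(n), for all m in P, n in Q."""
    return all(
        (I[m] is not None) and ((I[m] <= n) == (m <= F(n)))
        for m in P for n in Q
    )

if __name__ == "__main__":
    Q = range(10)                    # measurand values (ordered by <=)
    P = range(20)                    # observable values
    F = lambda n: 2 * n + 1          # hypothetical order-preserving forward map
    I = ockham_inverse(F, Q, P)
    print("Ockham inverse:", I)
    print("Galois correspondence holds:", is_galois_pair(F, I, Q, P))
```

For this toy forward map the check prints True, illustrating that order preservation of F plus the Ockham choice of I is exactly what delivers the adjoint (Galois) pair used in the argument above.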


The Galois correspondence model also has additional structure useful for the interpretation of the measurement procedure. The subsets of measurand values that forward map onto particular observable data points partition the measurands into an equivalence relation called the measurand resolution, since the observed data cannot distinguish between measurand values within a subset. Similarly, the observed data resolution can be defined from the equivalence relation constructed from the subsets of observed data that inverse map onto particular measurands. Both these resolutions can be used in the calculation of the measurement uncertainty due to the metrology software. Although others have used inverse problem theory to reconstruct the measured values from the observable measurements, see Mueller and Siltanen [11], these tend to be ad hoc solutions. What is presented here is a universal solution to the inverse reconstruction problem. Many inverse measurement problems are naturally ill-posed (e.g. CT reconstruction from the sinogram), but the full weight of inverse problem theory techniques can be used (such as regularization) to rectify this situation by turning an ill-posed problem into a well-posed problem that is a close approximation to the original, such that useful reconstructions result. Throughout the rest of this paper it is assumed that the software under test is part of a well-posed measurement procedure.
3. Testing metrology software
Since metrology software is part of the measuring procedure, the "measurement as an inverse problem" model applies and can be used as a foundation to derive properties and build tests for metrology software. A very common approach to testing metrology software is through softgauges. These are reference pairs of data comprising reference input data and reference output data for a particular computational aim of the metrology software. Software implementing the computational aim is tested by processing the reference input data and comparing the results returned (in an appropriate way) with the reference output data. Softgauges are constructed with data generators, which are generally implemented in one of the following two approaches:
• Forward data generation: taking reference input data and using reference software to produce the corresponding reference output data.
• Reverse data generation: taking reference output data and using the Ockham inverse solution to produce the corresponding reference input data.
The forward approach to data generation involves developing reference software to take the input reference data to produce the reference output data, which in practice can be difficult and costly. The reverse approach to data generation requires a mathematical understanding of the forward process,


from which the Ockham inverse solution, via an inverse problem, can be reconstructed to produce the corresponding reference input data. This is often simpler to implement than the forward data generation approach. Knowing that constructing softgauges can be considered as an inverse problem, Hadamard's three criteria for an inverse problem to be well posed [13] can be used to construct 'extreme' softgauges that do not satisfy one or more of his criteria, particularly the stability and uniqueness criteria. The following examples illustrate this approach.
3.1. Example: CT reconstruction
Reconstruction from the sinogram in a CT scan is ill-posed. As a result, adding 0.1% noise to the sinogram makes the reconstruction (figure 2b) unrecognizable from the original image (figure 2a). The reconstruction of the ill-posed problem cannot have an uncertainty statement associated with it as a result of the global instability of the ill-posed problem. A well-known inverse problem technique to regularize ill-posed inverse problems is to use a limited number of Eigenvalues in the reconstruction. Taking the first 1500 Eigenvalues converts the reconstruction into a well-posed approximate model, allowing a useful reconstruction with the Ockham solution, with only 48% relative error, and also allows uncertainty statements to be made. This is an example of an ill-posed reconstruction where inverse problem theory is used to reconstruct the measured values from the observed proxy measurements. The regularization technique can also be used for reverse data generation by using a limited number of Eigenvalues in the reconstruction.
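The truncation idea can be sketched for a generic linear forward model. The NumPy example below is a minimal illustration only: it uses a made-up 100×100 ill-conditioned operator rather than the CT geometry of the paper, and truncated singular values stand in for the Eigenvalue truncation described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ill-conditioned forward operator A and true input x_true.
n = 100
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 10.0 ** np.linspace(0, -12, n)          # rapidly decaying singular values
A = U @ np.diag(s) @ V.T
x_true = np.sin(np.linspace(0, 3 * np.pi, n))

# Observed proxy data with a small amount of noise.
y = A @ x_true + 1e-6 * rng.standard_normal(n)

def tsvd_solve(A, y, k):
    """Regularised inverse keeping only the k largest singular values."""
    U, s, Vt = np.linalg.svd(A)
    return Vt[:k].T @ (U[:, :k].T @ y / s[:k])

x_naive = np.linalg.solve(A, y)             # unregularised: dominated by noise
x_reg = tsvd_solve(A, y, k=20)              # truncated, approximately well-posed

rel = lambda x: np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print(f"relative error, naive inverse : {rel(x_naive):.2e}")
print(f"relative error, truncated SVD : {rel(x_reg):.2e}")
```

The naive inverse amplifies the noise through the tiny singular values, whereas the truncated reconstruction recovers the input to within a modest relative error, mirroring the behaviour reported for the CT example.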


Figure 2. a) original modified Shepp-Logan phantom image; b) reconstruction from a sinogram with 0.1% noise; c) regularization reconstruction using 1500 Eigenfunctions.
3.2. Example: maximum inscribing circle
The maximum inscribing circle problem consists of finding the largest radius circle such that a set of data all lie outside the circle. This is a problem in which multiple solutions can exist (see figure 3) and which is locally unstable, since most


examples have a unique solution. This is a case where the reverse data generation approach is easier than the forward data generation approach. Starting with the solution (the two large circles), it is easy to add the points of contact with the two circles (filled dots in figure 3) and then add the rest of the data (open dots).
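A reverse data generator of this kind is simple to write down. The Python sketch below is illustrative only: the centre positions, radius and point counts are arbitrary choices, not the data of figure 3, and a production generator would also verify that no larger empty circle fits elsewhere in the point set.

```python
import numpy as np

rng = np.random.default_rng(1)

def reverse_mic_dataset(centres, r, n_contact=3, n_extra=20, margin=0.05):
    """Reverse data generation sketch: place contact points at distance r from each
    candidate centre, then scatter the remaining points clear of both circles.
    (A full generator would also check that no larger empty circle exists elsewhere.)"""
    pts = []
    for c in np.asarray(centres, float):
        angles = rng.uniform(0, 2 * np.pi, n_contact)
        pts.append(c + r * np.column_stack((np.cos(angles), np.sin(angles))))
    target = n_contact * len(centres) + n_extra
    while sum(len(p) for p in pts) < target:
        q = rng.uniform(-3, 3, size=2)
        if all(np.linalg.norm(q - c) > r + margin for c in np.asarray(centres, float)):
            pts.append(q[None, :])
    return np.vstack(pts)

# Two candidate centres -> a data set intended to have two maximum inscribed circles.
data = reverse_mic_dataset(centres=[(-1.0, 0.0), (1.0, 0.0)], r=0.8)
print(data.shape)      # reference input data; the reference output radius is 0.8
```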

Figure 3. An example of a set of data in which there are multiple solutions to the maximum inscribing circle problem.
4. Conclusions
The "metrology as an inverse problem" model has been introduced. Three properties are described that are necessary and sufficient for the forward and inverse mappings to form a Galois correspondence, ensuring that the measurement is well posed. The formula for the Ockham reconstruction, from the observed data to the measurands, has also been given for well-posed measurement procedures. Further, the measurand resolution and the observed data resolution have been given as part of the structure of the Galois correspondence model. There has been a brief discussion of data generators for softgauges to test metrology software, in particular forward and reverse data generators and how the Ockham solution can help with reverse data generation for well-posed metrology software. Finally, an example from CT scanning illustrates how inverse problem techniques are useful for reconstruction in metrology instrumentation.
Acknowledgement
The author gratefully acknowledges funding through the European Metrology Research Programme (EMRP) Project NEW06 TraCIM. The EMRP is jointly funded by the EMRP participating countries within EURAMET and the European Union.


References
[1] Scott P J and Forbes A 2012 Mathematics for modern precision engineering. Phil. Trans. R. Soc. A 370 (1973), pp. 4066–88.
[2] Lao Tzu, 6th century BCE, Tao Te Ching.
[3] Roberts F S 1979 Measurement theory with applications to decision making, and the social sciences. In Encyclopaedia of Mathematics and its Applications, vol. 7. Addison-Wesley.
[4] Finkelstein L 1982 Theory and philosophy of measurement. In Handbook of Measurement Science, Vol. 1: Fundamental Principles (ed. P. H. Sydenham). Wiley.
[5] Narens L 1985 Abstract Measurement Theory. London: MIS Press.
[6] Hand D J 1996 Statistics and the theory of measurement. J. R. Stat. Soc. A 159(3), pp. 445–92.
[7] ISO/IEC Guide 99:2007 International vocabulary of metrology — Basic and general concepts and associated terms (VIM).
[8] Scott P J 2004 Pattern analysis and metrology: the extraction of stable features from observable measurements. Proc. R. Soc. A 460, pp. 2845–64.
[9] Cameron P J 1994 Combinatorics: Topics, Techniques, Algorithms. Cambridge University Press.
[10] ISO/IEC Guide 98-3:2008 Uncertainty of measurement — Part 3: Guide to the expression of uncertainty in measurement (GUM:1995).
[11] Mueller J L and Siltanen S 2012 Linear and Nonlinear Inverse Problems with Practical Applications. SIAM.
[12] Balasubramanian N 1980 Optical system for surface topography measurement. U.S. patent 4,340,306.
[13] Hadamard J 1902 Sur les problèmes aux dérivées partielles et leur signification physique. Bull. Univ. Princeton 13, pp. 49–56.
[14] Gao F, Leach R K, Petzing J and Coupland J M 2008 Surface measurement errors using commercial scanning white light interferometers. Measurement Science and Technology 19(1), 015303.
[15] Dikranjan D and Tholen W 1995 Categorical Structure of Closure Operators. Springer Science + Business Media, B.V.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 128–137)

Study of gear surface texture using Mallat's scattering transform
W. Sun∗, S. Chrétien, R. Hornby and P. Cooper
National Physical Laboratory, Hampton Road, Teddington, TW11 0LW, UK
∗E-mail: [email protected]
R. Frazer and J. Zhang
Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
Gears are machine elements that transmit rotary motion and power by the successive engagements of teeth on their periphery 1. Gears commonly in use include: spur gears; bevel gears; helical gears; internal gears and worm gears. They are widely used in automotive, aerospace, power generation and even medical applications. The manufacturing process of metal gears usually involves cutting, hobbing, shaving, milling, grinding and honing 2. Although gears have been used for many years in industry, the evaluation of their surface texture and the relationship between surface parameters and surface functionality are not well understood. Conventional profile measurements and surface roughness parameters, such as Ra and Rq, are still predominantly used in the industry. However, increasing investigations show that 2D profile measurements and data analyses cannot represent the surface condition, due to the nonuniformity of measured surfaces. In this study, we use a new mathematical tool for the characterisation of surface irregularities, namely the scattering transform, recently introduced by S. Mallat 3. This new transform is almost invariant to group actions such as rotation and translation of the image and has been successfully applied to machine learning problems such as classification. This approach is applied to the characterisation of gear surfaces. Results obtained on areal surface data from focus variation instruments and a conventional tactile CMM are presented and discussed.
Keywords: Surface metrology; Scattering transform; Gaussian Mixture Models.

1. Introduction
Gears are a key component of power transmission systems, with applications spanning the automotive, aerospace, medical and power generation sectors, to name a few. A gear works by engaging its peripheral teeth with those of another gear, allowing rotary power to be transferred from one gear to the other. A system of gears can be arranged to modify the speed and direction of rotation and the torque to match the requirements for the efficient operation of the prime mover (electric motor, diesel engine, etc.) with the requirements of the driven load. This ensures maximum operational efficiency and minimises manufacturing costs and environmental impact. The interaction between two gears is a combination of rolling and sliding. Different sections of gear teeth experience different


meshing conditions, which influence the performance and the failure modes of the gear. The characterisation of the surface texture in the sliding direction (root to tip for a driving gear or tip to root in a driven gear) is therefore of prime importance. The schematic of a gear tooth is shown in Figure 1. Gears are commonly manufactured from case-hardened steel and are commonly produced by soft cutting followed by heat treatment and a hard-finishing process such as form grinding, honing or other processes to reduce the roughness and form deviation of the gear meshing surfaces. The root fillet regions also undergo grinding in some instances. The grinding process results in a grinding lay, illustrated in Figure 2, which is aligned approximately perpendicularly to the gear sliding direction. The lay is the direction of the predominant surface pattern. Lay usually derives from the actual production process used to manufacture the surface and results in directional striations across the surface. Figure 3 shows the common gear manufacturing processes and lay directions. Among the many gear manufacturing processes, the grinding process is increasingly used. Grinding is an abrasive machining process that uses a grinding wheel to shape and finish components. The surface finish obtained through grinding can be significantly better than the surface finish obtained through other common processes, such as turning or milling. The process tends to produce repeatable form deviations on each gear tooth, but the grinding wheel can become blunt and its surface can be contaminated over time.

Fig. 1. Gear geometry vocabulary relevant to the measurements.
Fig. 2. Grinding lay from a gear form grinding machine.


Fig. 3. Common gear manufacturing processes and lay directions. Reproduced from BS ISO TR 10064-4 4 .

It is common to redress or re-cut the form of the grinding wheel after the grinding wheel has cut a specified number of teeth. Gear teeth ground after the grinding wheel has been redressed will likely have different surface deviations and cutting characteristics from those ground before it. Additional surface texture measurements are necessary to fully quantify the parameters of the gear if the grinding wheel has been redressed. Knowledge of the redressing process, and when it was performed during gear grinding, is required to develop an efficient and effective measurement strategy. The characterisation of surface properties is essential in many industrial applications. A surface profile can be decomposed into three components: the primary form (often defined in terms of a geometric element such as a line or a circular arc), the waviness and the roughness. Among them, surface roughness is the important component that affects the friction, durability and also the failure mode of components. The common metrics in metrology are the roughness R parameters 15. The procedure of surface roughness analysis includes filtering of surface irregularities (noise etc.) and removing the surface form and surface waviness. The resulting surface roughness can then be parameterised using roughness parameters such as Ra, Rz, Rq, etc. However, the drawback of using a surface profile is that the information obtained is dependent on the orientation of the scan relative to the surface lay; see examples in Figure 4.
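For reference, the profile parameters mentioned above reduce to simple statistics of the roughness profile once form and waviness have been removed. The NumPy sketch below is illustrative only: it uses a straight-line form removal and a crude moving-average waviness filter in place of the Gaussian filter prescribed in the ISO standards, and the synthetic profile is invented.

```python
import numpy as np

def roughness_parameters(z, x, cutoff_points=51):
    """Rough Ra/Rq sketch: remove a least-squares line (form), then a moving-average
    trend (waviness proxy), and evaluate the residual roughness profile."""
    # 1. Form removal: least-squares straight line.
    coeffs = np.polyfit(x, z, deg=1)
    residual = z - np.polyval(coeffs, x)
    # 2. Waviness removal: moving average standing in for the ISO Gaussian filter.
    kernel = np.ones(cutoff_points) / cutoff_points
    waviness = np.convolve(residual, kernel, mode="same")
    roughness = residual - waviness
    ra = np.mean(np.abs(roughness))          # arithmetic mean deviation
    rq = np.sqrt(np.mean(roughness ** 2))    # root-mean-square deviation
    return ra, rq

# Synthetic profile in mm: linear form + waviness + fine roughness.
x = np.linspace(0, 4.0, 4000)
z = (0.002 * x
     + 0.0005 * np.sin(2 * np.pi * x / 0.8)
     + 0.0001 * np.sin(2 * np.pi * x / 0.01))
ra, rq = roughness_parameters(z, x)
print(f"Ra = {ra * 1e3:.3f} um, Rq = {rq * 1e3:.3f} um")
```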

Fig. 4. The effect of measuring in different directions to the surface lay, courtesy of Taylor Hobson Ltd. Note that the direction of lay runs across the page.


In recent years, the international standard ISO 25178 introduced S parameters for surface texture analysis. The fundamental theory behind the R and S parameters is the same, but S parameters have advantages over conventional R parameters as they take into consideration areal data rather than a single profile. However, for both sets of parameters, it can be difficult to correlate the measured values of the parameters with the surface functionality. In fact, different manufacturing processes may generate different surface components in different wavelength regimes and these affect the function of the surface differently. Filtering surface profiles into finer descriptors of the subtle variations in the process can provide a better idea of the surface. This can be performed using wavelet analysis 3. Previous researchers used wavelets for texture analysis and a very interesting trend emerged that progressively refined the wavelet approach to metrology 5. While very efficient for some applications, the wavelet transform cannot aggregate (local) spatial patterns in a natural fashion, and we need to turn to new techniques which can help extract the main metrological spatial features of surfaces by building relevant invariants under local translations and rotations. This can be achieved using Mallat's scattering transform. In this paper, Mallat's scattering transform is used as a promising new tool for surface characterisation and metrology. The Mallat scattering transform is a new technique enjoying fruitful connections with deep learning 3 and a sound mathematical theory. The advantage of the scattering transform over traditional techniques is that it leverages the intrinsic multi-scale analysis of wavelet analysis while remaining almost invariant to certain local group actions. In this paper, we show via numerical experiments how these new metrics reflect the properties of the surface. This study is a step towards a wider use of the scattering transform in metrology.
2. Mallat's scattering transform
In this section, we present the scattering transform proposed by Mallat 3 and describe its construction and main properties. The scattering transform provides a compact description of images or surfaces that is invariant with respect to the action of certain local transformations. The application of the scattering transform to metrology is a promising avenue for a precise multiscale analysis of surfaces and a new approach to the study of roughness and of surface description in general. The scattering transform builds on wavelet theory. Wavelet analysis has been widely used for texture analysis and applications to metrology 6,7. Wavelets are able to efficiently delineate roughness from waviness, a difficult but essential problem in surface metrology. One of the main pitfalls of wavelets is that they are not invariant to group actions such as rotations or translations. In order to overcome this issue the scattering transform 3 was recently proposed so as to represent images and


to capture their main features for the task of classification in machine learning. Moreover, one of the main advantages of the scattering transform over competitors such as deep neural networks is that it is underpinned by a sound theory 3. The scattering transform provides an architecture that computes interesting features that are invariant to local transformations. It consists of a succession of linear filters and 'pooling' non-linearities. The same type of architecture is deployed in e.g. deep neural networks, with the main difference that the weights have to be computed from the data, a task that is not known to be efficiently achievable despite interesting recent advances 8–10. Mallat's scattering transform provides stable and informative affine invariant representations which can quickly be obtained via a scattering operator defined on the translation, rotation and scaling groups. It is implemented by a deep convolution network with wavelet filters and modulus non-linearities, hence allowing interesting connections with deep neural networks.
2.1. Invariants
Building representations that are invariant to certain actions is important in many applications. In the metrological applications we want to address, the features that should be retrieved from the analysis should not be impacted by e.g. small translations, small deformations or small rotations. This is what we can achieve using the scattering transform. The scattering transform can be described as a cascade of wavelet coefficient computations composed with the non-linear operations of taking the modulus and averaging, all this being repeated at several scales. As is usual in wavelet analysis, one defines a scaling function φ whose translates define the 0th resolution space V_0, and φ_J(·) = φ(2^J ·). Let f ∈ L²(R), the Hilbert space of square integrable functions 11. Then the convolution of f with the scaling function φ_J, which has a characteristic support size of order 2^J, is

f ↦ f ∗ φ_J,   (1)

where ∗ denotes the discrete convolution operator 11. This convolution filters out higher frequencies from the signal and gives the average behaviour of the function. This average is not much perturbed by small translations, or in other words, f ∗ φ_J ≈ f(· − τ) ∗ φ_J when |τ| ≪ 2^J.
2.2. The scattering mechanism and higher frequency invariants
In order to refine the analysis by incorporating higher frequencies, one introduces the wavelet functions ψ_j, j = 1, . . . , J, which are obtained in such a way that their translates form an orthogonal basis of V_j \ V_{j−1}, where A \ B here and hereafter will denote the orthogonal complement of vector space B inside vector space A. One proceeds to compute the vector

|f ∗ ψ_1| ∗ φ_J,  |f ∗ ψ_2| ∗ φ_J,  · · ·

which contains information about the higher frequencies contained in f. Again, the averaging is crucial at this point because it helps robustify the transform against translation-type perturbations. On the other hand, the high frequencies are again lost by this averaging procedure. The scattering transform then consists of the following list of coefficients:

||f ∗ ψ_1| ∗ ψ_1| ∗ φ_J,  ||f ∗ ψ_2| ∗ ψ_1| ∗ φ_J,  · · ·
||f ∗ ψ_1| ∗ ψ_2| ∗ φ_J,  ||f ∗ ψ_2| ∗ ψ_2| ∗ φ_J,  · · ·
||f ∗ ψ_1| ∗ ψ_3| ∗ φ_J,  ||f ∗ ψ_2| ∗ ψ_3| ∗ φ_J,  · · ·
...

2.3. Sparsity of scattering coefficients
One fundamental feature of the wavelet expansion is that it sparsifies functions from many standard function spaces such as Besov spaces, piecewise differentiable functions, functions of bounded variation, etc.; see 11 for more details. In other words, most coefficients in the wavelet expansion will be negligible, allowing for a sparse representation of the function under study. These properties mainly stem from the fact that wavelet bases are unconditional bases for these spaces, and unconditionality implies providing nearly sparse (also called compressible) expansions, as proved in 12. The study of the decay of the scattering coefficients was completed in 13. In that paper, it is proved that most coefficients in the scattering decomposition are small for functions in L²(R). More precisely, we have

Σ_{m ≥ n} ‖S_m[f]‖₂² ≤ ∫_R |f̂(ω)|² ( 1 − exp(−ω / (r aⁿ)) ) dω,   (2)

and, as easily seen, the function

ω ↦ 1 − exp(−ω / (r aⁿ))   (3)

is a high-pass filter with a bandwidth proportional to r aⁿ.
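The cascade of Section 2.2 can be sketched directly with the fast Fourier transform. The 1D Python example below is a minimal illustration only: the Gaussian band-pass filters stand in for proper wavelets, the scales and widths are arbitrary choices, and it is not the implementation used for the gear results reported below.

```python
import numpy as np

def gaussian_lowpass(n, sigma):
    """Frequency response of a Gaussian low-pass filter (stand-in for phi_J)."""
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * (freqs / sigma) ** 2)

def bandpass(n, centre, width):
    """Gaussian band-pass filter (stand-in for a wavelet psi_j)."""
    freqs = np.fft.fftfreq(n)
    return np.exp(-0.5 * ((np.abs(freqs) - centre) / width) ** 2)

def scattering_1d(x, J=4, n_scales=3):
    """Zeroth-, first- and second-order scattering outputs |...| * phi_J."""
    n = len(x)
    phi = gaussian_lowpass(n, sigma=2.0 ** (-J))
    psis = [bandpass(n, centre=2.0 ** (-j), width=2.0 ** (-j - 1))
            for j in range(1, n_scales + 1)]
    conv = lambda sig, filt: np.real(np.fft.ifft(np.fft.fft(sig) * filt))
    S0 = conv(x, phi)                                       # f * phi_J
    S1 = [conv(np.abs(conv(x, p)), phi) for p in psis]      # |f * psi_j| * phi_J
    S2 = [[conv(np.abs(conv(np.abs(conv(x, p1)), p2)), phi)
           for p2 in psis] for p1 in psis]                   # ||f*psi_j|*psi_k| * phi_J
    return S0, S1, S2

x = np.random.default_rng(2).standard_normal(1024)
S0, S1, S2 = scattering_1d(x)
print(len(S1), len(S2), len(S2[0]))
```

The modulus followed by the low-pass averaging at each layer is what produces the (near) translation invariance discussed above, at the price of discarding phase information that the next layer partially recovers.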

2.4. Going from 1D to 2D Since the Scattering Transform can be applied to images and is invariant to rotations 14 , the material’s features will be detected independent of the direction. This might represent a great advantage over the previous approach in many situations.


Note that the 1D scattering transform can still be used in case one wants to compare with the usual 1D roughness criteria.
3. Computational experiments
Three gear tooth surface replicas, provided by the Design Unit at Newcastle University, were studied. One gear tooth was finished by the grinding process (as ground), one was used for a short period under little load (run in), and the other sample was used for a long time under significant load and high speed (test 1).

Fig. 5. Top down view of Microset 202 black replica with points used for alignment highlighted.

Figure 6 below shows the scan results of these three components using the focus variation system.

Fig. 6. From top to bottom: as ground, after run in and after test 1.

3.1. Results
One example of a surface to be analysed is shown in Figure 7. The sparsity of the scattering transform can be observed in Figure 8, which shows the scattering coefficients sorted in absolute value. The cluster containing the 5% largest coefficients of the scattering transform is displayed in Figure 9 below. The support is the vector which contains the indices of the components displayed in Figure 9. The first difference of the support is the vector of all successive differences between consecutive values of the support. The second difference of the support, i.e. the first difference of the first difference of the support, is shown in Figure 10 below. Figure 11 shows a zoomed version of the middle part of Figure 10.
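The quantities plotted in the figures below are elementary to compute once the scattering coefficients are available. The short NumPy sketch below makes the definitions explicit; it uses a random stand-in vector in place of the actual gear coefficients, which are not reproduced here.

```python
import numpy as np

def support_statistics(coeffs, fraction=0.05):
    """Sorted magnitudes, the support (indices) of the largest `fraction` of the
    coefficients, and the first and second differences of that support."""
    order = np.argsort(np.abs(coeffs))          # increasing absolute value
    sorted_coeffs = coeffs[order]
    k = max(1, int(fraction * len(coeffs)))
    support = np.sort(order[-k:])               # indices of the 5% largest coefficients
    d1 = np.diff(support)                       # first difference of the support
    d2 = np.diff(d1)                            # second difference of the support
    return sorted_coeffs, support, d1, d2

coeffs = np.random.default_rng(3).laplace(size=13000)   # stand-in coefficient vector
sorted_coeffs, support, d1, d2 = support_statistics(coeffs)
print(support[:5], d2[:5])
```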


Fig. 7. Surface image from the "test 1" set.

Fig. 8. Scattering coefficients of the previous image sorted by increasing absolute value. The y-axis represents the magnitude of the scattering transform coefficients. In particular, this figure shows an example of extreme sparsity of the scattering transform for our data set.


Fig. 9. The 5% largest coefficients sorted in increasing order. The y axis represents the coefficient’s magnitude.

It can be seen from the previous plots that the scattering transform efficiently compresses the information contained in the surface figures: only a few coefficients have significant magnitude. Figure 9 and Figure 11 show that the scattering coefficient sequence carries very interesting information. Firstly, the sorted magnitude grows linearly in Figure 9, which is a very interesting feature. Next, the second difference of the support is spiky (sparse). Moreover, Figure 10 shows that the samples as ground and run in have very similar scattering coefficients, whereas the sample test 1 has a very different slope from the other two samples. These results seem to correspond well to


Fig. 10. The second difference of the support of the 5% largest coefficients.
Fig. 11. Zoom into the second difference of the support of the 5% largest coefficients.

the physical difference between these three samples, where the sample test 1 is rough compared to the other two samples. On the other hand, the supports of the three samples are very different, as shown in Figure 11. In particular, the second difference of the support of as ground and run in can be easily distinguished: the 'two-dashed' coefficients are larger than the 'solid' ones. Moreover, the blue coefficients are sparser than the red ones.

4. Conclusion and future work
The goal of the present paper was to introduce the scattering transform as a tool for the study of surface metrology. The preliminary study shows interesting results when applying this method to gear surfaces. The scattering transform is a first step to produce invariant features (the transform coefficients) that can be used to classify or predict the functionality of the surface. In order to fully understand the method, future research will further investigate the metrological interpretation of the coefficients of Mallat's scattering transform, and in particular:
• the distribution of the magnitude of the largest coefficients;
• the position of the largest coefficients in the sequence.


Acknowledgements The project is jointly funded by the European Metrology Research Programme and the National Measurement System Programme for Engineering & Flow Metrology. We are grateful to Prof Alistair Forbes for comments on an early draft of this paper. References 1. J. R. Davis, Gear materials, properties, and manufacture (ASM International, United States, 2005). 2. G. Goch, Gear metrology CIRP Ann-Manuf Techn. 52 (2) 659-95, (2003). 3. S. Mallat, Group invariant scattering Communications on Pure and Applied Mathematics 65 (10) 1331-98, (2012). 4. BS ISO/TR 10064-4:1998 Code of inspection practice Part 4: Recommendations relative to surface texture and tooth contact pattern checking, (1998). 5. S. H. Bhandari and S. M. Deshpande, Feature extraction for surface classification — An approach with wavelets. International Journal of Computer and Information Science and Engineering 1.4: 215-219 (2007). 6. X. Jiang and L. Blunt, Third generation wavelet for the extraction of morphological features from micro and nano scalar surfaces. Wear, 257(12), 12351240 (2004). 7. R. X. Gao, Robert and Y. Ruqiang, Wavelets: Theory and applications for manufacturing. Springer Science & Business Media, (2010). 8. P. Hand and V. Voroninski, Global Guarantees for Enforcing Deep Generative Priors by Empirical Risk. arXiv preprint arXiv:1705.07576 (2017). 9. Q. Nguyen and M. Hein, The loss surface of deep and wide neural networks, arXiv preprint arXiv:1704.08045 (2017). 10. M. Soltanolkotabi, A. Javanmard and J. D. Lee, Theoretical insights into the optimization landscape of over-parameterized shallow neural networks, arXiv preprint arXiv:1707.04926 (2017). 11. S. Mallat, A wavelet tour of signal processing. Academic press (1999). 12. D. Donoho, Unconditional bases are optimal bases for data compression and for statistical estimation. Applied and computational harmonic analysis 1.1 (1993): 100-115. 13. I. Waldspurger, Exponential decay of scattering coefficients, arXiv preprint arXiv:1605.07464 (2016). 14. L. Sifre and S. Mallat, Rotation, scaling and deformation invariant scattering for texture discrimination, in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1233-1240 (2013). 15. BS EN ISO 4287:1998+A1:2009 Geometrical product specification (GPS) –Surface texture: Profile method – Terms, definitions and surface texture parameters, 1998.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 138–152)

The evaluation of the uncertainty of measurements from an autocorrelated process Nien Fan Zhang Statistical Engineering Division, National Institute of Standards and Technology, Gaithersburg, MD 20899, USA E-mail: [email protected] In metrology, when repeated measurements are autocorrelated, it is not appropriate to use the traditional approach to evaluate the uncertainty of the average of repeated measurements. In this paper, we propose approaches to evaluate the measurement uncertainty when the data are autocorrelated. For various cases of process autocorrelation, some examples are presented for illustration. Keywords: Best linear unbiased estimator; Ergodic theorem; Nonlinear regression; Nonstationary process; Stationary process.

1. Introduction
The need for appropriate approaches to evaluate measurement results is emphasized and discussed in ISO/IEC Guide 98-3 (GUM) [1]. Since the publication of the GUM, many papers and documents have discussed different statistical approaches for the evaluation and interpretation of measurement uncertainty. Among them, ISO/TR 1587: 2012 [2] comprehensively documented three statistical approaches for the evaluation of measurement uncertainty. In metrology, as indicated in Section 4.2.3 of the GUM, it is a common practice that the dispersion or standard deviation of the average of repeated measurements is calculated as the sample standard deviation of the measurements divided by the square root of the sample size. Namely, for given repeated measurements {X(1), ..., X(N)} of size N, the standard uncertainty of the sample mean X̄ is calculated as

u_X̄ = S_X / √N,   (1)



where S_X is the sample standard deviation of the measurements. However, in many cases, repeated measurements are autocorrelated. As stated in Section 4.2.7 of [1], "If the random variations in the observations of an input quantity are correlated, for example, in time, the mean and experimental standard deviation of the mean as given in 4.2.1 and 4.2.3 may be inappropriate estimators (C.2.25) of the desired statistics (C.2.23)." [3] proposed an approach to calculate the uncertainty of the mean of measurements when the measurements are from a stationary process. However, autocorrelated measurements are not necessarily from a stationary process. This paper will discuss approaches to deal with general autocorrelated measurements in uncertainty evaluation. Section 2 will review the approaches to evaluate uncertainties of measurements when they are from a stationary process. In Section 3, a general approach is discussed when the measurements are from a nonstationary process. Sections 4 and 5 will discuss methods for uncertainty evaluation of measurements from some particular nonstationary processes, followed by conclusions.
2. Measurements from a stationary process
When measurements are from a covariance stationary, or in this paper simply a stationary, process indexed by time, they have the same mean and the same variance at each time and the autocorrelations depend only on the lags. Mathematically, a stationary process is a natural extension of an i.i.d. sequence regarding the mean, variance, and covariance. Uncertainty evaluations of the sample mean of measurements from a stationary process were discussed in [3]. For a discrete stationary process {X(t), t = 1, ...}, as in the case of an i.i.d. sequence, for a realization with N consecutive observations {X(1), ..., X(N)}, the sample mean

X̄ = (1/N) Σ_{t=1}^{N} X(t)   (2)

is often used to estimate the process mean. From [3], the variance of X̄ for N > 1 is given by

Var[X̄] = [ 1 + (2/N) Σ_{i=1}^{N−1} (N − i) ρ(i) ] σ_X² / N,   (3)

where σ_X is the process standard deviation and ρ(i) is the process autocorrelation at lag i. Given a sequence of stationary measurements {X(1), ..., X(N)}, σ_X and the ρ(i)'s can be estimated by the corresponding sample statistics, and the standard uncertainty of X̄ is given by

u_X̄ = sqrt[ 1 + (2/N) Σ_{i=1}^{N−1} (N − i) ρ̂(i) ] · S_X / √N,   (4)

where S_X is the sample standard deviation based on {X(1), ..., X(N)} and ρ̂(i) is an estimate of the autocorrelation ρ(i) at lag i. Note that from (3), for a stationary and invertible autoregressive-moving average (ARMA) process, it can be shown that Var[X̄] ∝ 1/N for large N. In particular, for an AR(1) process with dependence parameter φ, from [3], (3) becomes

Var[X̄] = (N − 2φ − Nφ² + 2φ^{N+1}) σ_X² / [ N² (1 − φ)² ].
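Equation (4) is straightforward to evaluate from the data. The NumPy sketch below is illustrative code only (it is not the software used for the Josephson example; the AR(1) test signal is simulated): it computes the sample autocorrelations and the autocorrelation-corrected standard uncertainty of the mean, alongside the naive value of equation (1).

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations rho_hat(1..max_lag)."""
    x = np.asarray(x, float)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    return np.array([np.sum(xc[:-i] * xc[i:]) / denom for i in range(1, max_lag + 1)])

def u_mean_autocorrelated(x, max_lag=None):
    """Standard uncertainty of the sample mean following eq. (4)."""
    x = np.asarray(x, float)
    n = len(x)
    if max_lag is None:
        max_lag = n - 1
    rho = sample_acf(x, max_lag)
    i = np.arange(1, max_lag + 1)
    factor = 1.0 + 2.0 * np.sum((n - i) * rho) / n
    s = np.std(x, ddof=1)
    return np.sqrt(max(factor, 0.0)) * s / np.sqrt(n)

# AR(1) test signal: positive autocorrelation makes the naive uncertainty too small.
rng = np.random.default_rng(4)
phi, n = 0.5, 500
x = np.empty(n)
x[0] = rng.standard_normal()
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.standard_normal()

print("naive u     :", np.std(x, ddof=1) / np.sqrt(n))
print("corrected u :", u_mean_autocorrelated(x, max_lag=50))
```

Truncating the sum at a moderate lag, as done here with max_lag=50, is a practical choice when only the first autocorrelations are significant; using all N−1 lags reproduces equation (4) exactly.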

For a moving average process of order one (MA(1)) with θ = 1, which is stationary but not invertible, Var[X̄] ∝ 1/N² [4]. For a stationary and invertible fractional difference ARIMA (ARFIMA) process with −0.5 < d < 0.5, Var[X̄] ∝ 1/N^{1−2d} [5]. Note that from [6] and the Wold Decomposition Theorem [7], under some regularity conditions, the central limit theorem applies to a stationary process. Namely, the sample mean X̄ is asymptotically normally distributed. Thus, for a set of measurements from a stationary process with a large sample size, a confidence interval for the measurand can be computed. We will use an example to illustrate the approach. This is a data set from a study of the voltage difference of two Josephson voltage standards. There are 218


pressure adjusted voltage measurements taken in equal time intervals. From the data demonstrated in Figure 1, we may reasonably treat the process as stationary.

Figure 1. Measurements of the voltage difference of two voltage standards.
From the autocorrelation function (ACF) plot with an approximate 95 % confidence band for white noise shown in Figure 2, it is obvious that the measurements are autocorrelated, because the autocorrelations at lags 1 and 9, equal to 0.230 and 0.232 respectively, are significant. Thus, it is inappropriate to treat the measurements as an i.i.d. sequence.

Figure 2. ACF plot of the voltage measurements with an approximate 95 % confidence band for white noise. We use the sample mean x = -0.75 to estimate the process mean. By (3), the standard uncertainty of the sample mean is given by 0.27. Note that if we


inappropriately treat the process as an i.i.d. sequence, from (1) the corresponding standard uncertainty = 0.20, which is 26 % smaller than 0.27. However, autocorrelated processes are not always stationary. In the remainder of this paper, we will discuss uncertainty evaluation for the measurements from nonstationary processes. 3. Measurements from nonstationary processes

For a discrete random time series {X(t), t = 1, ...}, if the mean of X(t), denoted by μ(t), or the variance of X(t), denoted by σ_X²(t), is time-dependent, or both of them are time-dependent, then {X(t)} is a nonstationary process. Also, when the covariance between X(t₁) and X(t₂) does not depend solely on t₁ − t₂, {X(t)} is nonstationary. When measurements are made from a nonstationary process, the mean at time t, μ(t), is time-dependent in general. For an estimator of the function μ(t), denoted by μ̂(t), we need to evaluate the corresponding uncertainty as a function of time, which is called dynamic uncertainty in [8]. In [9], the evaluation of uncertainties for such measurements from a linear dynamic measurement system is called dynamic metrology. For a nonstationary process with a time-dependent mean function, an estimate of the mean function μ(t) is not appropriate, in general, if it is based on just one realization of the process, for example, a time average of X(t), i.e., X̄ in (2) as in the case of a stationary process. Instead, multiple realizations are needed to estimate the ensemble (or population) average, i.e., E[X(t)] = μ(t) for each t.
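The ensemble estimates described in this section are pointwise sample statistics across realizations. The following sketch uses a simulated nonstationary process as a stand-in for real repeated measurement records and, assuming the M realizations are mutually independent, estimates μ̂(t*), σ̂_X(t*) and the standard uncertainty of μ̂(t*) at each time point.

```python
import numpy as np

def ensemble_estimates(X):
    """X has shape (M, N): M realizations of length N.
    Returns mu_hat(t), sigma_hat(t) and the standard uncertainty of mu_hat(t)."""
    M = X.shape[0]
    mu_hat = X.mean(axis=0)                 # ensemble mean at each t
    sigma_hat = X.std(axis=0, ddof=1)       # ensemble standard deviation at each t
    u_mu = sigma_hat / np.sqrt(M)           # assumes independent realizations
    return mu_hat, sigma_hat, u_mu

# Simulated example: mean ramps linearly in t, variance grows with t (nonstationary).
rng = np.random.default_rng(5)
M, N = 50, 200
t = np.arange(1, N + 1)
X = 0.01 * t + np.sqrt(0.1 * t) * rng.standard_normal((M, N))

mu_hat, sigma_hat, u_mu = ensemble_estimates(X)
print(mu_hat[:3], u_mu[:3])
```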

Roughly speaking, multiple realizations in metrology mean that a time- (or space-) dependent sequence of measurement results of a physical quantity is repeated multiple times under the same measurement conditions, including the same time (or space) intervals between consecutive measurements for the different replications. Specifically, assume we have M realizations of {X(t), t = 1, ..., N} with the same length. For example, the jth realization of {X(t)} is denoted by {x_j(t), t = 1, ..., N}, j = 1, ..., M. At each t* ∈ {1, ..., N}, {x_j(t*), j = 1, ..., M} is a


random sample of size M from the probability distribution of the random variable X(t*). Namely, {X_j(t*), j = 1, ..., M} are i.i.d. random variables. Then μ(t*) as

the mean of X(t) at t* is estimated by μ̂(t*) = Σ_{j=1}^{M} x_j(t*) / M. Similarly, the

j 1

variance of X(t) at t*, σ_X²(t*) = Var[X(t*)], is estimated based on multiple


realizations, e.g., by Σ_{j=1}^{M} [x_j(t*) − μ̂(t*)]² / (M − 1).

j 1





Thus, the corresponding

uncertainty of μ̂(t*) can be evaluated. When {X(t)} is a stationary process satisfying some regularity conditions, then based on the ergodic theorems (see [10] and [11]) the common mean μ can be estimated by the time average of the measurements of one realization (e.g., the jth realization), X̄_j = Σ_{t=1}^{N} x_j(t) / N, with the property that E[(X̄_j − μ)²] → 0 when

t 1

N → ∞. Under similar conditions, the common variance Var[X(t)] can be

estimated by the sample variance of one realization of the process with similar properties. In that case, the uncertainty of the estimator of the process mean of a stationary process can be evaluated as discussed in [3] and Section 2. For some particular nonstationary processes, however, we may estimate the mean(s) and assess the corresponding uncertainties based on one realization of the process. In the remainder of the paper, we will discuss the estimates of the mean function and evaluate the associated uncertainties for the measurements from these processes.


4. Measurements from a process with a constant mean and a time-dependent variance
For a nonstationary process with a constant mean but a variable variance, an example is the random walk process ([12], p. 192), which is expressed as

X(t) = X(t − 1) + e(t),   (5)

where the noise terms {e(t)} are i.i.d. with zero mean and variance σ_e². Usually, for a random walk process, it is assumed that X(0) = 0. Here, in general, we assume X(0) = μ, a constant. From (5), when t ≥ 1,

X(t) = μ + Σ_{i=1}^{t} e(i).   (6)

i 1

(6)

For t  1,... , it is obvious that  (t )  E[ X (t )]   and Var[ X (t )]  t e2 .

(7)

The autocovariance and autocorrelation between X (t ) and X (t   ) if   0 is given by Cov[ X (t ), X (t   )]  t e2

and

[ X (t ), X (t   )]  t (t   ).

(8)

It is clear that the variance, autocovariance, and autocorrelation depend on time

while the mean is  for all t . Thus, { X (t )} is nonstationary. If we know the model of the process, for a given sequence, { X (1),..., X ( N )} , from (6)

{( X (t )  X (t  1)), t  1,..., N }  {e(t ), t  1,..., N } . We can estimate the noise

variance by the sample variance of {e(t ), t  1,..., N } , i.e., ˆ e2  Se2 . Note that the

estimator of the process variance at t is given by tˆ e2 . If we do not know the

model, based on the data we can test whether the model is random walk or not by testing whether the series formed by the first order differences is an i.i.d. sequence [13]. The mean of the process  can be estimated based on { X (1),..., X ( N )} . We may use the sample mean

X

in (2) to estimate  . It is an


unbiased estimator. From (8), it can be shown that the variance of the sample mean is given by Var[ X ] 

6 N  N ( N  1)(2 N  5) 6N 2

 e2 .

(9)

Based on that, the standard uncertainty of the sample mean u X is obtained from (9). On the other hand, for t  1,..., N , (6) can be expressed by a linear model X  1   E,

(10)

where X  ( X (1),..., X ( N ))T , E  (e(1),...,  e(i ))T , and 1 is a N by 1 vector N

i 1

with each element being one. The covariance matrix of X denoted by V is obtained from (8). By the Gauss-Markov-Aitken Theorem [14], the generalized least squares (GLS) estimator of  given by

ˆ  (1T V 11) 11T V 1X

(11)

is the best linear unbiased estimator (BLUE) of  . Namely, ˆ has the minimum variance among all the linear unbiased estimators of  . It can be shown that for

a random walk process with a mean of  , the BLUE is ˆ  X (1) with

Var[ ˆ ]   e2 leading the standard uncertainty uˆ  ˆ e . Obviously, from (9),

uˆ  u X when N  1 . Though ˆ is an unbiased estimator of the process mean

with minimum variance, it solely depends on the first measurement. Alternatively, we may consider a weighted mean of X given by

   w(t ) X (t ), N

t 1

(12)

where {w(t ), t  1,..., N } are the weights based on the reciprocal of the variances of the process [15] given by


w(t ) 

1t

1 i N

(13)

.

i 1

Note that the weighted mean in (12) is a linear combination of correlated { X (t ), t  1,..., N }

and an unbiased estimator of  . It is certain that

Var[  ]  Var[ ˆ ]   e2 . It can be shown that the variance of  is given by 2 N  1 i N

Var[  ] 

i 1

   1 i   i 1  N

2

 e2 .

(14)

From (9) and (14), Var[  ]  Var[ X ] when N  1 . That is, between the two unbiased estimators of  ,  has a smaller variance than that for X when N  1

and thus is a better estimator than X . The standard uncertainty of  is given by 2 N  1 i N

u 

i 1

1 i N

ˆ e ,

(15)

i 1

which is larger than ˆ e when N  1 and increases when N increases. For illustration, a random walk process with N = 1000 and mean of   0.5 is

generated by simulation and shown in Figure 3. The noise {e (t )} is an i.i.d. sequence with standard normal distribution.

For this realization, the GLS estimate of the mean is ˆ  x (1)  0.82 and

the corresponding standard uncertainty is given by uˆ  ˆ e  se  1.0002 . The

sample mean X  7.19 and the standard uncertainty u X  18.27 from (9). The weighted mean estimate based on (12) is   1.28 with a standard uncertainty



Figure 3. A realization of a random walk process and the weighted mean with a 95 % confidence band of the mean. of u  5.96 from (15). The limits of a 95 % confidence band of the mean based

on  are given by   1.96u and plotted by two red dashed lines while the blue solid line indicates  in Figure 3. 5. Measurements from a process with a time-dependent mean and a constant variance

When measurements are from a nonstationary process which does not have a constant mean but a constant variance, we may be able to assess the uncertainty associated with the estimate of the time-dependent mean based on one realization. In particular, we assume that a process is defined by X (t )   (t )  e(t ),

(16)

where the mean is a non-random function of t and {e (t )} is a stationary process with zero mean and a constant variance. The mean function  (t ) is the timedependent measurand at t We assume that the standard deviation of e (t ) is  e and the corresponding autocorrelation function  ( ) at lag of  . Given a data set of { X (t )} , we may fit it by a statistical model and obtain an estimate of  (t ) as well as its associated uncertainty. Note that for the model in (16), although { X (t )} has a constant variance, the estimator of the mean function and its associated uncertainty are time-dependent in general. In this section, we will discuss a case with corresponding examples on the mean function estimation and assessment of the associated uncertainties. Consider a process given by


X(t) = α + βt + e(t),   (17)

where α + βt is the time-dependent mean of the process and {e(t)} is a stationary process with zero mean, standard deviation σ_e, and autocorrelation function {ρ(τ)}. The process is sometimes called a linear trend plus autocorrelated noise process. For t = 1, ..., N, (17) can be written as

X = TB + E,

where X = (X(1), ..., X(N))ᵀ,

T = (1 1 ⋯ 1; 1 2 ⋯ N)ᵀ,

B = (α, β)ᵀ, and E = (e(1), ..., e(N))ᵀ. The parameter (α, β) can be estimated by the GLS estimator,

B̂ = (Tᵀ R⁻¹ T)⁻¹ Tᵀ R⁻¹ X,   (18)

where R is the correlation matrix of {e(1), ..., e(N)}; the corresponding covariance matrix is σ_e² R. The covariance matrix of the estimate of B is given by

Cov[B̂] = (Tᵀ R⁻¹ T)⁻¹ σ_e².   (19)

An estimate of the mean of {X(t)} and its corresponding standard uncertainties can then be obtained. However, if R is unknown, we may first treat {e(t)} in (17) as an i.i.d. sequence and obtain the ordinary least squares (LS) estimator of B, denoted by B̃, and use the residuals X − TB̃ to estimate R, giving R̂. Then the GLS estimator of B, denoted by B̂, is obtained from (18) by replacing R by R̂. For illustration, here is an example for 500 measurements of instrument noise, taken with an equal time interval of 0.2 s, from a system for electrochemical detection. The unit of the measurements is millivolt (mV). The data are plotted in Figure 4. An LS fit to the data gives α̃ = 1.56297 and β̃ = −0.0000111, which are statistically significant. The residuals {X(t) − (α̃ + β̃t), t = 1, ..., N} are plotted in Figure 5. The ACF plot of the residuals and the corresponding 95 % confidence band for white noise are shown in Figure 6.



Figure 4. Measurements of instrument noise and the estimated mean function with a 95 % confidence band and a 95 % prediction band, respectively.


Figure 5. The residuals of the instrument noise after an LS fit.

Figure 6. ACF plot of the residuals with a 95 % confidence band for white noise.


The sample autocorrelations are significant up to lag = 55. For example, ρ̂(1) = 0.7967, ρ̂(2) = 0.4694, ρ̂(3) = 0.2573. The autocorrelations for lags from 13 to 18, 30 to 31, 39 to 40, and 53 to 55 are also significant. Since autocorrelations are significant until lag = 55, we estimate ρ(τ) by ρ̂(τ) for τ = 1, ..., 55 and by 0 for τ > 55. Based on that, R̂ as an estimate of R is obtained. Then a GLS estimate B̂ of B is obtained from (18), with α̂ = 1.56294 and β̂ = −0.00001090, which are almost the same as the LS estimates. The noise standard deviation is estimated by

σ̂_e = [ (X − TB̂)ᵀ R̂⁻¹ (X − TB̂) / (N − 2) ]^{1/2} = 0.00108   (20)

(see [15], (4.66), p. 98). The covariance matrix of B̂ is obtained from (19) with R̂ and σ̂_e. The estimate of the mean function is given by μ̂(t) = α̂ + β̂t, which is time-dependent. The corresponding uncertainty is evaluated based on σ̂²_α̂ = 0.011877 σ̂²_e, σ̂²_β̂ = 0.000000149265 σ̂²_e, and the estimated covariance between these, Cov[α̂, β̂] = −0.00003739 σ̂²_e, which are obtained from (19). Then, the standard uncertainty of μ̂(t) is given by

σ̂²_μ̂(t) = σ̂²_α̂ + t² σ̂²_β̂ + 2t Cov[α̂, β̂],   (21)

which is also time-dependent. Note that 0.0000539 ≤ σ̂_μ̂(t) ≤ 0.000117 for t = 1, ..., 500. Since the residuals are treated as normally distributed, we can build a 95 % confidence band for the mean function with a coverage factor of 1.96, given by μ̂(t) ± 1.96 σ̂_μ̂(t). In addition, the variance of a prediction of X(t) is given by σ̂²_predict = σ̂²_μ̂(t) + σ̂²_e. A prediction band with a coverage factor of 1.96 is given by μ̂(t) ± 1.96 σ̂_predict. In Figure 4, the data, the estimated mean function, the 95 % confidence intervals (CI) for the mean function and the prediction intervals (PI) for new X(t) are shown. The plot demonstrates the large noise variance of the data.
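The GLS machinery in (18)–(21) takes only a few lines of linear algebra. The sketch below is ours: it uses simulated data with AR(1)-type noise as a stand-in for the instrument-noise measurements, a Toeplitz correlation matrix built from the truncated sample ACF as the estimate R̂, and arbitrary parameter values and seed.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 500
t = np.arange(1, N + 1)
T = np.column_stack([np.ones(N), t])

# surrogate data: linear trend plus AR(1) noise (stand-in for the instrument noise)
alpha, beta, sigma_e, phi = 1.563, -1.1e-5, 1.1e-3, 0.8
e = np.zeros(N)
for k in range(1, N):
    e[k] = phi * e[k - 1] + sigma_e * np.sqrt(1 - phi**2) * rng.standard_normal()
x = alpha + beta * t + e

# LS fit and residual autocorrelations, truncated to zero beyond lag 55
B_ls, *_ = np.linalg.lstsq(T, x, rcond=None)
r = x - T @ B_ls
acf = np.array([1.0] + [r[k:] @ r[:-k] / (r @ r) for k in range(1, 56)])
rho = np.zeros(N)
rho[:56] = acf
R_hat = rho[np.abs(np.subtract.outer(t, t))]     # Toeplitz estimate of R
# (in practice R_hat should be checked or regularised to be positive definite)

# GLS estimate (18), noise standard deviation (20), covariance matrix (19)
Ri = np.linalg.inv(R_hat)
B_gls = np.linalg.solve(T.T @ Ri @ T, T.T @ Ri @ x)
res = x - T @ B_gls
sigma_hat = np.sqrt(res @ Ri @ res / (N - 2))
covB = np.linalg.inv(T.T @ Ri @ T) * sigma_hat**2

# time-dependent standard uncertainty of the estimated mean function (21)
u_mu = np.sqrt(covB[0, 0] + t**2 * covB[1, 1] + 2 * t * covB[0, 1])
ci_lower, ci_upper = T @ B_gls - 1.96 * u_mu, T @ B_gls + 1.96 * u_mu
```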


6. Conclusions

In this paper, we discuss the evaluation of the uncertainty of measurements from general autocorrelated processes. For measurements from a stationary process, the evaluation is done by an extension of the methods for measurements from an i.i.d. sequence. For measurements from a nonstationary process, in general, measurements from multiple realizations of the process are needed. We also discuss the uncertainty assessment associated with the estimates of the mean function for some particular nonstationary processes, for which measurements from a single realization are used.

References

1. ISO/IEC Guide 98-3, Uncertainty of Measurement – Part 3: Guide to the Expression of Uncertainty in Measurement (GUM: 1995).
2. ISO/TR 13587, Three statistical approaches for the assessment and interpretation of measurement uncertainty (2012).
3. N. F. Zhang, Calculation of the uncertainty of the mean of autocorrelated measurements, Metrologia, 43, S276-S281 (2006).
4. N. F. Zhang, Allan variance of time series models for measurement data, Metrologia, 45, 549-561 (2008).
5. A. Samarov and M. S. Taqqu, On the efficiency of the sample mean in long-memory noise, Journal of Time Series Analysis, 9, 191-200 (1988).
6. E. J. Hannan, Multiple Time Series (Wiley, New York, 1970).
7. L. H. Koopmans, The Spectral Analysis of Time Series (Academic Press, San Diego, 1995).
8. C. Elster and A. Link, Uncertainty evaluation for dynamic measurements modelled by a linear time-invariant system, Metrologia, 45 (2008).
9. J. P. Hessling, Dynamic metrology – an approach to dynamic evaluation of linear time-invariant measurement systems, Measurement Science and Technology, 19 (2008).
10. E. Parzen, Stochastic Processes (Holden Day, Oakland, CA, 1962).
11. S. Karlin and H. M. Taylor, A First Course in Stochastic Processes, 2nd ed. (Academic Press, Boston, 1975).
12. W. A. Woodward, H. L. Gray and A. C. Elliott, Applied Time Series Analysis (CRC Press, Boca Raton, 2012).
13. A. W. Lo and A. C. MacKinlay, The size and power of the variance ratio test in finite samples: a Monte Carlo investigation, Journal of Econometrics, 40, 203-238 (1989).


14. C. R. Rao and H. Toutenburg, Linear Models: Least Squares and Alternatives (Springer, New York, 1995).
15. N. F. Zhang, The uncertainty associated with the weighted mean of measurement data, Metrologia, 43, 195-204 (2006).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 153–161)

Dynamic measurement errors correction in sliding mode based on a sensor model*

M. N. Bizyaev† and A. S. Volosnikov

Information and Measurement Technology Department, School of Electrical Engineering and Computer Science, South Ural State University (National Research University), Chelyabinsk, Russian Federation
†E-mail: [email protected]
www.susu.ru/en

The paper considers the sliding mode theory-based method of recovering an input signal of measuring systems. The high efficiency of the method is demonstrated in reducing the dynamic measurement error.

Keywords: Dynamic Measurement Error, Dynamically Distorted Signal Recovery, Sliding Mode, Measuring System.

* The work was supported by Act 211 of the Government of the Russian Federation, contract № 02.A03.21.0011.

1. Introduction

Significant dynamic errors of measuring systems are a common problem. In general, recovering an input signal dynamically distorted by a measuring device is regarded as an inverse problem of measurement science. The well-known approach [1] for the measured signal recovery is based on the numerical solution of the Fredholm integral equation of the first kind. To remedy the ill-posedness of the original problem, special regularization methods are applied. The recovery of measured signals by methods of automatic control theory is discussed in [2] and [3]. It is known that systems in sliding modes show high dynamic accuracy and reduced perturbation sensitivity [4]. Therefore, it is advisable to use sliding modes in dynamic measuring systems for the recovery of input measured signals. The scientific novelty of the approach is the use of the sliding mode in a measuring system with a sensor model [3].


2. Measuring system in sliding mode

2.1. Measuring system synthesis

Let a primary measuring transducer (sensor) be represented by the following differential equations:

ẋ_1 = x_2,
ẋ_2 = x_3,
. . .
ẋ_n = U − a_0 x_1 − a_1 x_2 − … − a_{n−1} x_n,
Y = b_0 x_1 + b_1 x_2 + … + b_m x_{m+1},   (1)

where U and Y are the input and output signals of the sensor; a_0, a_1, ..., a_{n−1} and b_0, b_1, ..., b_m are constant coefficients; x_1, x_2, ..., x_n are the sensor state variables. Consider the measuring system (MS) with both the sensor and its model as a real module [2]. Let the differential equations of the sensor model be identical to the differential equations of the sensor (1):

ẋ_{1M} = x_{2M},
ẋ_{2M} = x_{3M},
. . .
ẋ_{nM} = U_M − a_0 x_{1M} − a_1 x_{2M} − … − a_{n−1} x_{nM},
Y_M = b_0 x_{1M} + b_1 x_{2M} + … + b_m x_{m+1,M},   (2)

where U_M and Y_M are the input and output signals of the sensor model; x_{1M}, x_{2M}, ..., x_{nM} are the sensor model state variables. In order to implement a sliding mode in the system, let us choose the following sliding surface:

S = Y − Y_M.   (3)

When the MS moves in the sliding mode S = 0, the output signals of the sensor Y and of the sensor model Y_M are equal. Since the sensor and its model are described by identical differential equations, their input signals are also equal, U = U_M. This makes it possible to determine the input signal U of the MS from the signal U_M of the model. The sliding mode is implemented under the following conditions [4]: if S > 0, the derivative Ṡ must be negative (Ṡ < 0), while if S < 0, the derivative Ṡ must be positive. Differentiating (3), we obtain the following equality:

Ṡ = Ẏ − Ẏ_M.   (4)

Substituting the derivatives Ẏ and Ẏ_M from (1) and (2) into (4) and rearranging results in the following equation:

Ṡ = (ẋ_1 − ẋ_{1M}) b_0 + (ẋ_2 − ẋ_{2M}) b_1 + … + (ẋ_{m+1} − ẋ_{m+1,M}) b_m.   (5)

On the basis of the suggested sliding surface, a MS is constructed. The block diagram of the system is shown in Figure 1.

Fig. 1. Block diagram of the measuring system in the sliding mode.

To provide the sliding surface S = 0 in the proposed block diagram, similarly to [2], a sensor and a correction unit containing the sensor model are used. The correction unit in this case serves as a follow-up system, which reduces Y_M at S = Y − Y_M < 0, providing the condition Ṡ > 0, and increases Y_M at S = Y − Y_M > 0 with this difference approaching zero, providing the


condition Ṡ < 0. As a result, Y_M tends to Y and U_M tends to U, respectively. When recovering the measured signal, regularization lies in the structure of the MS, since the sensor model with its multiple feedback acts as a low-pass filter for the high-frequency noise.
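As a concrete illustration of the sensor description (1), the following Python fragment (ours, not part of the paper; the numerical coefficients are arbitrary placeholders) assembles the companion-form state matrices and integrates a step response with a simple Euler scheme.

```python
import numpy as np

def sensor_state_space(a, b):
    """Companion-form matrices for the sensor equations (1):
    a = [a_0, ..., a_{n-1}], b = [b_0, ..., b_m] with m + 1 <= n.
    Returns A, B, C such that x' = A x + B U and Y = C x."""
    n, m = len(a), len(b) - 1
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)          # x_i' = x_{i+1}
    A[-1, :] = -np.asarray(a, float)    # x_n' = U - a_0 x_1 - ... - a_{n-1} x_n
    B = np.zeros((n, 1)); B[-1, 0] = 1.0
    C = np.zeros((1, n)); C[0, :m + 1] = b   # Y = b_0 x_1 + ... + b_m x_{m+1}
    return A, B, C

# hypothetical second-order example: unit step response by Euler integration
A, B, C = sensor_state_space(a=[1.0, 1.4], b=[1.0])
dt, x, y = 1e-3, np.zeros((2, 1)), []
for _ in range(5000):
    x = x + dt * (A @ x + B * 1.0)      # input U = 1
    y.append((C @ x).item())
```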

2.2. Elimination of self-oscillations in the nonlinear closed loop of the measuring system

In a closed circuit with a sensor model and a nonlinear element, self-oscillations occur, impairing the dynamic accuracy and leading to failure of the sliding mode. Consider the closed loop shown in Figure 2 with the sensor model of n-th order

W_sm(p) = (b_m p^m + b_{m−1} p^{m−1} + … + b_1 p + b_0) / (p^n + a_{n−1} p^{n−1} + … + a_1 p + a_0) = R(p)/Q(p)   (6)

and a switching element.

Fig. 2. Nonlinear closed loop.

Suppose the differential equation of the linear part of the system is as follows:

Q(p) x = −R(p) x_1.   (7)

The equation of the nonlinear element

x_1 = K sign(x)   (8)

in the oscillatory process after the harmonic linearization is as follows:

x_1 = ( q(A, w) + (q′(A, w)/w) p ) x,   (9)


where q and q are coefficients of the harmonic linearization, and A and w are amplitude and frequency of self-oscillations, respectively. Based on (7) and (9), the harmonically linearized characteristic equation can be written as

q  A, w   Q(p)  R(p) q A, w  p  0 . w  

(10)

We use a substitution p  jw in the harmonically linearized characteristic equation (10). That results in the following equation:

Q jw  R jwq A,w  jq  A,w  0 .

(11)

Let us split (11) into real and imaginary parts:

X  A,w  jY  A,w  0 .

This implies two equations:

 X  A,w  0 ,  Y  A,w  0

(12)

(13)

being used to determine the unknown frequency w and amplitude A of the periodic solution. If the equations (13) have no positive real solutions for A and w , then there are no periodic solutions (and, therefore, self-oscillations) in this nonlinear system. Consider the case of the harmonic linearization of a second-order system with a transfer function of the linear component: Wsm  p  

b2 p 2  b1 p  b0 . p 2  a1 p  a 0

(14)

The coefficients of the harmonic linearization [5] for an ideal relay are as follows:

q = 4C/(πA),   (15)
q′ = 0,   (16)

where C is the relay element coefficient and A is the desired amplitude. Using (11), we obtain the characteristic equation of the linearized system:

(−w² + a_1 jw + a_0) πA + (−b_2 w² + b_1 jw + b_0) 4C = 0.   (17)

Splitting (17) into real and imaginary parts results in the following system:

(−w² + a_0) πA + (−b_2 w² + b_0) 4C = 0,
a_1 w πA + b_1 w 4C = 0.   (18)

The second equation of the system (18) implies the equation

a_1 w πA + b_1 w 4C = 0,   (19)

which has the unique solution w = 0 for positive A (since a_1 πA + 4C b_1 > 0). Thus, the presented system of equations has no nonzero solution and, correspondingly, there are no self-oscillations. The above example demonstrates that there are no self-oscillations for any parameters of the sensor model in a second-order closed circuit with a relay element. In loops of higher order, self-oscillations commonly occur. It is possible to eliminate self-oscillations by diminishing the model order to the second one (by reduction). However, the difference between the original and the reduced models leads to additional errors in recovering a signal. In this connection, a very important issue is the choice of the reduction method and of the criterion upon which the reduction of the order is based. The analysis of well-known works [6]–[10] shows that the method which utilizes the frequency response values over the spectrum range of the measured impact as a basis for the reduction criterion is optimal for solving this problem [11]. It is possible to increase the dynamic accuracy of the MS further by using the values of the sensor state variables.

3. Dynamic measurements data processing

3.1. Simulation study

In order to illustrate the efficiency of the proposed correction method, we consider the sensor described by the second-order transfer function

W(p) = 1 / (T_1² p² + 2ξT_1 p + 1)   (20)

with the time constant T_1 = 0.01 s and the damping coefficient ξ = 0.7. The measured harmonic signal U(t) = A sin(ωt) is sent to the sensor input with the amplitude A = 1 and frequency ω = 200 rad/s. To make the simulation conditions close to the real ones, a noise perturbation is generated at the sensor output as a harmonic signal U_n(t) = A_n sin(ω_n t) with the amplitude A_n = 0.02A


and the frequency ω_n = 25ω. Since the signal at the output of the sensor has high-frequency components, a low-pass filter is used to remove them. Figure 3 shows the results of the computational simulation. The amplitude of the sensor output signal Y(t) is significantly smaller than that of the input signal U(t). After correction there is almost complete dynamic recovery of the measured signal, that is, amplitude coincidence with only a small phase shift between the sensor input U(t) and the recovered signal U_M(t) once the transient process has finished.
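The attenuation visible in Figure 3 can be checked directly from the transfer function (20). The short Python sketch below (ours) evaluates the gain and phase of W(jω) at the signal and noise frequencies used in the simulation:

```python
import numpy as np

T1, xi = 0.01, 0.7

def W(p):
    # transfer function (20): W(p) = 1 / (T1^2 p^2 + 2 xi T1 p + 1)
    return 1.0 / (T1**2 * p**2 + 2 * xi * T1 * p + 1)

for omega in (200.0, 25 * 200.0):       # signal frequency and noise frequency
    H = W(1j * omega)
    print(f"omega = {omega:6.0f} rad/s   |W| = {abs(H):.4f}   "
          f"phase = {np.degrees(np.angle(H)):7.1f} deg")
```

For the stated parameters the gain at ω = 200 rad/s is about 0.24, i.e. the sensor passes only roughly a quarter of the input amplitude, which is what the correction in the sliding mode is intended to recover.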

Fig. 3. Simulation results.

3.2. Experimental study

When using the method of sliding modes for dynamic temperature measurement, the measurement time is reduced several times, allowing the use of, for example, sheathed thermocouples and the tracking of the dynamics of fast processes. The transient response of the heating of the thermoelectric transducer "Metran 251-01" was recorded using the assembled experimental facility. From the recorded time dependence, after identification of the transient response, the second-order transfer function was obtained:

W(p) = (0.286·10⁻² p + 0.527·10⁻³) / (p² + 0.546·10⁻¹ p + 0.527·10⁻³).   (21)

The results of experimental data processing using the developed method are shown in Figure 4. They demonstrate the measurement time reduction from


250 s for the transducer output signal Y_Real(t) to 40 s for the recovered signal U_MF(t). This proves the efficiency of the proposed approach.

Fig. 4. Experimental results.

4. Conclusions

The described approach to recovering the dynamically distorted signals of MSs based on a sensor model in the sliding mode showed high efficiency. The sliding mode implementation in this approach is effective for a sensor and its second-order model. If a sensor model is of higher than second order, the system generates self-oscillations (a phenomenon referred to as 'chattering' [5]), which leads to leaving the sliding mode. In addition, the sensor is affected by noise which, as a result of the signal recovery, is amplified. These two factors require a filter to be selected and its parameters configured. The proposed method allows dynamically distorted signals to be recovered and the dynamic accuracy of measurements to be significantly increased in the presence of a noise component. The future research direction is associated both with increasing the dynamic accuracy of the MS based on the sensor state variable values and with developing the model of the dynamic measurement error evaluator [2], [3] in the sliding mode. Using the additional information from the evaluator output as a criterion for the minimization of dynamic errors will improve the control of additive noise in the sensor output signal.


References

1. A. N. Tikhonov and V. Y. Arsenin, Solution of Ill-posed Problems (V. H. Winston & Sons, Washington, 1977).
2. A. L. Shestakov, Dynamic error correction method, IEEE Transactions on Instrumentation and Measurement 45 (1), 250 (1996).
3. A. L. Shestakov, Dynamic measurements based on automatic control theory approach, in Advanced Mathematical and Computational Tools in Metrology and Testing X, eds. F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes (World Scientific Publishing Company, Singapore, 2015), pp. 66–77.
4. V. Utkin, J. Guldner and J. Shi, Sliding Mode Control in Electromechanical Systems (Taylor & Francis, Philadelphia, 1999).
5. W. S. Levine (ed.), The Control Handbook, Second Edition: Control System Fundamentals (CRC Press, 2010).
6. B. C. Moore, Singular value analysis of linear systems. Part I, Dep. Elec. Eng., Univ. Toronto, Toronto, Ont., Canada, Syst. Contr. Rep. 7801, July 1978; and Singular value analysis of linear systems. Part II, Dep. Elec. Eng., Univ. Toronto, Toronto, Ont., Canada, Contr. Rep. 7802 (1978).
7. S.-Y. Kung, A new identification and model reduction algorithm via singular value decompositions, in Proc. 12th Asilomar Conf. Circuits, Syst., Comput. (1978), pp. 705–714.
8. S.-Y. Kung and D. W. Lin, A state-space formulation for optimal Hankel-norm approximations, IEEE Trans. Automat. Contr. 26 (4), 942 (1981).
9. L. M. Silverman and M. Bettayeb, Optimal approximation of linear systems, presented at the JACC, San Francisco, CA (1980).
10. E. I. Verriest and T. Kailath, On generalized balanced realizations, IEEE Trans. Automat. Contr. 28 (8), 833 (1983).
11. C. Villemagne and R. Skelton, Model reduction using a projection formulation, International Journal of Control 46 (6), 2141 (1987).
12. V. Utkin and H. Lee, Chattering problem in sliding mode control systems, in Proc. IEEE Int. Workshop on Variable Structure Systems (VSS'06) (Alghero, Italy, 2006).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 162–169)

The Wiener degradation model with random effects in reliability metrology*

E. S. Chetvertakova† and E. V. Chimitova

Novosibirsk State Technical University, Novosibirsk, Russia
†E-mail: [email protected]
www.nstu.ru

In this paper, we consider the application of the Wiener degradation model for the assessment of the reliability function. This model is based on the assumption that degradation increments are independent normally distributed random variables. On the basis of the Wiener model, it is possible to predict the reliability (the probability of non-failure operating) of tested items during some given period of time. Using the computer simulation method, we investigate the accuracy of estimates of the model parameters and the reliability function. Moreover, we have considered an example of constructing the Wiener degradation model for semiconducting laser modules data.

Keywords: degradation process, Wiener degradation model, random effect, maximum likelihood estimation, reliability, testing goodness-of-fit.

* This work is supported by the Russian Ministry of Education and Science (project № 1.1009.2017/4.6).

1. Introduction

There are two ways of obtaining the reliability estimate: the first one is based on the failure information only, and for the second way we should use all information about the degradation of tested items. These approaches can be combined if the degradation process and failure occurrence are observed under high stress. In this case, it is necessary to measure the degradation index of the tested items during the investigation period, together with the time moment when the degradation index achieves the critical value (threshold), which is taken as the failure time [1]. The degradation models are used for the analysis and the further research of degradation data. In our previous works, we considered different types of degradation models, such as models with covariates and random effect. The gamma degradation model, which was investigated by us in [2-3], is widely used


in practice [4-8]. However, the gamma degradation model is not appropriate for reliability analysis if the increments of the degradation index are not all positive [9]. For that case, it is proposed to apply the Wiener degradation model [10]. This model uses the normal distribution as the distribution of the degradation index increments, and the trend function can be included in the shift parameter. The main advantage of the Wiener model is that, as for the gamma model, the reliability function can be obtained analytically.

2. The Wiener degradation model

A stochastic process Z(t) characterizes a degradation process if:
• Z(0) = 0;
• Z(t) is a stochastic process with independent increments.
For the Wiener degradation model the increments ∆Z(t) = Z(t + ∆t) − Z(t) have the normal distribution with probability density function

f_Norm(t; θ_1, θ_2) = (1 / (√(2π) θ_2)) exp( −(t − θ_1)² / (2 θ_2²) ),

where θ_1 = γ(ν(t + ∆t) − ν(t)) is the shift parameter, θ_2 = σ is the scale parameter, and ν(t) is a positive increasing function.

Let the degradation process be observed under a stress (covariate) x that is constant in time, the range of values of which is defined by the conditions of the experiment. There are various ways to parameterize the dependence of the degradation path on covariates. For example, we can assume that the covariate influences the degradation as in the accelerated failure time model [11]:

Z_x(t) = Z( t / r(x; β) ),

where r(x; β) is the positive covariate function and β is the vector or scalar regression parameter. Denote the mathematical expectation of the degradation process Z_x(t) by

M[Z_x(t)] = m_x(t),

where m_x(t) = γ ν(t / r(x; β)); ν(·) is a trend function of the degradation process and γ is the vector or scalar trend parameter. Taking into account these assumptions, the stochastic process Z_x(t) at the time moment t = t_k has the normal distribution with shift parameter equal to m_x(t_k; γ, β). The time to failure, which depends on the covariate x, is equal to


T_x = sup{t : Z_x(t) < z}, where z is the critical value of the degradation path. Then, the reliability function for the Wiener degradation model is given by:

S_x(t) = P{T_x > t} = P{Z_x(t) < z} = Φ( (z − m_x(t; γ, β)) / σ ),

where Φ(·) is the cumulative distribution function of the standard normal distribution. The random effect can be included in the model by considering the parameter γ as a random variable from the truncated normal distribution with the density function

f_trunc(t; μ, δ) = f_Norm(t; μ, δ) / (1 − F_Norm(0; μ, δ)),

where μ is a shift parameter and δ is a scale parameter. Assume that we have the degradation path Z^i(t) and covariate value x^i for n items, i = 1, ..., n. The degradation path for the i-th item is given by

Z^i = {(0, Z_0^i), (t_1^i, Z_1^i), ..., (t_{k_i}^i, Z_{k_i}^i)},

where k_i is the number of moments of measuring degradation. Suppose that the initial value of the degradation index is Z_0^i = 0, i = 1, ..., n. Denote the sample of increments of the degradation path by

X_n = { (X_j^1 = Z_j^1 − Z_{j−1}^1, j = 1, ..., k_1, x^1), ..., (X_j^n = Z_j^n − Z_{j−1}^n, j = 1, ..., k_n, x^n) }.

Following the assumption that the observed stochastic processes Z^i_{x^i}(t), i = 1, ..., n, are Wiener degradation processes with trend function m_x(t; γ, β), we can estimate the model parameters (the scale parameter σ, the trend parameter γ and the regression parameter β, as well as the parameters μ and δ) by maximizing the logarithm of the likelihood function:

ln L(X_n) = Σ_{i=1}^{n} ln ∫_0^∞ { Π_{j=1}^{k_i} (1/(√(2π) σ)) exp[ −(X_j^i − γ ∆ν_j^i)² / (2σ²) ] } · (1/(√(2π) δ)) exp[ −(γ − μ)² / (2δ²) ] / (1 − F_Norm(0; μ, δ)) dγ,

where ∆ν_j^i denotes the increment of the trend function ν(t / r(x^i; β)) between the measurement times t_{j−1}^i and t_j^i.
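A direct numerical implementation of this log-likelihood is straightforward. The following Python sketch is ours and rests on the parameterisation assumed in the reconstruction above (increments normal with mean γ·∆ν and standard deviation σ given γ, and γ truncated-normal); the trend increments ∆ν are taken as precomputed inputs, whereas in a full fit they would be recomputed from the trend and regression parameters.

```python
import numpy as np
from scipy import integrate, stats

def neg_log_likelihood(params, data):
    """params = (sigma, mu, delta); data = list of (dnu, dX) pairs, one per unit,
    where dnu are the trend-function increments and dX the observed increments."""
    sigma, mu, delta = params
    trunc = 1.0 - stats.norm.cdf(0.0, loc=mu, scale=delta)   # normalising constant
    logL = 0.0
    for dnu, dX in data:
        def integrand(gamma):
            # product of the normal densities of the increments, given gamma,
            # times the truncated-normal density of gamma
            # (for long paths the product may underflow; a log-scale version is preferable)
            dens = np.prod(stats.norm.pdf(dX, loc=gamma * dnu, scale=sigma))
            return dens * stats.norm.pdf(gamma, loc=mu, scale=delta) / trunc
        val, _ = integrate.quad(integrand, 0.0, np.inf)
        logL += np.log(val)
    return -logL   # minimise this, e.g. with scipy.optimize.minimize
```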

3. Investigation of the parameter estimation accuracy for the Wiener degradation model with random effects

It is natural that the degradation processes are different for various units. Thus, the construction of the degradation model with random effects seems to be reasonable. However, the random-effect Wiener degradation model is more complicated, and the dimension of the parameter vector is larger. So, we need to understand whether taking into account the unit-to-unit variability allows us to obtain more accurate estimates of the trend and regression parameters. By means of Monte-Carlo simulations, we have compared the accuracy of the estimates of the trend and regression parameters for the fixed-effect Wiener degradation model and the Wiener degradation model with random effects for different values of the parameter δ, when the data are generated from the random-effect model. We have considered the case of the exponential trend function and the loglinear stress function:

m_x(t; θ, γ, β) = γ ( e^{t·exp(βx)/θ} − 1 ).

The moments of measuring the degradation index for all items started from 0 and finished at 8500 hours, with a step equal to 250. The true values of the parameters for both models were taken equal to θ = 350, γ = 0.5 and β = 1. Let λ = (θ, γ, β) be the vector of trend and regression parameters. We have compared the accuracy of estimation of the parameter λ for the considered models, calculating the Euclidean norm of the relative deviation of the estimates from the true value:

∆ = ‖ ( (λ_1 − λ̂_1)/λ_1 , …, (λ_s − λ̂_s)/λ_s ) ‖,

where s is the dimension of the parameter vector λ. The result was averaged over M = 1000 samples:

∆̄ = (1/M) Σ_{i=1}^{M} ∆_i.

The obtained values of the relative accuracy ∆̄ of the estimates of the model parameters λ̂ for the fixed-effect and random-effect Wiener degradation models are presented in Table 1. As can be seen from Table 1, the following results have been obtained:
• When the unit-to-unit variability is small (parameter δ = 0.05), the estimates obtained for the fixed-effect model turned out to be more accurate than the ones obtained in the case of the random-effect model.
• For the parameter δ = 0.5, in the case of sample size n = 20 the better estimates have been obtained for the fixed-effect Wiener degradation model, and for the other sample sizes the better estimates have been obtained for the random-effect model. Such results can be explained by the fact that the number of unknown parameters in the random-effect model is larger: only the scale parameter σ of the degradation increments distribution and the trend and regression parameters are estimated in the case of the fixed-effect model,


but in the case of the model with random effects, the shift parameter μ and the scale parameter δ of the distribution of γ are estimated as well as the trend and regression parameters.
• For larger unit-to-unit variability (parameter δ = 1.0), it can be seen that all the estimates obtained for the Wiener degradation model with random effects are more accurate than the ones obtained for the fixed-effect model.

Table 1. The relative accuracy of parameter estimates for the Wiener degradation models in the case of the exponential trend function.

                                        Sample size
Wiener degradation model        20       50       100      200
δ = 0.05
  Fixed-effect model          0.0782   0.0413   0.0298   0.0103
  Random-effect model         0.1354   0.0970   0.0662   0.0250
δ = 0.5
  Fixed-effect model          0.0901   0.0630   0.0454   0.0239
  Random-effect model         0.1024   0.0517   0.0241   0.0107
δ = 1.0
  Fixed-effect model          0.1265   0.1093   0.0702   0.0540
  Random-effect model         0.1089   0.0599   0.0317   0.0095

4. The analysis of the lasers degradation data

We have considered the construction of the Wiener degradation model for estimating the reliability of the semiconducting laser module ILPN-134. The data include the degradation paths of 15 laser modules, which were divided into three groups of 5 items. The laser diodes were tested under the temperatures 70 °C, 80 °C and 90 °C in the three groups, respectively. It was necessary to maintain the power of emission equal to 3 mW during the 8500 hours of accelerated tests. The results of the experiment were given in [12]. An item fails when the current value is 20 % higher than the initial value. Thus, the degradation index is the value of the current at a fixed time moment of the experiment. The obtained increments of the degradation index are both positive and negative, so it is reasonable to use the Wiener degradation model for the further analysis. Let the trend function be an exponential function and the covariate function be a loglinear function:

m_x(t; θ, γ, β) = γ ( e^{t·exp(βx)/θ} − 1 ),

where γ and θ are the trend parameters and β is the regression parameter.


We have obtained the following values of the model parameter estimates: σ̂ = 1.12, β̂ = 0.97, θ̂ = 367.87, γ̂ = 0.49. The degradation paths and estimated trends for the different groups of the experiment are shown in Figures 1–3. After constructing the model, let us obtain the non-failure operating times for the considered lasers with a fixed probability (reliability estimate) and environmental temperature. The obtained non-failure operating times are given in Table 2. As can be seen from Table 2, the lasers operate without failure during 12500 hours (about one and a half years) under the stress of 70 °C, during 8420 hours (about a year) under the stress of 80 °C, and during 4200 hours (about half a year) under the stress of 90 °C, with high probability.


Fig. 1. The degradation paths for items tested under 70 °C.


Fig. 2. The degradation paths for items tested under 80 °C.


Fig. 3. The degradation paths for items tested under 90 °C.

Table 2. Non-failure laser operating times for the different experimental conditions.

Heating temperature   Reliability estimate   Non-failure operating time
70 °C                 0.996                  12500 hours
80 °C                 0.992                  8420 hours
90 °C                 0.983                  4200 hours

Conclusions

In this paper, we have considered the problems of constructing the Wiener degradation model with covariates. The comparison of the accuracy of the maximum likelihood estimates of the model parameters for the fixed-effect Wiener degradation model and the Wiener degradation model with random effects has been carried out. It has been shown that in the case of a smaller random effect the more accurate estimates are obtained for the fixed-effect Wiener degradation model, and in the case of larger unit-to-unit variability the better results are obtained for the Wiener degradation model with random effects. So, the fixed-effect Wiener degradation model can be recommended for small unit-to-unit variability. Then, we have constructed the degradation model for the degradation data of the semiconducting laser modules ILPN-134. We recommend using the Wiener degradation model with random effects for these data to obtain more accurate model parameter estimates.

Meeker W.Q. Statistical Methods for Reliability Data / W.Q. Meeker, L.A. Escobar. – New York: John Wiley and Sons, 1998. – 680 p.

169

2.

Chimitova E., Chetvertakova E. The construction of the gamma degradation model with covariates // Tomsk State University Journal of Control and Computer Science, 2014. – №4 (29) – pp.51-60 3. Chimitova E., Chetvertakova E. A Comparison of the “fixed-effect” and “random-effect” gamma degradation models // Applied Methods of Statistical Analysis. Nonparametric Approach - AMSA'2015, Novosibirsk, Russia, 14-19 September, 2015: Proceedings of the International Workshop. - Novosibirsk: NSTU publisher, 2015. - P. 161-168. 4. Tsai C.-C., Tseng S.-T., Balakrishnan N. Optimal design for degradation tests based on gamma processes with random effects // IEEE Transactions on reliability, 2012, Vol. 61, No. 2, P. 604 – 613. 5. Tang L.C. Planning of step-stress accelerated degradation test / L.C. Tang, G.Y. Yang, M. Xie. – Los Angeles: Reliability and Maintainability Annual Symposium, 2004. 6. Bordes L., Paroissin C., Salami A. Parametric inference in a perturbed gamma degradation process. Preprint/Statistics and Probability Letters, 2010, Vol. 13. 7. Lawless J., Crowder M. Covariates and Random Effects in a Gamma Process Model with Application to Degradation and Failure. Life Data Analysis, 2004, Vol. 10, P. 213-227. 8. Park C., Padgett W.J. Accelerated degradation models for failure based on geometric Brownian motion and gamma process. Lifetime Data Analysis, 2005. – 11. – P. 511–527. 9. Tsai C.-C., Tseng S.-T., Balakrishnan N. Mis-specification analysis of gamma and Wiener degradation processes // Journal of Statistical Planning and Inference, 2011. – 12. – P. 25-35. 10. Chetvertakova E., Chimitova E. The Wiener degradation model in reliability analysis // 11 International forum on strategic technology (IFOST 2016): proc., Novosibirsk, 1–3 June 2016. – Novosibirsk: NSTU, 2016. – Pt. 1. – P. 488-490. 11. Nikulin, M. and Bagdonavicius, V. Accelerated Life Models: Modeling and Statistical Analisys, Chapman & Hall/CRC, Boca Raton, 2001. 12. Zhuravleva O.V., Ivanov A.V., Kurnosov V.D., Kurnosov K.V., Romantsevich V.I., Chernov R.V. Reliability estimation for semiconductor laser module ILPN-134 / // Fizika i Tekhnika Poluprovodnikov, 2010, Vol. 44, No. 3, P. 377–382.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 170–177)

EIV calibration of gas mixture of ethanol in nitrogen ˇ s Stanislav Duriˇ Faculty of Mechanical Engineering, Slovak University of Technology, Bratislava, Slovakia E-mail: [email protected] ˇ sov´ Zuzana Duriˇ a Slovak Institute of Metrology, Bratislava, Slovakia E-mail: [email protected] Miroslav Dovica Faculty of Mechanical Engineering, Technical University of Koˇsice, Slovakia E-mail: [email protected] Gejza Wimmer Mathematical Institute, Slovak Academy of Sciences, Bratislava, Slovakia Faculty of Natural Sciences, Matej Bel University, Bansk´ a Bystrica, Slovakia E-mail: [email protected] The paper deals with the errors-in-variables (EIV) linear calibration of the gas mixture of ethanol in nitrogen with unknown concentration. Suggested is a new approach for determination of the unknown value of the ethanol concentration in the gas mixture using the calibration line obtained by modeling the calibration experiment as an EIV model. Keywords: Linear calibration; gas mixture; uncertainty evaluation.

1. Introduction In statistics, errors-in-variables (EIV) models or measurement error models are regression models that account for measurement errors in the dependent as well as the independent variables. The present paper deals with EIV calibration model of the gas mixture of ethanol in nitrogen with unknown concentration. Suggested is an approach using the calibration curve obtained by modeling the calibration experiment as an EIV model 1–3 . In particular, the linear EIV models can be represented by the linear 170

Table 1. Ethanol concentrations µ1, µ2, µ3, µ4, µ5, µ6 in six reference gas mixtures of ethanol in nitrogen used for calibration, determined by gravimetric method.

Quantity   Estimate     Standard uncertainty   Type of evaluation   Probability distribution
µ1         0.00008127   0.00000026             Type A               normal
µ2         0.0001508    0.0000072              Type A               normal
µ3         0.0002594    0.0000024              Type A               normal
µ4         0.0004887    0.0000020              Type A               normal
µ5         0.0006486    0.0000056              Type A               normal
µ6         0.0007976    0.0000042              Type A               normal

regression model with type-II constraints, which allows one to derive the best linear unbiased estimator (BLUE) of the required model parameters 2–5. The first step in each calibration is the task of fitting the calibration curve, i.e. finding the best estimates of the calibration curve parameters and their uncertainties, based on a well-designed calibration experiment. The calibration curve expresses the functional relationship between the ideal (errorless) measurements of the same object by two measurement techniques, say ν = f(µ). Comparative calibration deals with the situation when one instrument or measurement technique is calibrated against another and both are subject to measurement errors. Gas mixtures of the ethanol concentration in nitrogen are used for the calibration 6–8. For such calibration we use six certified reference materials — mixtures with specified concentrations. These gas mixtures are the basis for evaluation of the calibration curve (linear and/or quadratic). The concentrations are determined by using the Nondispersive Infrared Spectrometer (NDIR). A particular value of the signal belongs to each concentration of ethanol in the mixture. Afterwards the NDIR can be used to measure the unknown concentration of ethanol in the sample of mixture. The output signal value is the result of 90 readings, from which the uncertainty is evaluated by the type A method. The uncertainty evaluated by the type B method consists of two components, from the calibration of the multimeter (normal distribution) and from the multimeter resolution (rectangular distribution). For simplicity and illustration purposes, here we assume only a linear relationship (linear calibration curve) between the two measurement techniques (instruments). Further, here we shall assume that the calibration experiment is specified by the state-of-knowledge distributions of the measurand(s), measured by both measurement techniques, in a specific number of calibration points, i = 1, . . . , m. This allows one to incorporate and combine

Table 2. Ethanol concentrations ν1, ν2, ν3, ν4, ν5, ν6 in six reference gas mixtures of ethanol in nitrogen used for calibration, determined by using the nondispersive infrared spectrometer.

Quantity   Estimate      Standard uncertainty   Type of evaluation   Probability distribution
ν1         0.000081027   0.000000013            Type A               normal
                         0.00000037             Type B               rectangular
ν2         0.000150957   0.000000019            Type A               normal
                         0.00000036             Type B               rectangular
ν3         0.00026165    0.00000012             Type A               normal
                         0.00000050             Type B               rectangular
ν4         0.000488587   0.000000016            Type A               normal
                         0.00000076             Type B               rectangular
ν5         0.00064851    0.00000014             Type A               normal
                         0.0000010              Type B               rectangular
ν6         0.00079307    0.00000016             Type A               normal
                         0.0000013              Type B               rectangular

full information about the measurement uncertainties based on type A as well as type B evaluations. Table 1 reports the ethanol concentrations in six reference gas mixtures of ethanol in nitrogen determined by gravimetric method. Table 2 reports the ethanol concentrations in six reference gas mixtures of ethanol in nitrogen determined by NDIR. 2. EIV model as a mathematical-statistical model for linear calibration Let Xi and Yi be the random variables representing our current stateof-knowledge (probability distributions) about the values which could be reasonably attributed to the measurand at the i-th calibration point, say µi and νi , based on direct measurements by two instruments or measurement techniques and using the expert judgment (here µi represents the true value of measurand in units of the instrument X and νi represents the true value of measurand in units of the instrument Y). Further, let xi = E(Xi ) and yi = E(Yi ) denote the best estimates 9 of the measurand at the i-th experimental point, and ξi and ηi represent the random variables with a known centered (zero-mean) probability distribution based on our current state-of-knowledge, such that Xi = xi + ξi and Yi = yi + ηi , i = 1, . . . , m. Assuming a linear functional relationship between the true measurand values, νi = a + bµi for all i = 1, . . . , m, brings additional information in a form of constraints, which should be used to improve our knowledge about the measurand values µi and νi , and consequently about the calibration

173

curve parameters a and b. Let ξ µ and η ν represent the random variables modeling the measurement process with their observed values (realizations) ξ µ(real) = x = (real)

(x1 , . . . , xm )′ and η ν

= y = (y1 , . . . , ym )′ , such that

ξ µ = µ + ξ,

and

η ν = ν + η,





(1) ′

where µ = (µ1 , . . . , µm ) , ξ = (ξ1 , . . . , ξm ) , ν = (ν1 , . . . , νm ) , η = (η1 , . . . , ηm )′ , and E(ξ µ ) = µ,

E(η ν ) = ν,

represent the true unknown values of the measurands expressed in units of the two measuring instruments or measuring techniques, ξ µ and η ν are uncorrelated with their covariance matrices cov(ξ µ ) = Σµ and cov(η ν ) = Σν , respectively. In our example we get Σµ = Diag(u2A,µ1 , ..., u2A,µm ), Σν = Diag(u2A,ν1 +u2B,ν1 , ..., u2A,νm +u2B,νm ), where uA,µi is the standard uncertainty of the measurement result xi and and u2B,νi are he standard uncertainties of the measurement result yi , derived by Type A and Type B evaluations. The additional information (the calibration curve is linear) is given by the relation

A,νi

ν = a1 + bµ,

(2)

where a and b are parameters of the calibration line. This means that the relations between parameters a, b, µ1 , ..., µm , ν1 , ..., νm are nonlinear. The nonlinear conditions on parameters (2) we shall linearize using Tay(0) (real) (real) (0) (real) (0) lor expansion in values µ1 = ξµ1 , µ2 = ξµ2 , ..., µm = ξµm and Pm (0) (real) Pm (0) Pm (real) m i=1 µi ηνi − i=1 µi i=1 ηνi b(0) = Pm Pm (0) 2 (0) 2 m i=1 (µi ) − ( i=1 µi )

and neglect the terms of order higher than 1. So the nonlinear conditions on parameters (2) become     .. (0) . a ∆µ(0) (0) (0) (0) . , (3) + (1.µ ) 0 = b µ + (b I. − I) ∆b(0) ν (0)

(0)

where µ(0) = (µ1 , . . . , µm )′ and ∆µ(0) = µ − µ(0) . Further we denote the matrices . . (0) (0) (b(0) I.. − I) = B1 and (1..µ(0) ) = B2 . (4)

174

This model of calibration can be rewritten as ξ µ − µ(0) = ∆µ(0) + ξ,

ην = ν + η

(5)

where ξ = (ξ1 , . . . , ξm )′ and η = (η1 , . . . , ηm )′ , and E(ξ µ − µ(0) ) = ∆µ(0) , cov



ξ µ − µ(0) ην



=

E(η ν ) = ν, 

Σµ 0 0 Σν



and the unknown parameters here are ∆µ(0) , ν, a, ∆b(0) . The model (5) with constraints (3) is in statistics known as the linear regression model with type-II constraints 5 . The best linear unbiased estimators (BLUEs) of the unknown parameters are 4,5 !  (0)   d  ∆µ   Σµ 0 (0) ′ (0) (0)   ν B1 Q11  (0)  b ! = −  0 Σν b + (0)   b (0)   a Q21 (0) c ∆b 

    Σµ 0 (0) ′ (0) (0) I − 0 Σ B1 Q11 B1  ξ µ − µ(0)   ν ην (0) (0) −Q21 B1 

(6)

where b(0) = b(0) µ(0) and (0) Q11 (0) Q21

(0) Q12 (0) Q22

!



(0) B1

=



−1  Σµ 0 (0) ′ (0) B1 B2  0 Σν  . (0) ′

B2

0

3. The iteration procedure Now we proceed the iteration procedure to obtain the BLUE of parameters of interest a, b. We put (1)

µi

}| { (0) (0) (1) d µi = µi + ∆µi +∆µi , i = 1, 2, ..., m z

and

z

b=b

(0)

b(1)

}| { (0) c + ∆b +∆b(1) .

After the first iteration the calibration model is given by equations (5) with constraints (3), where b(1) is substituted instead b(0) , µ(1) instead µ(0) , ∆µ(1) instead ∆µ(0) and ∆b(1) instead ∆b(0) . The BLUE estimator of the

175 (1)

c , respectively, where b parameters a and b is b a(1) and b(0) + ∆b a(1) and (1) c ∆b is given by (6) with the same substitution (b(1) is substituted instead

b(0) , µ(1) instead µ(0) , ∆µ(1) instead ∆µ(0) and ∆b(1) instead ∆b(0) ). After some few iterations (according to our opinion 5-8 steps) the estimates converge. Let it happen in k−th step. So the finally obtained (k) c . The covariance matrix of the BLUEs are b a=b a(k) and bb = b(k−1) + ∆b estimators b a, bb is 4,5   a ˆ (k) (7) cov ˆ = Q22 . b 4. Estimation of the linear calibration line parameters

In our example of linear calibration of ethanol concentration in nitrogen the state-of-knowledge distributions of measurement errors used in the EIV model are given in Table 1 and Table 2. That is ξi is normally distributed random variable, ξi ∼ N (0, u2A,i ) for all i = 1, . . . , m with m = 6, and uA,i denotes the standard uncertainty of type A given in Table 1. Similarly, ηi is a random variable with convolution type distribution, ηi = ζi + δi , with normally distributed ζi ∼ N (0, u2A,i ) and uniformly distributed δi ∼ √ √ R(− 3uB,i , 3uB,i ), where uA,i denotes the standard uncertainty of type A and uB,i denotes the standard uncertainty of type B, given in Table 2. By using the last iteration of (6) we get the best estimates of the calibration curve parameters a ˆ = −0.000000080159356 with its standard uncertainty uaˆ = 0.000000603835713 and ˆb = 0.998950697102301 with its standard uncertainty uˆb = 0.003713198513815, with the covariance matrix     a ˆ 0.000000003646176 −0.000015693371048 −4 cov ˆ = 10 × . (8) −0.000015693371048 0.137878432029965 b

The marginal state-of-knowledge distributions of a ˆ and ˆb are computed by numerical inversion of their characteristic functions 10,11 , which are derived from the coefficients of equation (6) and distributional specifications of ξi and ηi , i = 1, . . . , m. With given covariance matrix (8) we can further generate random sample from the joint state-of-knowledge distribution using Gaussian copula 12 , see Figure 1. 5. Conclusions

In this paper, we suggested a new method of linear calibration based on EIV calibration model with an illustrative real data example from calibration of

176

Joint distribution of the calibration line parameters (a,b) 1.015

1.01

1.005

b

1

0.995

0.99

0.985

0.98 -2.5

-2

-1.5

-1

-0.5

0

a

0.5

1

1.5

2 10-6

Fig. 1. Random sample of size n = 10000 from the joint bivariate copula-type distribution, specified by the Gaussian copula with given correlation matrix R (with ̺ = −0.7), and the convolution-type marginal state-of-knowledge distributions of the input quantities a and b.

the ethanol concentration in nitrogen. Using the joint distribution of the calibration curve parameters we can further obtain the coverage interval for the future observations of the ethanol concentration measured by NDIR.

Acknowledgements The work was partly supported by the Slovak Research and Development Agency, projects APVV-15-0295 and APVV-15-0149, and the Scientific Grant Agency of the Ministry of the Ministry of Education of the Slovak Republic and the Slovak Academy of Sciences, projects VEGA 2/0047/15, VEGA 1/0748/15, VEGA 1/0556/18 and KEGA 006STU-4/2018.

177

References 1. G. Casella and R. L. Berger, Statistical Inference (Duxbury Pacific Grove, CA, 2002). 2. G. Wimmer and V. Witkovsk´ y, Univariate linear calibration via replicated errors-in-variables model, Journal of Statistical Computation and Simulation 77, 213 (2007). ˇ s, Evaluation of 3. G. Wimmer, R. Palenˇc´ar, V. Witkovsk´ y and S. Duriˇ the Gauge Calibration: Statistical Methods for Uncertainty Analysis in Metrology (In Slovak) (Slovak Technical University, Bratislava, 2015). 4. L. Kub´ aˇcek, Foundations of Estimation Theory (Elsevier, 2012). 5. E. Fiˇserov´a, L. Kub´ aˇcek and P. Kunderov´a, Linear Statistical Models: Regularity and Singularities (Academia, 2007). 6. ISO 6143:2001, Gas analysis comparison methods for determining and checking the composition of calibration gas mixtures, International Organization for Standardization (ISO), Geneva (2001). ˇ sov´a, S. Duriˇ ˇ s and R. Palenˇc´ar, Gravimetric preparation of 7. Z. Duriˇ primary gas mixtures with liquid component in air matrix, in MEASUREMENT 2015, Proceedings of the 10th International Conference on Measurement, (Smolenice, Slovakia, May 25-28, 2015). 8. M. Milton, F. Guenther, W. Miller and A. Brown, Validation of the gravimetric values and uncertainties of independently prepared primary standard gas mixtures, Metrologia 43, p. L7 (2006). 9. JCGM100:2008, Evaluation of measurement data – Guide to the expression of uncertainty in measurement (GUM 1995 with minor corrections), in JCGM - Joint Committee for Guides in Metrology, (ISO (the International Organization for Standardization), BIPM, IEC, IFCC, ILAC, IUPAC, IUPAP and OIML, 2008). 10. V. Witkovsk´ y, Numerical inversion of a characteristic function: An alternative tool to form the probability distribution of output quantity in linear measurement models, Acta IMEKO 5, 32 (2016). 11. V. Witkovsk´ y, CharFunTool: The characteristic functions toolbox for MATLAB (2017), https://github.com/witkovsky/CharFunTool. ˇ sov´a, S. Duriˇ ˇ s, R. Palenˇc´ar and 12. V. Witkovsk´ y, V., G. Wimmer, Z. Duriˇ J. Palenˇc´ar, Modeling and evaluating the distribution of the output quantity in measurement models with copula dependent input quantities, in International Conference on Advanced Mathematical and Computational Tools in Metrology and Testing XI, (University of Strathclyde, Glasgow, Scotland, 2017).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 178–185)

Models and algorithms for multi-fidelity data Alistair B. Forbes National Physical Laboratory, Teddington, Middlesex, UK E-mail: [email protected] One of the issues associated with analysing data coming from multiple sources is that the data streams can have markedly different spatial, temporal and accuracy characteristics. For example, in air quality monitoring we may wish to combine data from a reference sensor that provides relatively accurate hourly averages with that from a low cost sensor that provides relatively inaccurate averages over finer temporal resolutions. In this paper, we discuss algorithms for analysing multi-fidelity data sets that use the high accuracy data to allow a characterisation of the low accuracy measurement systems to be made. We illustrate the approaches on data simulating air quality measurements in colocation studies in which a reference sensor is used to calibrate a number of other sensors. In particular, we discuss approaches that can be applied in cases where sensors are outputting averages over different time intervals. Keywords: air quality, calibration, data analysis, multi-fidelity.

1. Introduction Many organisations are looking to exploit the availability of data from potentially a large number of sensors in order to gain a better understanding of their systems, processes and environments. One of the issues associated with analysing data coming from multiple sources 5 is that the data streams can have markedly different spatial, temporal and accuracy characteristics. For example, in air quality monitoring 9 we may wish to combine data from a reference sensor that provides relatively accurate hourly averages with that from a low cost sensor that provides relatively inaccurate averages over ten minute time intervals (or indeed, much finer temporal resolutions). A second example comes from coordinate metrology in which we want to combine, say, one hundred accurate contact-probe measurements with, say, five thousand data points gathered by a less accurate vision system. 178

179

In analysing such data, the goal is to exploit the accuracy of the low density data while retaining the coverage of the higher resolution data. In order to achieve this synthesis, it is necessary to be able to assess the accuracy characteristics of different data sources. While the high accuracy reference systems may be well characterised, it is likely that the behaviour of the low cost systems is less well understood. In this paper, we discuss algorithms for analysing multi-fidelity data sets that use the high accuracy data to allow a characterisation of the low accuracy measurement systems to be made. In section 2, we describe a general approach to analysing data coming from multiple data streams. In section 3, we illustrate the approaches on data simulating air quality measurements in co-location studies in which a reference sensor is used to calibrate a number of other sensors in situ. In section 4, we extend these approaches to cases in which sensors are outputting measured time averages over different time intervals. Our concluding remarks are given in section 6. 2. General model We assume that data for the kth data stream, k = 1, . . . , K, is generated according to the model y k = f k (a, bk ) + ǫk ,

ǫk ∈ N(0, Vk (σ k ))

(1)

where y k is the vector of data recorded by the kth instrument, a are the parameters of the model of the system of primary interest, bk are parameters describing the response of the kth instrument, such as calibration parameters and systematic effects, ǫk represent random effects associated with the kth instrument drawn from a multivariate Gaussian distribution with variance matrix Vk (σ k ) depending on further hyper-parameters σ k . We expect to have prior information p(bk ) and p(σ k ) about bk and σ k from a previous characterisation of the instruments. We may also have prior information p(a) about a. Letting y represent the concatenated vector of data, etc., the model in (1) determines the likelihood p(y|a, b, σ). The knowledge about a and other parameters derived from the data and the prior information is encoded in the posterior distribution p(a, b, σ|y) ∝ p(y|a, b, σ)p(a)p(b)p(σ). In general, the posterior distribution will not be a standard distribution and computational approaches have to be employed to determine

180

summary information such as means and variances associated with the model parameters. One approach is to determine the parameter estimates that maximise the posterior distribution, equivalently that minimise E(a, b, σ) = − log p(a, b, σ|y). The parameter estimates approximate the mean of the posterior distribution while the inverse of the Hessian matrix of second order partial derivatives of E with respect parameters evaluated at the parameter estimates is an approximation to the variance matrix associated with the posterior distribution. Thus, the parameter estimates and the Hessian matrix supply information sufficient to provide a multivariate Gaussian approximation to the posterior distribution. If the functions f k are linear in a and bk , the posterior distribution can be marginalised analytically 10 to determine the posterior distribution ˆ that minimises p(σ|y). Optimisation can be applied to determine the σ − log p(σ|y) and, taking these estimates as representative values for σ, inferences about a and b can made using these representative values. For linear f and Gaussian likelihoods, these inferences can often be made using standard linear least squares methods. Even if the f k are mildly nonlinear, given representative values of a and bk , the functions f k can be linearised 3 and an approximate posterior distribution p(σ|y) derived, from which representative values for σ can be determined. For the more general case, sampling methods such as Markov chain Monte Carlo (MCMC) can be used to sample from the posterior distribution in order to make inferences about the parameters. 7,8 The following two sections discuss an application in air quality measurement that is an example of the general approach summarised above. 3. Calibration of sensors in a co-location experiment In air quality measurement, co-location experiments are used to characterise the behaviour of a number of test sensors against a reference sensor. All the sensors are physically located close together so that it can be assumed that all sensors are measuring the same signal. The primary goal is to characterise and/or calibrate the test sensors against the reference sensor. At time tj , the kth sensor measures the quantity zj to produce an observation modelled as yi = b1,k + b2,k zj + ǫi ,

ǫi ∈ N(0, σi2 ),

(2)

where nominally b1,k = 0 and b2,k = 1. The parameters bk = (b1,k , b2,k )⊤ model a linear response of the kth sensor to the stimulus variable z. More general responses can also be implemented.


From the data y = (y_1, . . . , y_m)^⊤, the signal, represented by parameters z = (z_1, . . . , z_n)^⊤, and the 2K-vector of parameters b are to be estimated. We assume that at least one of the sensors (the reference sensor) has been calibrated, so that there is prior calibration information about b that can be represented by Bb ∼ N(b_0, V_b). Associated with the data vector y are indices j = j(i), k = k(i), specifying that y_i is a measurement of the signal at the jth time step by the kth instrument. Assuming that the statistical parameters σ_i and V_b are known, estimates of the model parameters are found by solving a nonlinear least squares problem. The variance matrix associated with the fitted parameters can be estimated from the Jacobian matrix of partial derivatives of the summand functions with respect to the model parameters. 2 Given estimates b̂ of b, calibrated outputs ŷ for the sensors are given by

$$\hat{y}_i = (y_i - \hat{b}_{1,k})/\hat{b}_{2,k}, \qquad k = k(i). \qquad (3)$$
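A minimal sketch of how such an estimate can be computed (illustrative Python with hypothetical names, not code from the paper): the co-location model (2) is fitted by nonlinear least squares, with the prior on b (here taken directly on b rather than through a matrix B) appended as whitened residuals, and the variance matrix of the fitted parameters is approximated from the Jacobian.

```python
# Illustrative sketch: nonlinear least-squares estimation of the signal z and
# calibration parameters b for the co-location model (2), with the prior
# information on b appended as additional residuals.
import numpy as np
from scipy.optimize import least_squares

def fit_colocation(y, j_idx, k_idx, sigma, n_times, n_sensors, b0, Vb):
    """y[i] is measured by sensor k_idx[i] at time step j_idx[i] (model (2))."""
    Lw = np.linalg.cholesky(np.linalg.inv(Vb))       # whitening for the prior on b

    def residuals(p):
        z = p[:n_times]                               # signal values z_1..z_n
        b = p[n_times:].reshape(n_sensors, 2)         # rows (b_1k, b_2k)
        r_data = (y - (b[k_idx, 0] + b[k_idx, 1] * z[j_idx])) / sigma
        r_prior = Lw.T @ (b.ravel() - b0)             # prior b ~ N(b0, Vb)
        return np.concatenate([r_data, r_prior])

    p0 = np.concatenate([np.zeros(n_times), np.tile([0.0, 1.0], n_sensors)])
    sol = least_squares(residuals, p0)
    J = sol.jac                                       # Jacobian at the solution
    V = np.linalg.inv(J.T @ J)                        # approximate variance matrix
    return sol.x, V
```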

If the reference sensor measures at all times, then all parameters of the model can be determined. However, if the reference sensor misses some time steps, the system may still be solvable. If the reference sensor and a second sensor measure at least two distinct z's in common, then the calibration parameters associated with the second sensor can be estimated, along with all the quantities z_j measured by the second sensor. A boot-strapping scheme can cycle through the data and, at each cycle, increase, if possible, the number of sensors that can be calibrated and the number of z_j's that can be estimated. The scheme terminates when no new information is gathered during a cycle, with some or all of the parameters identified.

4. Time-averaged data

The model in (2) assumes that each sensor is responding to the same signal z_j. In practice, many sensors produce time-averaged data, and different sensor designs will often average over different time periods. For example, a reference sensor might provide hourly averages while a low cost sensor might produce 10 minute averages. In the discussion below we assume that all time intervals associated with the sensors are constructed from a set of equal-length time intervals centred at t_q, q = 1, . . . , M. Approaches for arbitrary time-averaging schemes are considered by Forbes, 6 with the emphasis on reconstructing the signal from the time averages on the assumption that the signal has some underlying temporal correlation. 1,4 Let x_q be the average of a signal over the time interval centred at t_q.


We extend the model in (2) to

$$y_i = b_{1,k} + b_{2,k}\, z_j(x) + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma_i^2), \qquad k = k(i), \qquad (4)$$

where now

$$z_j(x) = d_j^{\top} x, \qquad j = j(i), \qquad (5)$$
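The averaging vectors d_j in (5) can be collected as columns of a design matrix D; the sketch below (illustrative Python with an assumed two-minute base grid matching the example of section 5, not code from the paper) shows one way to build D for heterogeneous averaging intervals.

```python
# Illustrative sketch: building the averaging vectors d_j of (5) as columns of
# design matrices, for sensors averaging a finely sampled signal over windows
# of different lengths.
import numpy as np

def averaging_matrix(n_fine, window_lengths):
    """n_fine: number of base samples; window_lengths: samples per averaging
    interval for each sensor type.  Returns one D matrix per window length,
    with one column d_j per averaging interval."""
    D = {}
    for m in window_lengths:
        n_intervals = n_fine // m
        Dm = np.zeros((n_fine, n_intervals))
        for j in range(n_intervals):
            Dm[j * m:(j + 1) * m, j] = 1.0 / m      # (0,...,0,1,...,1,0,...,0)/m_j
        D[m] = Dm
    return D

# Example: 12 hours sampled every 2 minutes (360 samples); 10-, 20-, 30- and
# 60-minute averages correspond to windows of 5, 10, 15 and 30 samples.
D = averaging_matrix(360, [5, 10, 15, 30])
```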

represents the time average of a segment of the signal x specified by d_j, a vector typically of the form (0, . . . , 0, 1, . . . , 1, 0, . . . , 0)^⊤/m_j. These vectors can be stored as columns in a design matrix D, so that quite heterogeneous time-averaging schemes can be incorporated by forming D appropriately. Associated with the measurements y_i are indices j = j(i) and k = k(i) specifying that y_i is a measurement of the signal averaged over the jth time interval by the kth instrument. As described, the co-location experiment is a special case of the analysis of time-averaged data in which the matrix D is simply the identity matrix. The goal of the analysis is firstly to estimate the signal x at the finest time resolution based on the measurements y gathered according to (4) and the prior information 6 and, secondly, to update the calibration parameters b for the sensors based on the complete set of data. Assuming that all the statistical parameters are treated as known, estimates of the parameters x and b can be found by solving a nonlinear least squares problem involving the observations y and the prior information on b. The problem is only mildly nonlinear, through the interaction between the parameters b_{2,k} and x.

5. Numerical example

In this section, we illustrate the performance of the algorithm on time-averaged data, as described in section 4, simulating measurements associated with air quality monitoring. The simulation involves one calibrated reference sensor, k = 1, and five test sensors, k = 2, 3, . . . , 6, gathering data according to the model (4). For the reference sensor, σ_i = σ_R = 0.02, while for the test sensors, σ_i = σ_T = 0.08. The prior calibration information associated with the reference sensor is summarised as b_{1,1} ∼ N(0.00, 0.05²) and b_{2,1} ∼ N(1.00, 0.05²), while that for the test sensors is summarised as b_{1,k} ∼ N(0.00, 0.20²) and b_{2,k} ∼ N(1.00, 0.20²). The reference sensor gathers hourly averages, sensors 2 and 3, 10 minute averages, sensors 4 and 5, 20 minute averages, and sensor 6, 30 minute averages. Each sensor records estimates of z_j = d_j^⊤ x, where x is a signal sampled every two minutes over a 12 hour period. The data y_i represent the simulated measurements of


z_j from the six sensors, giving a total of up to 252 measurements. In the simulation, approximately 20 % of the measurements have been removed at random. Figure 1 plots the signal x_q and the data y_i simulated according to the model in (4). Figure 2 graphs the true five minute averages z_j calculated from the signal x and the uncertainty band ẑ_j ± 2u(z_j) centred on the estimates ẑ_j calculated from all the data. It is seen that the uncertainty band derived for the estimates contains almost all the true averages.

[Figure 1 near here: signal (Signal/au) against time (Time/hour, 0–12).]

Fig. 1. Analysis of simulated collocation data involving one calibrated reference instrument and five test instruments to be calibrated. The figure graphs the signal x_q, solid line, and the simulated data y_i, crosses, from all six sensors.

Table 1 gives the parameters b_{1,k} and b_{2,k} used to generate the data y_i, the estimates b̂_{1,k} and b̂_{2,k} of these parameters derived from the data, and their associated standard uncertainties u(b_{1,k}) and u(b_{2,k}), respectively, for all six sensors, k = 1, . . . , 6. Points to note are i) the uncertainties associated with the parameters relating to the test instruments, k = 2, . . . , 6, are similar to those associated with the reference instrument, k = 1, indicating a successful transfer of the calibration associated with the reference instrument to the test instruments, and ii) there is a dependence of the uncertainties on the value of the slope parameters b_{2,k}; the relative uncertainties u(b_{2,k})/b̂_{2,k} are approximately constant.

[Figure 2 near here: signal (Signal/au) against time (Time/hour, 0–12).]

Fig. 2. Analysis of simulated collocation data involving one calibrated reference instrument and five test instruments to be calibrated. The figure graphs the 'true' 5 minute averages z_j calculated from the signal x, solid line, and the uncertainty band ẑ_j ± 2u(z_j), dashed lines, centred on the estimates ẑ_j calculated from all the data.

Table 1. Estimates of the calibration parameters b.

  k    b_{1,k}   b̂_{1,k}   u(b_{1,k})   b_{2,k}   b̂_{2,k}   u(b_{2,k})
  1     0.010     0.015      0.045       1.050     1.017      0.045
  2     0.030     0.024      0.054       1.180     1.151      0.055
  3    -0.300    -0.308      0.035       0.720     0.699      0.036
  4     0.000     0.024      0.049       1.020     1.018      0.051
  5    -0.010     0.003      0.045       0.930     0.912      0.045
  6    -0.090    -0.110      0.042       0.830     0.763      0.043

6. Concluding remarks

In this paper, we have looked at how to combine information from different sensors with different characteristics in terms of accuracy and frequency of measurements. In particular, we have considered how an ensemble of


sensors can be calibrated in situ on the basis of a calibrated reference sensor measuring the same signal. We have illustrated the cross-calibration of sensors on time-averaged data in which different sensors are averaging over different time intervals. The algorithms can cope with missing data. In particular, it is not necessary to have a complete set of measurements for the reference sensor. The approach can be used, for example, to analyse data in co-location experiments in air quality monitoring. Acknowledgements This work was funded by the UK’s National Measurement System Programme for Data Science. I thank colleagues Nick Martin and Peter Harris for discussions on aspects of calibration of a network of sensors and Kavya Jagan for comments on an earlier draft of this paper. References 1. N. Cressie and C. K. Wikle. Statistics for Spatio-Temporal Data. Wiley, Hoboken, New Jersey, 2011. 2. A. B. Forbes. Nonlinear least squares and Bayesian inference. In F. Pavese, M. B¨ ar, A. B. Forbes, J.-M. Linares, C. Perruchet, and N.-F. Zhang, editors, Advanced Mathematical and Computational Tools in Metrology VIII, pages 103–111, Singapore, 2009. World Scientific. 3. A. B. Forbes. Weighting observations from multi-sensor coordinate measuring systems. Measurement Science and Technology, 23(online:025004), 2012. 4. A. B. Forbes. Empirical functions with pre-assigned correlation behaviour. In F. Pavese, W. Bremser, A. Chunovkina, N. Fischer, and A. B. Forbes, editors, Advanced Mathematical and Computational Tools for Metrology X, pages 17–28, Singapore, 2015. World Scientific. 5. A. B. Forbes. Traceable measurements using sensor networks. Transactions on Machine Learning and Data Mining, 8(2):77–100, October 2015. 6. A. B. Forbes. Reconstruction of a signal from time-averaged data. In 18th International Congress of Metrology, 18–21 September 2017, Paris. EDP Sciences, 2017. 7. A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin. Bayesian Data Analysis. Chapman & Hall/CRC, Boca Raton, Fl., third edition, 2014. 8. K. Jagan and A. B. Forbes. NLLSMH: MCMC software for nonlinear leastsquares regression. In A. B. Forbes et al., editors, Advanced Mathematical and Computational Tools in Metrology XI, pages 211–219, Singapore, 2018. World Scientific. 9. L. Mittal and G. Fuller. London air quality network: Summary report 2016. Technical report, King’s College, London, June 2017. 10. C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, Cambridge, Mass., 2006.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 186–191)

Uncertainty calculation in the calibration of an infusion pump using the comparison method

A. Furtado¹, E. Batista¹†, M. C. Ferreira¹, I. Godinho¹ and Peter Lucas²

¹ IPQ, Portuguese Institute for Quality, Caparica, 2828-513, Portugal
† E-mail: [email protected], www.ipq.pt

² VSL, Dutch Metrology Institute, Delft, 2629 JA, The Netherlands, www.vsl.nl

Nowadays, several types of infusion pumps are commonly used for drug delivery, such as syringe pumps and peristaltic pumps. These instruments present different measuring features and capacities according to their use and therapeutic application. In order to ensure the metrological traceability of these flow and volume measuring instruments it is necessary to use suitable calibration methods and standards. Two different calibration methods can be used to determine the flow error of infusion pumps. One is the gravimetric method, considered as a primary method and commonly used by National Metrology Institutes. The other is a comparison method where an Infusion Device Analyser (IDA) is used as flow generator, and is typically used in hospital maintenance offices. The uncertainty calculation for the gravimetric method is very well described in the literature but for the comparison method no information regarding the uncertainty evaluation and components is available. This paper will describe in detail the measurement model along with the standard uncertainties components, the sensitivity coefficients values, the combined standard uncertainty and the expanded uncertainty of the comparison calibration method using an IDA, considering GUM methodology. This work has been developed in the framework of the EURAMET EMRP MeDD (HLT07) and EMPIR InfusionUptake (15SIP03) projects. Keywords: medical infusion instruments, flow, calibration, uncertainty.

1. Introduction

Medical infusion instruments are widely used, as they are fundamental for primary health care, namely for providing drugs, nutrition and hydration to patients. Hence, it is crucial that the volume and flow generated by these devices be accurate and precise. To ensure this, it is necessary to have appropriate calibration methods.


The "Metrology for Drug Delivery" (MeDD - HLT07) project [1], funded by the European Metrology Research Programme (EMRP), had as its main concern the improvement of such calibration methods. Aiming to disseminate the knowledge obtained from the MeDD project, a follow-on project, "Standards and e-learning course to maximize the uptake of infusion and calibration best practices" (InfusionUptake - 15SIP03), funded by the European Metrology Programme for Innovation and Research (EMPIR), started in May 2016. The 15SIP03 project has two main goals:

• to develop an online module available on the E-learning platform of the European Society for Intensive Care Medicine (ESICM), with the aim of creating awareness and understanding of multi-infusion risks and thereby reducing dosing errors, thus decreasing adverse patient incidents and increasing the quality of medical treatment;
• to incorporate the best metrology practices relating to the calibration of infusion devices in ISO standards, namely ISO 7886-2 [2] and IEC 60601-2-24 [3].

In order to develop the E-learning modules and to identify the relevant information to incorporate in the standards, it is necessary to identify, study and validate the different calibration methods.

2. Calibration methods

2.1. Gravimetric method

The gravimetric method is considered a primary method and is commonly used by National Metrology Institutes [3, 4] to calibrate syringe pumps. The detailed calibration procedure and uncertainty budget are described in previous publications [4-6].

2.2. Comparison method with an Infusion Device Analyzer

The other calibration method, used to determine the flow rate of an infusion pump, mainly by hospital maintenance offices, is a comparison method, and is thereby considered a secondary calibration method. This method involves comparing the flow generated by the infusion pump under calibration with the flow measured by an Infusion Device Analyser (IDA) (Figure 1).


Figure 1. IDA comparison method setup.

During the calibration, the temperature of the water, the air temperature, the relative humidity and the atmospheric pressure are continually measured. The syringe to be used in the syringe pump is filled with ultrapure water [7]. Before attaching the Teflon tube and mounting it on the pump, air bubbles are removed by inverting the syringe so that the nozzle lumen is uppermost and depressing the plunger. The line is then filled by running the syringe pump at a high rate until a steady flow of drops comes out at the end of the tube. Finally, the tube is connected to the IDA. The target flow is then programmed in the syringe pump. Data acquisition by the IDA begins after 10 minutes of steady flow and lasts 15 minutes. Data can then be recorded directly by software or read directly as the instrument's average reading; in the latter case at least three repetitions are performed.

3. Uncertainty calculation

The measurement uncertainty of the calibration method using an IDA is evaluated following the Guide to the expression of Uncertainty in Measurement (GUM) [8]. The measurement model is presented along with the standard uncertainty components, the sensitivity coefficient values, the combined standard uncertainty and the expanded uncertainty.

3.1. Measurement model

The measurement error of the infusion pump is determined by comparing the target flow rate with the flow value obtained by the IDA (QIDA). The flow rate of the infusion pump QIP at 20 °C is given by Eq. (1).

QIP = …    (1)
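Equation (1) is not legible in the source; the following form is an assumed reconstruction, consistent with the uncertainty components of Table 1 and with the order of magnitude of the sensitivity coefficients in Table 2, and should not be read as the authors' published expression:

$$Q_{IP} = Q_{IDA}\,\bigl[1 - \alpha\,(T - 20\ ^{\circ}\mathrm{C})\bigr] + \delta_{loss}$$

Under this assumption, ∂Q_IP/∂Q_IDA ≈ 1, ∂Q_IP/∂T = −Q_IDA·α and ∂Q_IP/∂α = −Q_IDA(T − 20 °C), which matches the pattern of the sensitivity coefficients c_i listed in Table 2.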


3.2. Uncertainty evaluation

The main contributions to the standard uncertainty of the comparison method for the calibration of an infusion pump's flow (QIP) are: the uncertainty of the IDA (u(QIDA)), the resolution of the instrument to be calibrated (u(QIP)res), the uncertainty of the measurement of the water temperature (u(T)), the uncertainty of the temperature expansion coefficient of the syringe used in the test (u(α)) and, finally, the uncertainty due to water loss (u(δloss)). Detailed information regarding the uncertainty components is given in Table 1. As can also be seen in Table 1, several uncertainty components are themselves determined by combining several sources of uncertainty.

Table 1. Uncertainty components in the calibration of an infusion pump by the comparison method using an IDA.

 Standard uncertainty component | Source / Symbol                                | Evaluation process                        | Evaluation type | Distribution
 u(QIDA)cal                     | IDA flow / QIDA                                | Calibration                               | B               | Rectangular
 u(QIDA)res                     | IDA flow / QIDA                                | Resolution                                | B               | Rectangular
 u(QIDA)rep                     | IDA flow / QIDA                                | Flow measurements standard-deviation      | A               | Normal
 u(QIP)res                      | IP flow / QIP                                  | Resolution                                | B               | Rectangular
 u(T)cal                        | Water temperature / T                          | Calibration                               | B               | Rectangular
 u(T)res                        | Water temperature / T                          | Resolution                                | B               | Rectangular
 u(T)rep                        | Water temperature / T                          | Temperature measurements standard-deviation | A             | Normal
 u(α)                           | Syringe temperature expansion coefficient / α  | Literature                                | B               | Rectangular
 u(δloss)                       | Water loss / δloss                             | Estimation                                | B               | Rectangular

The combined uncertainty u(QIP) for the comparison method is given by Eq. (2):

$$u(Q_{IP}) = \Bigl[\,c_{Q_{IDA}}^2\,u^2(Q_{IDA}) + c_{res}^2\,u^2(Q_{IP})_{res} + c_{T}^2\,u^2(T) + c_{\alpha}^2\,u^2(\alpha) + c_{\delta}^2\,u^2(\delta_{loss})\Bigr]^{1/2} \qquad (2)$$

From the determined value of the coverage factor k and the combined standard uncertainty of the measurand u(QIP), the expanded uncertainty U(QIP) is deduced by Eq. (3):

$$U(Q_{IP}) = k\,u(Q_{IP}) \qquad (3)$$
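As a worked check of Eqs. (2) and (3) with the component values of Table 2, the following short sketch (illustrative Python, not software from the project) reproduces the combined and expanded uncertainty quoted in section 4.

```python
# Combined and expanded uncertainty for the 2 mL/h example of Table 2,
# following Eqs. (2) and (3); values are taken from the table.
import math

components = {                     # (standard uncertainty u(x_i), sensitivity c_i)
    "Q_IDA":         (0.03,     1.0),
    "Q_IP_res":      (5.77e-3,  1.0),
    "water_T":       (0.025,    4.66e-4),
    "alpha_syringe": (1.20e-5, -5.63),
    "delta_loss":    (1.15e-3,  1.0),
}

u_c = math.sqrt(sum((c * u) ** 2 for u, c in components.values()))
k = 2.0
U = k * u_c
print(f"u(Q_IP) = {u_c:.3f} mL/h, U = {U:.2f} mL/h")   # ~0.031 and ~0.06 mL/h
```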


4. Results

As an example, we consider the results of an infusion syringe pump calibration by the comparison method, for a nominal flow rate of 2 mL·h−1. The average measurement result was 1.96 mL·h−1 with an expanded uncertainty of 0.06 mL·h−1. Table 2 indicates the estimated values of the quantities involved, with the associated standard uncertainties, and the expanded uncertainty.

Table 2. Example of the uncertainty budget of an infusion syringe pump calibration by comparison method with an IDA for a nominal flow rate of 2 mL·h−1.

 Uncertainty component / Unit            | Estimate  | u(xi)      | ci        | (ci·u(xi))²
 QIDA / mL·h−1                           | 1.96      | 0.03       | 1         | 9.0×10−4
 QIP resolution / mL·h−1                 | 0.01      | 5.77×10−3  | 1         | 3.33×10−5
 Water T / °C                            | 23.7      | 0.025      | 4.66×10−4 | 1.35×10−10
 Syringe expansion coefficient / °C−1    | 2.4×10−4  | 1.20×10−5  | −5.63     | 4.56×10−9
 δloss / mL·h−1                          | 0.002     | 1.15×10−3  | 1         | 1.33×10−6
 k                                       | 2         |            |           |
 u(QIP) / mL·h−1                         | 0.03      |            |           |
 U(QIP) / mL·h−1                         | 0.06      |            |           |
 U(QIP) / %                              | 3         |            |           |

5. Conclusions

This communication described the uncertainty evaluation for the calibration of an infusion pump using the comparison method, following the GUM methodology. Several influence factors make a major contribution to the uncertainty budget owing to the small amount of liquid used, namely the evaporation of the fluid and the calibration of the IDA. Considering that the maximum permissible error stated by the manufacturer of the pump is 2 % [9], one can conclude that this calibration method is not recommended for flows equal to or lower than 2 mL·h−1. In this flow range it is recommended to use the gravimetric method for the calibration of syringe pumps.

Acknowledgments

The EMPIR project "15SIP03 - Infusion Uptake" is carried out with funding from the European Union under the EMPIR. The EMPIR is jointly funded by the EMPIR participating countries within EURAMET and the European Union.

References

1. P. Lucas and S. Klein, Metrology for drug delivery, Biomedical Engineering, 60, 4 (2015).


2. ISO 7886-2:1996, Sterile hypodermic syringes for single use, Part 2: Syringes for use with power-driven syringe pumps.
3. IEC 60601-2-24:2012, Medical electrical equipment - Part 2-24: Particular requirements for the basic safety and essential performance of infusion pumps and controllers.
4. H. Bissig et al., Primary standards for measuring flow rates from 100 nl/min to 1 ml/min - gravimetric principle, Biomedical Engineering, 60, 4 (2015).
5. E. Batista et al., Assessment of drug delivery devices, Biomedical Engineering, 60, 4 (2015).
6. E. Batista, N. Almeida, I. Godinho and E. Filipe, Uncertainty calculation in gravimetric microflow measurements, AMCTM X (2015).
7. ISO 3696:1987, Water for analytical laboratory use - Specification and test methods.
8. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML (2008), Evaluation of Measurement Data - Guide to the Expression of Uncertainty in Measurement, GUM 1995 with minor corrections, Joint Committee for Guides in Metrology, JCGM 100.
9. BBraun Perfusor space manufacturer specifications, at www.BBraun.com

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 192–202)

Determination of measurement uncertainty by Monte Carlo simulation

Daniel Heißelmann∗, Matthias Franke, Kerstin Rost and Klaus Wendt
Coordinate Metrology Department, Physikalisch-Technische Bundesanstalt (PTB), Bundesallee 100, Braunschweig, 38116, Germany
∗ E-mail: [email protected]

Thomas Kistner
Carl Zeiss Industrielle Messtechnik GmbH, Carl-Zeiss-Straße 22, 73446 Oberkochen, Germany

Carsten Schwehn
Hexagon Metrology GmbH, Siegmund-Hiepe-Straße 2-12, 35578 Wetzlar, Germany

Modern coordinate measuring machines (CMM) are universal tools to measure geometric features of complex three-dimensional workpieces. To use them as reliable means of quality control, the suitability of the device for the specific measurement task has to be proven. Therefore, the ISO 14253 standard requires knowledge of the measurement uncertainty and that it be in reasonable relation to the specified tolerances. Hence, the determination of the measurement uncertainty, which is a complex and also costly task, is of utmost importance. The measurement uncertainty is usually influenced by several contributions from various sources. Among those of the machine itself, guideway errors and the influence of the probe and styli, for example, play an important role. Furthermore, several properties of the workpiece, such as its form deviations and surface roughness, have to be considered. The environmental conditions, i.e. temperature and its gradients, pressure, relative humidity and others, also contribute to the overall measurement uncertainty. Currently, there are different approaches to determine task-specific measurement uncertainties. This work reports on recent advancements extending the well-established method of PTB's Virtual Coordinate Measuring Machine (VCMM) to suit present-day needs in industrial applications. The VCMM utilizes numerical simulations to determine the task-specific measurement uncertainty, incorporating broad knowledge about the contributions of, e.g., the CMM used, the environment and the workpiece.

Keywords: Measurement uncertainty; Monte Carlo simulation; virtual coordinate measuring machine.



1. Introduction

Knowledge about the measurement uncertainty is a basic requirement for quality-driven and economic production processes and concerns the entire production process. However, in coordinate metrology the determination of measurement uncertainties is a complex process that has to be performed task-specifically. In particular, size, shape, form deviations, and accessibility of the feature significantly influence the achievable measurement uncertainty. Thus, the large spectrum of parts, with many variants and narrow tolerances at the same time, requires methods to determine the measurement uncertainty in a simple and efficient, yet universal manner. Generally, there are three different approaches for the calculation of the measurement uncertainty: analytical budgets, experimental determination, and numerical simulations. All three require suitable mathematical models as well as the description and quantification of uncertainty contributions. The uncertainty determination by an analytical uncertainty budget, based on detailed knowledge of all individual contributions, is described by the GUM 1,2. The experimental approach according to ISO 15530-3 3 demands a calibrated workpiece and multiple measurements. ISO/TS 15530-4 4 describes general requirements for the application of simulation methods, like the one presented in this work.

2. Measurement uncertainty determination by numerical simulation

Numerical simulations are an efficient and versatile technique to universally determine the measurement uncertainty in coordinate measurements. The basic elements of the realization of this method are described by GUM Supplement 1 5. The "Virtual Coordinate Measuring Machine" VCMM is a software tool that implements the simulation method. It is based on an approach that considers input parameters for all influences occurring during the measurement procedure, including those that are not machine specific (Fig. 1). All parameters are described by suitable probability distribution functions and are linked to mathematical models evaluating their effect on the measurement of an individual position. The evaluation of the measurement uncertainty is performed in a six-step process, where (1) and (2) are the same as in any measurement:

(1) The CMM records the measurement points of all probed features of the workpiece.


Fig. 1. Principle sketch of the virtual coordinate measuring machine VCMM: The CMM software passes the measured values to the VCMM. The statistical analysis is performed by a separate module.

(2) The CMM's analysis software calculates the features' characteristics (i.e., the measurement values) from the recorded data points.

After these tasks are accomplished, four additional steps are required:

(3) Based on the measured coordinates, the VCMM software calculates a set of new coordinates that have been modified considering systematic and random measurement deviations.
(4) These coordinates are analyzed in the same way as the original coordinates.
(5) Steps (3) and (4) are repeated n times until a stable calculation of the measurement uncertainty has been achieved.
(6) The measurement results and their uncertainties are calculated from the distributions and the mean values of all n repeated simulations.
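The six-step procedure can be sketched in a few lines of code; the following is an illustration of the simulation loop only (hypothetical function and variable names, not the PTB implementation).

```python
# Illustrative sketch of the VCMM simulation loop described above: a feature is
# re-evaluated n times from perturbed coordinates and the spread of the results
# yields the task-specific measurement uncertainty.
import numpy as np

def vcmm_uncertainty(points, evaluate_feature, draw_systematic, draw_random,
                     n_runs=200, rng=np.random.default_rng()):
    """points: (m, 3) measured coordinates; evaluate_feature maps points to a
    characteristic (e.g. a diameter); draw_systematic / draw_random are models
    of the residual systematic and random deviations per run / per point."""
    results = []
    for _ in range(n_runs):
        offset = draw_systematic(rng)              # varied once per simulation run
        noise = draw_random(rng, points.shape)     # varied for every point
        results.append(evaluate_feature(points + offset + noise))
    results = np.asarray(results)
    return results.mean(), results.std(ddof=1)     # estimate and standard uncertainty
```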


The first version of PTB's VCMM was published 15 years ago 6. Since then, the software has been thoroughly revised, modularized and implemented in object-oriented code. The mathematical models have been adapted to the technical advancement of CMMs wherever necessary. Furthermore, advanced algorithms for the creation of random numbers 7 and for the analysis of scanning data, as well as the consideration of undersampling of form deviations of features, have been implemented. Additionally, the VCMM was augmented by a statistics module, checking the stability of the computation of the measurement uncertainty according to predefined criteria. Depending on the specific application, the VCMM offers different ways to choose the input parameters. In order to achieve the highest precision, it is recommended to determine the machine-specific measurement deviations precisely after commissioning the CMM. Alternatively, the machine-specific input parameters may be determined using manufacturer's specifications, such as maximum permissible errors (MPE). This simple parametrization of the VCMM does not consider detailed characteristics of the individual parameters in the same way as a complete determination of residual errors of the individual CMM (Fig. 2). Therefore, the measurement uncertainties derived for a machine characterized by threshold values will in general be larger than those of a CMM that was individually parameterized using calibrated standards.

3. Input parameter models

The VCMM software is capable of calculating the measurement uncertainty for measurements of individual characteristics on CMMs. To this end, it takes into account (i) geometry errors of the guideways, (ii) measurement errors of the probing system for recording individual data points or scanning measurements, (iii) thermally-induced deformations of the CMM structure, (iv) influences of the workpiece, and (v) error contributions of the measurement strategy. The individual error components treated in the VCMM are comprised of up to three different types of contributions:

• A known and systematic contribution that is not varied during the runtime of the VCMM software and imposes a systematic bias on the measurement results.
• A stochastic contribution, which is altered before each VCMM run or on the occasion of certain events. It is varied depending on the specified uncertainties and reflects residual systematic errors.
• Entirely random measurement uncertainty contributions, which are independently varied for each measurement point.

The aforementioned contributions depend on the positions p of the CMM guideways, the stylus tip offset l and also the probing direction n during recording of a measurement point. They can either be described as a constant value that is considered independent of the position, a length-


[Figure 2 near here: three panels (a), (b), (c) of simulated deviations (Deviation / µm) against position (Position / mm, 0–1000).]

Fig. 2. Example of machine-specific input parameters from manufacturer’s specifications. The solid, dashed, and dash-dotted curves represent a set of simulated deviations in position (a), rotation (b), and straightness (c) within the maximum permissible errors (MPE) specified by the CMM manufacturer (black dotted curves).

dependent contribution, linearly increasing with increasing length of travel, a linear combination of several harmonic functions, or look-up table. In general, these options can also be combined. 3.1. Guideway geometry The static guideway geometry errors are treated by the VCMM based on a rigid body model. The errors are determined from the translational and rotational deviations of the guideways, their respective squareness deviation, the uncertainty of the length determination as well as short-periodic


deviations of guideways and scales. The three axes of the CMM are considered as independent and the results are obtained by superposition of the various contributions. The position error e_Gi resulting from static guideway errors can thus be described by

$$e_{Gi} = e_{GTi} + E_{GRi}\cdot p_i + E_{GPi}\cdot l + (1 + s_{GM})\cdot p_i, \qquad (1)$$

where ~eGTi is the translational component, EGRi · p~i resembles the rotational contributions, EGPi · ~l comprises the contributions of the probing element, and the term (1 + sGM ) · p~i adds the uncertainty of the length measurement. Within the VCMM software the uncertainty errors are concatenated using a systematic and a stochastically changing contribution, where the latter is evaluated during the VCMM’s runtime according to the specified uncertainties. 3.2. Thermal deformation of the machine Besides the CMM’s static geometry errors, additional contributions result from thermal effects, such as spatial and temporal temperature gradients or variations of the mean temperature. The information about the strength of these effects is generally obtained from long-term observations of the CMM and its environment. The implementation of the VCMM considers the thermal expansion of the scales and the guideways as well as the thermallyinduced straightness errors, rotational errors, and squareness errors. Besides changes in the mean temperature of the measurement environment, the variation of thermal gradients within the CMM and the measurement room and the uncertainty of the temperature measurement are considered. With respect to the thermal expansion of the scales, the VCMM distinguishes between CMM with and without temperature compensation. For the latter, the deviation from the mean temperature and temperature of the scale are considered to contribute proportional to both the length of travel and the coefficient of thermal expansion. The uncertainty of the determination of the thermal expansion coefficient requires the addition of another term. Following (1), the uncertainty arising from thermally-induced deformation can be described by the model equation ~eTi = ~eTTi + ETRi · p~i + ETPi · ~l . The vector ~eTTi combines the thermally-induced straightness deviations, while the terms ETRi · p~i and ETPi · ~l add the pitch, yaw and squareness errors without and with consideration of the vector of the stylus, respectively.


3.3. Probing system

In addition to the deviations resulting from geometric errors, the probing process itself is also a source of errors. The VCMM implementation is capable of simulating the systematic and stochastic contributions for contacting probing systems in discrete-point or scanning mode. The simulated probing error e_Pj for the measurement point x_i in the kinematic machine coordinate system is described by the model equation

$$e_{Pj}(x_i) = f_{sty_j}\bigl(e_{Base_j} + e_{radial_j} + e_{random_i}\bigr)\cdot n_i + e_{MP_j}\cdot t_j + e_{T_j} + R_j\cdot l,$$

with i being the index of the measurement point and j indicating the simulation cycle, respectively. The model includes the influences of the stylus f_styj, the base characteristics of the probing system e_Basej, the radius deviations of the probing sphere e_radialj, and a random contribution e_randomi in the direction of the surface normal vector n_i of the workpiece. Furthermore, contributions of the multi-probe positioning deviation e_MPj and the translational and rotational influences of the exchange of the probe or the probing system, e_Tj and R_j·l, are considered.

3.4. Workpiece influence

The workpiece causes deviations influencing the measurement uncertainty as it expands and shrinks according to the thermal expansion coefficient of the material. Additionally, all real workpieces exhibit form deviations, such as surface roughness and waviness.

3.4.1. Thermal effects of the workpiece

Most available CMMs are capable of measuring the workpiece's temperature and thus use information about the material's coefficient of thermal expansion α_W to compensate for thermal expansion. However, depending on the quality of the compensation, a residual position deviation e_P with its Cartesian components

$$e_{P,\{x,y,z\}} = p_{\{x,y,z\}}\bigl[\alpha_{W,\{x,y,z\}}\,\Delta T_{W,\{x,y,z\}} + \Delta\alpha_{W,\{x,y,z\}}\bigl(T_{W,\{x,y,z\}} - T_0\bigr)\bigr]$$

remains uncorrected. On the one hand it depends on the deviation of the workpiece temperature TW from the reference temperature of T0 = 20 ◦ C and on temperature fluctuations of the temperature measurement system ∆TW . On the other hand, additional deviations arise from deviations of the value αW from the considered expansion coefficient.


3.4.2. Form deviations and roughness of the workpiece

The VCMM considers the contributions of the surface roughness to the measurement uncertainty solely stochastically. The deviation is varied for each individual measurement point independently, according to a given probability distribution function. The user is guided by predefined measurement and analysis procedures to help determine the required input parameters. However, to obtain suitable parameters representing the surface properties, determination procedures at real surfaces must be performed. In particular, the effect of filtering, depending on the size of the chosen probing sphere, plays an important role and has to be considered appropriately. Therefore, the resulting uncertainty contributions have to be determined individually for each combination of probing sphere and surface characteristic 8. Furthermore, form deviations of the workpiece or of the probed feature contribute to the measurement uncertainty. The strength of their influence depends on the chosen measurement strategy and the geometric specifications of the feature. In general, all features have to be captured entirely in order to obtain a complete picture of the real geometry. However, in practice only a subset of the entire surface can be assessed. Consequently, in this case the measurement uncertainty is underestimated unless the influence of form deviations is considered separately. The VCMM provides the option of simulating the probed features including their oversampling. This is achieved by generating additional measurement points and profiles to simulate the complete sampling of the feature. In addition, the features are varied by adding synthetic form deviations, resulting in different measurement data for the same feature. The difference between the results is taken into account for the determined measurement uncertainty. Modeling of realistic form deviations is very costly and only feasible if the form deviations of the entire workpiece are known. In order to reduce the complexity, the VCMM treats the form deviations in a different manner. In their respective local feature-specific reference coordinate system, the measurement points are distorted using a linear combination of harmonic oscillations, like e.g.

$$x_i = x_i + \sum_{k=1}^{m} A_k \cos\!\Bigl(\varphi_k - \frac{k\,x_i}{\lambda_k}\Bigr)$$

for the x coordinates. The amplitudes Ak could be extracted from tolerances given by the technical drawings, while for the determination of the wavelengths λk detailed knowledge about the production process itself should be taken into account. The VCMM varies these parameters and the


phase ϕk within a wide range of values covering a broad spectrum of possible form deviations. However, it should be noted that the VCMM approach is not meant to estimate the real form deviations, but to provide an additional measurement uncertainty contribution arising from undersampling of surfaces with possible form deviations within a given parameter space. The simulated uncertainty component is decreasing, if the measurement procedure covers the real form deviations more accurately. If the form deviations are completely covered, the analysis using the VCMM simulations does not add any additional uncertainty contribution. In reverse, any undersampling of the feature using only few individual points adds the superposition of the amplitudes to the uncertainty computation. 3.5. Scanning In scanning measurements, the probe is moved along the surface of the workpiece, while constantly touching it, and thus allowing measurement points to be captured continuously. This method enables users to reduce the measurement duration, and is therefore of increasing importance for typical applications in industry. However, the procedure of scanning data collection requires consideration of additional, scanning-specific uncertainty contributions. Forces acting during the scanning process may generally alter the measurement points. The reasons for the additional contributions are different behavior of the probing system and the dynamic behavior of the machine structure compared to measurements of individual points. There are different ways to implement the additional uncertainty contributions 9 . Extensive studies of state-of-the-art CMM using scanning technology have shown that systematic and random measurement deviations are typically of the same order. Therefore, the VCMM treats uncertainty contributions from tactile scanning as additional random probing uncertainties perpendicular to the workpiece’s surface. These stochastic uncertainties can easily be determined empirically for combinations of styli, probing sphere radii, scanning speeds, and probing forces using reference standards. They have to be derived separately using both, a calibrated spherical standard and a real surface of the workpiece to be measured. 3.6. Statistics The calculations within in the first version of PTB’s VCMM were operated using a fixed number of simulation runs, typically of the order of 200.


However, it was observed that the stability of the computed measurement uncertainty was insufficient for some measurement tasks. To allow stability testing for all quantities using a single stability threshold value, a normalized stability criterion based on the ratio Δs(z)/s(z) was introduced into the VCMM. […]


n, a point estimate of σ 2 is given by σ ˆ 2 = F (a)/m − n. In this case, we can associate with a the variance matrix Vˆa = σ ˆ 2 (J ⊤ J)−1 . 2.2. Bayesian inference In a Bayesian setting 3 the knowledge about α and φ derived from y is encapsulated in the posterior distribution p(α, φ|y) where p(α, φ|y) is such that p(α, φ|y) ∝ p(y|α, φ)p(α, φ), where p(α, φ) is the prior distribution for α and φ, encapsulating what is known about α and φ prior to observing the data (or from other sources independent from y). We employ a noninformative prior p(α) ∝ 1 for α but an informative, possibly vague, gamma prior for φ of the form φ ∼ G(m0 /2, m0 σ02 /2). The term σ02 is the prior estimate of 1/φ and m0 ≥ 0 is a measure of the strength of belief in this prior estimate, the larger m0 , the greater the belief. For m0 = 0, the prior for φ reduces to its Jeffrey’s prior p(φ) ∝ 1/φ. The posterior distribution corresponding to priors above is given by    φ (2) p(α, φ|y) ∝ φ(m+m0 )/2−1 exp − m0 σ02 + F (α) . 2 The posterior distribution p(α, φ|y) can be marginalised analytically with respect to φ and the resulting marginalised posterior distribution p(α|y) is such that  −(m+m0 )/2 p(α|y) ∝ m0 σ02 + F (α) . (3) 3. The Metropolis-Hastings algorithm Suppose we wish to sample {aq } from a target distribution p(a). Given a draw aq−1 , a proposed draw a∗ for the next member of the sequence is generated at random from a jumping distribution p0 (a∗ |aq−1 ). Then aq is set to a∗ with acceptance probability Pq = min{1, rq } rq =

p(a∗ )p0 (aq−1 |a∗ ) . p(aq−1 )p0 (a∗ |aq−1 )

If a∗ is not accepted, then aq is set to aq−1 .

(4)
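The accept/reject step in (4) with a symmetric random-walk proposal can be written compactly; the following is an illustrative Python sketch, not the NLSSMH MATLAB implementation (for a symmetric Gaussian jump the proposal terms in r_q cancel).

```python
# Illustrative random-walk Metropolis-Hastings sampler.
# log_post returns the log of the (unnormalised) target density.
import numpy as np

def rw_metropolis(log_post, a0, cov_jump, n_samples, rng=np.random.default_rng()):
    L = np.linalg.cholesky(cov_jump)
    chain = [np.asarray(a0, float)]
    lp = log_post(chain[0])
    accepted = 0
    for _ in range(n_samples):
        a_star = chain[-1] + L @ rng.standard_normal(len(a0))
        lp_star = log_post(a_star)
        if np.log(rng.uniform()) < lp_star - lp:     # accept with prob min{1, r_q}
            chain.append(a_star); lp = lp_star; accepted += 1
        else:
            chain.append(chain[-1].copy())
    return np.array(chain), accepted / n_samples     # samples and acceptance rate
```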


The acceptance scheme in (4) guarantees that there is an M_0 > 0 such that the set {a_q : q ≥ M_0} is a sample from the target distribution. M_0 is known as the burn-in length. Means, standard deviations, coverage regions, etc., can be determined from this sample as for the case of the Monte Carlo Method (MCM) described in GUM Supplements 1 and 2. 4,5 For MCM, the samples generated are independent, so the rate of convergence of the sample mean to the distribution mean is known. For MCMC algorithms, there are the issues that i) M_0 is unknown in advance and ii) the sampling is generally not independent, so that the rate of convergence of sample statistics to the corresponding distribution parameters is not known. For this reason, multiple chains with different starting points are generated and the behaviour of the chains relative to each other is used to judge the degree of convergence. We have implemented a convergence assessment scheme described in Gelman et al. 3 For each parameter, the variance between chains B and the variance within a chain W are computed. A convergence statistic R̂, a function of B and W, is then calculated and represents the potential reduction in the estimate of the standard deviation of the posterior distribution as M → ∞. It is expected that R̂ will approach 1 from above, and a value of R̂ close to 1 indicates convergence. Jumping distributions for a Metropolis-Hastings algorithm are broadly of two types, random walk and independence chains. A random walk scheme often has the form a* = a_{q−1} + e_q, where e_q is a draw from a fixed distribution. For random walk chains, we want the chain to take sufficiently large steps so that it can quickly sample from all the areas of significant probability mass. On the other hand, the steps should not be so large that they almost always represent a step into areas of very low probability mass and are hardly ever accepted. Good practice suggests that the step size should be scaled in order to achieve an acceptance rate of between 15 % and 30 %. If the target distribution exhibits significant correlation between parameters, then the jumping distribution should also have a similar correlation. An independence chain scheme often has the form a* = e_q, where e_q is a draw from a fixed distribution p_0. For an independence chain, we would like the distribution p_0 to be as close as possible to the target distribution so that there is little autocorrelation in the chain, with the consequence that sample statistics converge as quickly as possible to the corresponding distribution parameters. If p_0 has low probability mass in regions where p has significant probability mass, it may take a long time for the chain to converge to a representative sample from the target distribution. For this reason, it is
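A minimal sketch of the between/within-chain computation of R̂ described above (illustrative Python following Gelman et al. 3, not the NLSSMH code):

```python
# Gelman-Rubin convergence statistic R-hat from parallel chains
# (array of shape n_chains x n_samples), for one parameter.
import numpy as np

def r_hat(chains):
    chains = np.asarray(chains, float)
    m, n = chains.shape                       # m chains of length n
    means = chains.mean(axis=1)
    B = n * means.var(ddof=1)                 # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()     # within-chain variance
    var_plus = (n - 1) / n * W + B / n        # pooled variance estimate
    return np.sqrt(var_plus / W)
```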


usually beneficial to have the approximating distribution p0 somewhat more dispersed than the target distribution. 4. Metropolis-Hastings sampling schemes associated with nonlinear least-squares regression We consider two types of MH algorithms, random walk chains and independence chains, to sample from each of the posterior distributions p(α, φ|y) and p(α|y) given by (2) and (3), respectively, giving four algorithms in total. These four algorithms have been implemented in MATLAB, forming a package NLSSMH. All four algorithms are derived by linearising the posterior distributions around the nonlinear least-squares estimates of parameters, a. The two sets of algorithms essentially involve the same form of linearisation to produce a proposal scheme. If the linearisation is a good approximation to the nonlinear function then we expect an independence chain to be the more effective sampling scheme in terms of convergence of the sample statistics to the distribution parameters. However, if the linearisation is not a particularly good approximate, the performance of an independence chain will be degraded. A random walk chain tends to cope better with less good approximations and also with less good starting points. 4.1. Gaussian random walk chain A standard approach for approximating a distribution by a multi-variate Gaussian is to determine a quadratic approximation to the log of the distribution about its model. We apply could this approach to p(α, φ|y) but we work instead with p(α, τ |y) where τ = log φ since p(τ |y) is more likely to be Gaussian-like. We have    φ p(α, τ |y) = |φ|p(α, φ|y) ∝ φ(m+m0 )/2 exp − m0 σ02 + F (α) , (5) 2 with φ = eτ , is likely to be similar to a multivariate Gaussian. We note that the nonlinear least-squares solution a gives the values of the parameters α at the mode. This approach leads to a jumping proposal of the form       ∗  eq eq a aq−1 ∈ N(0, κV ), (6) , + = sq sq tq−1 t∗ where κ is a scaling parameter chosen to achieve a suitable acceptance rate.


4.2. Normal-gamma independence chain If the functions hi (α) are linear in α, then the posterior distribution p(α, φ|y) is a normal-gamma distribution. This suggests approximating p(α, φ|y) by the normal-gamma distribution determined by linearising hi (α) about the least-squares estimate a. This approach leads to the approximating density p0 (α, φ|y) ∝    φ φ(m+m0 )/2−1 exp − m0 σ02 + F (a) + (α − a)⊤ J ⊤ J(α − a) . (7) 2 Using the product-rule, p0 (α, φ|y) ∝ p0 (φ|y)p0 (α|φ, y) which can be evaluated analytically to be φ|y ∼ G(ν/2, ν σ ¯ 2 /2),

α|φ, y ∼ N(a, φ−1 (J ⊤ J)−1 ),

(8)

where ν = m + m0 − n and σ ¯ 2 = (m0 σ02 + F (a))/ν. Thus, sampling from p0 (α, φ|y) can be achieved by sampling from a gamma distribution and then sampling from a Gaussian distribution. 4.3. Multivariate t random walk If the posterior distribution of φ = 1/σ 2 is not of interest, samples from the marginal distribution of α in Equation (3) can be generated. The linearised posterior distribution in Equation (7) can be marginalised analytically with respect to φ to give the approximate marginalised posterior p0 (α|y) for α given by the t-distribution tν (a, σ ¯ 2 (J ⊤ J)−1 ), leading to the proposal scheme a∗ = aq−1 + eq ,

eq ∈ tν (0, κ¯ σ 2 (J ⊤ J)−1 ).

(9)

4.4. Multivariate t independence chain An alternative to a random walk chain, we can regard the multivariate t-distribution p0 (α|y) as an approximation to p(α|y) in an independence chain. 4.5. Automatic scale determination for random walk chains The performance of random walk chains can be sensitive to the choice of the scale parameter κ in (6) and (9). A quick way to evaluate an approximate scale value for a given acceptance probability is to draw MCMC samples of a modest chain length for a range of scale values κ and use the acceptance

217

rates to predict a κ that gives an acceptance rate in the desired range. Experience shows that the logarithm of the acceptance rate is approximately linear in κ so that a suitable value for κ can be found easily. 5. Numerical example: Exponential decay The NLSSMH software has been used to sample from the posterior distributions of the parameters α = (α1 , α2 , α3 )⊤ and φ of the exponential decay model yi = α1 exp{−α2 (xi − x0 )} + α3 + ǫi ,

ǫi ∼ N(0, φ−1 σi2 ),

given the radioactive decay data in Figure 1. 100 chains of length 25, 000 were generated using all four jumping distributions in the NLSSMH package: Gaussian random walk (GRW), normal-gamma independence chain (NGIC), t random walk (TRW) and t independence chain (TIC). Histograms of samples from the posterior distribution are shown in Figure 2. The distribution of the samples are nearly identical for each of the four sampling algorithms which is to be expected as all the algorithms generate samples from the same target distribution. The convergence indices for each parameter have been reported in the second to fifth numerical columns of Table 1, based on a burn-in length of M0 = 5000, and show very satisfactory convergence for all four algorithms. ˆ statistic for assessing convergence of parallel MCMC chains can The R

Fig. 1.

Radioactive decay data.

218

also be used to determine whether the samples generated using each of the four jumping distributions do in fact belong to the same distribution. The ˆ ∗ values are shown in the first numerical column of Table 1. The resulting R ˆ ∗ is very close to 1 for all parameters which provides additional fact that R evidence that the samples generated by the different algorithms represent samples from the same target distribution. Table 1. Convergence factor indicating that algorithms generate samples from the same distribution and convergence of chains to the target distribution for each parameter.

α1 α2 α3 φ

ˆ∗ R

GRW

NGIC

TRW

TIC

1.0001 1.0002 1.0002 1.0001

1.001 1.001 1.001 1.000

1.004 1.001 1.002 1.000

1.001 1.001 1.001 N.A.

1.001 1.000 1.001 N.A.

Fig. 2. Posterior distribution estimated from MCMC samples for model parameters and precision parameters using various jumping distributions for an exponential decay model.

219

6. Concluding remarks We have described four variants of a Metropolis-Hastings algorithm for sampling from the Bayesian posterior distribution associated with nonlinear least-squares regression problems. The algorithms exploit the information available from nonlinear least-squares estimates in order to design chains with good convergence properties. The performance of these algorithms have been demonstrated on radioactive decay data. MATLAB software that implements the Metropolis-Hastings sampling algorithm discussed in this paper have been developed at NPL and will be available to download from the NPL website. Software uses vector calculations so that multiple chains can be generated in parallel and hence a large number of samples can be generated efficiently. The software uses the parallel chains to assess their convergence. Acknowledgement This work has been supported by the UK’s National Measurement System Programme for Data Science. References 1. A. B. Forbes, Parameter estimation based on least squares methods, in Data modeling for metrology and testing in measurement science, eds. F. Pavese and A. B. Forbes (Birkh¨ auser-Boston, New York, 2009). 2. A. B. Forbes, Nonlinear least squares and Bayesian inference, in Advanced Mathematical and Computational Tools in Metrology VIII, eds. F. Pavese, M. B¨ ar, A. B. Forbes, J.-M. Linares, C. Perruchet and N.-F. Zhang (World Scientific, Singapore, 2009). 3. A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari and D. B. Rubin, Bayesian Data Analysis, third edn. (Chapman & Hall/CRC, Boca Raton, Fl., 2014). 4. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, Evaluation of measurement data — Supplement 1 to the “Guide to the expression of uncertainty in measurement” — Propagation of distributions using a Monte Carlo method. JCGM 101:2008. 5. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, Evaluation of measurement data — Supplement 2 to the “Guide to the expression of uncertainty in measurement” — extension to any number of output quantities. JCGM 102:2011. 6. A. Gelman, J. B. Carlin, H. S. Stern and D. B. Rubin, Bayesian Data Analysis, second edn. (Chapman & Hall/CRC, Boca Raton, Fl., 2004).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 220–228)

Reduced error separating method for pitch calibration on gears F. Keller∗ , M. Stein and K. Kniel Department of Coordinate Metrology, Physikalisch-Technische Bundesanstalt (PTB), D-38116 Braunschweig, Germany ∗ E-mail: [email protected] For highly-accurate pitch measurements on gears the so-called three-rosette method is the procedure of choice 6,7,9 . The main drawback of this self-calibrating and error-separating procedure is its high measurement effort proportional to the square of the number of teeth. Therefore, a reduced method is proposed, which is based on the same error model, but requires much less measurements.

1. Introduction For a gear, a single pitch deviation is defined as the difference between the actual and the nominal value of a pitch between two successive left or right flanks. According to standard ISO 1328-1, this value is measured as arc length with respect to a certain reference circle. If the teeth are numbered from 0, 1, . . . , N − 1, the single pitch deviation aj with 0 ≤ j ≤ N − 1 is measured between the teeth j − 1 and j. Here and in the following, indices referring to the number of teeth are always considered modulo N , so a0 is measured between tooth N − 1 and 0. Since the pitches of a gear sum up to 2πR, where R is the radius of the reference circle, the pitch deviations PN −1 aj must sum up to zero, i.e. j=0 aj = 0. Cumulative pitch deviations Ak can be recursively defined by Ak+1 = Ak + ak+1 and some initial value A0 . The usual choice is A0 = a0 , so Pk Ak = However, other choices are possible: j=0 aj and AN −1 = 0. Throughout this paper all cumulative pitch deviations are defined such PN −1 that k=0 Ak = 0. This paper will cover the mathematical aspect of an error separating method for pitch calibrations, which allows to separate systematic errors of the measuring instrument from the pitch deviations of the gear. Such error separation methods are known at least since the end of the 19th century. For a historical review of the subject please consult the paper by 220

221

Probst 8 and the references therein, while for the experimental verification by measurements on coordinate measuring machines we would like to refer to the paper 5 . 2. The error model When cumulative pitch deviations of a gear are measured, the measurement results are sums of pitch deviations Ak +Ck of the gear and of the measuring instrument (e.g. positioning errors of a rotary table). In order to separate these deviations from each other the gear has to be measured in different relative angular positions with respect to the measurement system. Given a gear with N teeth, we choose a number 2 ≤ R ≤ N and the relative positions 0 ≤ q1 < q2 < . . . < qR ≤ N − 1. Then the gear has to be measured in the relative angular positions 2π N qr for r = 1, 2, . . . , R between the gear and the measuring instrument. If R = N we obtain the threerosette method as described e.g. in the papers 3,6,7 , where the reader can also find detailed information about the measurement setup. The reduced method, i.e. for R < N , appears in Debler 2 in the context of interferometric angle graduation measurements. The measured cumulative pitch deviation Wjqr of tooth j in the relative position qr is now given by Wjqr = Aj + Bqr + Cj−qr + K + ejr .

(1)

Here Bqr denotes an additional pitch deviation of the relative angular position between the gear and the measurement system, and K is a leveling value which is necessary since we do not impose any leveling condition on the individual pitch deviation measurements like for example PN −1 j=0 Wjqr = 0 or WN −1,qr = 0. Furthermore, ejr is a random measurement error. The measurement result for the cumulative pitch deviations A = (A0 , A1 , . . . , AN −1 ) is then obtained by minimising N −1 X R X j=0 r=1

e2jr =

N −1 X R X j=0 r=1

(Wjqr − Aj − Bqr − Cj−qr − K)2

(2)

with respect to the parameters (A, B, C, K), subject to the constraints N −1 X j=0

Aj =

R X r=1

Bqr =

N −1 X

Cj = 0.

(3)

j=0

At this point the question arises if the values for A are uniquely determined by the minimising condition given above. To discuss this issue, define for

222

0 ≤ q1 < q2 . . . < qR ≤ N − 1 with 2 ≤ R ≤ N the linear maps and

φ = φq1 ,...,qR : R2N +R+1 −→ RN ×R ∼ = RN R φ′ = φ′ : R2N +R+1 −→ RN ×R × R3 ∼ = RN R+3 q1 ,...,qR

(4) (5)

by φ(A, B, C, K)ir = (Ai + Bqr + Ci−qr + K) for i = 0, 1, . . . , N − 1 and r = 1, 2, . . . , R, and further ! N −1 R N −1 X X X ′ (6) Ci . B qr , Ai , φ (A, B, C, K) = φ(A, B, C, K), i=0

r=1

i=0

(The isomorphism RN ×R ∼ = RN R is obtained by writing all columns of a matrix below each other.) With the previous definition (4), the best-fit values for the parameters A, B, C and K are obtained by minimising the expression kφ(A, B, C, K) − Y k22 =

N −1 X R X i=0 r=1

(φ(A, B, C, K)iqr − Wiqr )2

(7)

subject to the constraints (3). Here φ is considered as a map into RRN , and Y denotes the vector Y = (W0,q1 , . . . , WN −1,q1 , . . . , W0,qR , . . . , WN −1,qR )t . Equivalently, the parameters are obtained by minimising the expression kφ′ (A, B, C, K) − Y ′ k22

(8)

with Y ′ = (Y t , 0, 0, 0)t . One can show that this automatically implies the constraints (3). Proposition 2.1. We have dim(ker(φ′ )) = gcd(q2 − q1 , q3 − q1 , . . . , qR − q1 , N ) − 1.

(9)

For the proof we need the following lemma. Lemma 2.1. Let d1 , . . . , dm ∈ N>0 and let (γk )k∈Z be a sequence with γk = γk+di for all k ∈ Z and i = 1, . . . , m. Then γk = γk+d for all k ∈ Z, where d = gcd(d1 , . . . , dm ). Proof. It follows that γk = γk+β1 d1 +...+βm dm for all k ∈ Z and β1 , . . . , βm ∈ Z, and thus γk = γk+d for all k ∈ Z with d = gcd(d1 , . . . , dm ). Proof of Proposition 2.1. Let (A, B, C, K) ∈ ker(φ′ ), hence Ai + Bqr + Ci−qr + K = 0

(10)

223

∀i = 0, . . . , N − 1 and ∀r = 1, . . . , R together with the constraints (3). Summation of the Equations (10) and using (3) yields N −1 X i=0

(Ai + Bqr + Ci−qr + K) = N · Bqr + N · K = 0

(11)

PR for all r = 1, . . . R and further N · r=1 Bqr + R · N · K = R · N · K = 0, thus K = 0 and hence Bqr = 0 for all r = 1, . . . R. It follows that Ai + Ci−qr = 0 for all i = 0, . . . , N − 1 and r = 1, . . . , R, hence A is already determined by C. Moreover, we get Ci−qs = Ci−qr or Ci = Ci+(qs −qr ) for all i = 0, . . . , N − 1 and r, s = 1, . . . , R. Thus C can be considered as a sequence with periods qr − qs ∀r, s with r > s and also with period N . With Lemma 2.1 it follows that C has also the period d = gcd(q2 − q1 , . . . , qR − q1 , q3 − q2 , . . . , qR − q2 , . . . , qR − qR−1 , N )

= gcd(q2 − q1 , . . . , qR − q1 , N ). (12) PN −1 Since we require that i=0 Ci = 0, there are d − 1 linear independent such d-periodic sequences, and therefore dim(ker φ) = d − 1. Suppose now that gcd(q2 − q1 , . . . , qR − q1 , N ) = 1. The calculation of the parameters that minimise expression (8) is straightforward: Let M ∈ R(N R+3)×(2N +R+1) be the matrix associated to the map φ′ : R2N +R+1 → RN R+3 , then the parameters are given by (A, B, C, K)t = (M t · M )−1 · M t · Y ′ .

(13)

Note that ker φ′ = 0 due to our assumptions on q1 , . . . , qR , and therefore the matrix M has full rank and the inverse (M t · M )−1 exists. On the other hand, if gcd(q2 − q1 , . . . , qR − q1 , N ) > 1, the matrix M is rank-deficient and the solution to our minimisation problem is not unique. In this case an error separation is not possible. Throughout the rest of this paper we will thus always assume that gcd(q2 − q1 , . . . , qR − q1 , N ) = 1. 3. Solution Another way to obtain the solution is by differentiating Equation (2) with respect to the parameters and equate the derivatives to zero. Together with the constraints (3) this easily leads to K = W :=

N −1 R 1 XX Wjqr N R j=0 r=1

and

B qr =

N −1 1 X (Wjqr − W ). N j=0

(14)

224

For A and C we get the following system of linear equations:

R X

R · Aj

+

Aj+qr

+

Cj−qr

=

Ej

for j = 0, 1, . . . , N − 1

(15)

=

Fj

for j = 0, 1, . . . , N − 1

(16)

r=1

r=1

N −1 X

R X

R · Cj

=

Aj

0

(17)

j=0

PR PR where Ej = r=1 Wj+qr ,qr − R · W for r=1 Wjqr − R · W and Fj = j = 0, 1, . . . , N − 1. We can simplify this further to a system only for the parameters A: R

R · Aj −

R

R

1 X 1 XX Aj+qr −qs = Ej − Fj−qs R r=1 s=1 R s=1

(18)

for j = 0, 1, . . . , N − 1, together with the constraint (17). In order to write this as a matrix equation, we make the following two definitions. (See also the work of Debler 2 and Schreiber 9 .) Definition 3.1. Let 0 ≤ q1 < q2 < . . . < qR ≤ N − 1, then define dk = dk (q1 , . . . , qR , N ) = ♯{(r, s) | 1 ≤ r, s ≤ R, qr − qs = k mod N } (19) for k = 0, 1, . . . , N − 1. (Here ♯ denotes the number of elements of a set.) Let further the matrix D = D(q1 , . . . , qR , N ) be defined by Djk = dk−j for j, k = 0, 1, . . . , N − 1. Note that due to the definition of D = D(q1 , . . . , qR , N ) via Djk = dk−j with d ∈ RN the matrix D is circulant. See for example Chao 1 for more details on circulant matrices. The next lemma collects some easy facts about the matrix D. Lemma 3.1. PN −1 (1) d0 = R and k=0 dk = R2 . (2) dk = dN −k for k = 0, 1, . . . , N − 1, i.e. the matrix D is symmetric. (3) The Fourier coefficients dˆ of d are given by 2    X  X R R R X N −1 X 2πi 2π 2π q j r ˆ eN cos jk = (qr − qs )j = dk cos dj = . N N r=1 r=1 s=1 k=0

Hence all dˆj are real, and we have dˆj = dˆN −j for all j = 1, . . . , N − 1.

225

PN −1 (4) dˆ0 = R2 and j=0 dˆj = N d0 = RN .

Definition 3.2. For 0 ≤ q ≤ N − 1 define the N × N matrix Mq by Mq = (δj,k+q mod N )j,k ,

(20)

where δij denotes the Kronecker delta. Let further the N × (RN ) matrix M be given by ! R R 1 X 1 X (21) Mqr −q1 , . . . , I − Mqr −qR , M= I− R r=1 R r=1

with I = (δij ) the N × N identity matrix.

A short calculation shows then that (MY )j = Ej − R1 PN −1 j = 0, 1, . . . , N − 1. Note further that D = k=0 dk Mk .

PR

s=1

Fj−qs for

Theorem 3.1. Let I = (δij ) denote the N × N identity matrix, and let the N × N matrix L be defined by Ljk = 1 for 0 ≤ j, k ≤ N − 1. Recall further that we assume that gcd(q2 − q1 , . . . , qR − q1 , N ) = 1.

(1) The matrix Q = R·I − R1 D+L is invertible. More precisely, the inverse U = Q−1 is given by Uij = uj−i with  2πi N −1 N −1 R X e N jk R X cos 2π 1 1 N jk uk = 2 + = 2+ (22) N N N N R2 − dˆj R2 − dˆj j=1

j=1

for k = 0, 1, . . . , N − 1. (2) The solution A to the linear system given by the Equations (18) and constrain (17) is given by −1  1 A= R·I − D+L · M · Y. (23) R For the proof we need the next lemma. Lemma 3.2. Let 0 < a1 < a2 < . . . < aR < N be integers with gcd(a1 , . . . , aR , N ) = 1. Then kar = 0 mod N for all r = 1, 2, . . . , R implies k = 0 mod N . Proof. Since gcd(a1 , . . . , aR , N ) = 1 there exist β1 , . . . , βR , βR+1 such that 1 = β1 a1 + β2 a2 + . . . + βR aR + βR+1 N. If we consider this equation modulo N and multiply by k, this leads to k = β1 ka1 + β2 ka2 + . . . + βR kaR = 0.

(24)

226

Proof of Theorem 3.1. Note that Q is a symmetric circulant matrix, i.e. Qij = qj−i for i, j = 0, 1, . . . , N − 1 with q ∈ RN the first row of Q. It is well known that the inverse of a circulant matrix can easily be calculated with the help of Fourier transformation, see e.g. Chao 1 . However, for the convenience of the reader we will sketch the proof. If U = Q−1 exists, U P P 2πi satisfies δjm = k Ujk Qkm = k Ujk qm−k . Let ω = e− N , then it follows for 0 ≤ j, n ≤ N − 1 that X X X Ujk ω kn Ujk qm ω mn ω kn = qˆn Ujk qm−k ω mn = ω jn = with qˆn =

qˆn =

m qm ω

X m

=

(

mn

, i.e. 1 X 1 dm + 1)ω mn = R − dm ω mn + N δn,0 R R m

(Rδm,0 −

for n = 0

N R−

k

k,m

k,m

P

1 ˆ R dn

(25)

for n = 1, . . . , N − 1.

P Let 1 ≤ n < N . Since dm ≥ 0 for 0 ≤ m ≤ N −1 and m dm = R2 , we have P dˆn = m dm ω mn = R2 if and only if mn = 0 mod N for all m with dm 6= 0, i.e. for all m = qr − qs mod N with r, s = 1, . . . , R. With Lemma 3.2 it follows, that this is not possible if gcd(q2 − q1 , . . . , qR − q1 , N ) = 1. We hence have qˆn 6= 0 for all 0 ≤ n ≤ N − 1 and it follows for 0 ≤ j, m ≤ N − 1 that ! X X X X 1 n(k−m) kn N Ujm = Ujk ω = Ujk ω ω −nm = ω −(m−j)n . q ˆ n n n k,n k P 1 −kn 1 Define uk = N n qˆn ω , then Uij = uj−i , and further  N −1 R X cos 2π 1 N kn uk = 2 + . (26) N N R2 − dˆn n=1

Here we used that dˆk = dˆN −k for all k = 1, . . . , N −1, so that the imaginary terms in u disappear. This finishes the proof of the first part of the theorem. From the definition of D = D(q1 , . . . , qR , N ) we can write R R X X r=1 s=1

Aj+qr −qs =

N −1 X k=0

dk Aj+k =

N −1 X k=0

dk−j Ak =

N −1 X k=0

Djk Ak = (D · A)j ,

hence Equation (18) can be written as !   R 1 1 X = MY. Fj−qs RI − D A = Ej − R R s=1 j

(27)

227

Adding Equation  (17) to every line of the linear system given by (27) yields RI − R1 D + L A = MY . Since RI − R1 D + L is invertible and since we already know that there exists a unique solution for A under our assumption on q1 , . . . , qR , this solution must be given by  −1 1 A = RI − D + L MY. (28) R Equation (17) is then automatically fulfilled. 4. Measurement uncertainty Suppose that the covariance matrix of the measurement data Y is given by the diagonal matrix VY = σY2 I. Since A = U MY , the covariance matrix of A is given by VA = σY2 U MMt U t = σY2 U MMt U . A little calculation shows now that MMt = RI − R1 D = Q−L where Q = RI − R1 D +L = U −1 and it follows further U MMt U = U − U LU = U − N12 L. In the last step P P P P (LU )ij = k Ukj = k uj−k = k uk = N1 k,n qˆ1n ω −kn = qˆ10 = N1 and thus LU = U L = N1 L was used. Therefore the covariance matrix of the pitch deviations A is given by VA = σY2 U − N12 L , with diagonal elements   N −1 1 1 R X 2 σA = σY2 u0 − 2 = σY2 . (29) 2 N N R − dˆj j=1

PN −1 Since for fixed R we have j=1 dˆj = R(N − R) independent of the actual choice of q1 , . . . , qR , a lower bound for σA can be calculated by assuming that dˆj = NR−1 (N − R) for j = 1, . . . , N − 1. This leads to N −1 √ . (30) N R−1 In general though, this bound can not be reached. Good choices for q1 , . . . , qR with σA near the lower bound are such that the differences qr − qs mod N take at best all values in 1, 2, . . . , N −1 with approximately the same occurrence, so that the values d1 , . . . , dN −1 differ not too much from each other. σA ≥ σ Y

5. Outlook Test of software for the calculation of the pitch deviations using the proposed reduced error separating method will become part of the TraCIM software verification system 4,10 . Software developers for measuring software will hence be able to get a verification certificate for this method.

228

References 1. C. Chao. A remark on symmetric circulant matrices. Linear Algebra and its Applications, 103:133–148, 1988. 2. E. Debler. Verk¨ urztes Dreirosetten-Meßverfahren zur Bestimmung von Winkelabweichungen. Z. Vermess. wes, 102:117–126, 1977. (in German). 3. W.T. Estler. Uncertainty Analysis for Angle Calibrations Using Circle Closure J. Res. Natl. Inst. Stand. Technol., 103:141-151, 1998. 4. A.B. Forbes, I.M. Smith, F. H¨artig, and K. Wendt. Overview of EMRP Joint Research Project NEW06: “Traceability for Computationally Intensive Metrology”. In Proc. Int. Conf. on Advanced Mathematical and Computational Tools in Metrology and Testing (AMCTM 2014), St. Petersburg, Russia, 2014. 5. F. Keller, M. Stein, and K. Kniel. Ein verk¨ urztes Rosettenverfahren zur Kalibrierung von Teilungsabweichungen. 6. Fachtagung Verzahnungsmesstechnik 2017, VDI-Berichte 2316, VDI-Verlag, 2017. (in German). 6. K. Kniel, F. H¨ artig, S. Osawa, and O. Sato. Two highly accurate methods for pitch calibration. Measurement Science and Technology, 20(11):115110, 2009. 7. R. Noch and O. Steiner. Die Bestimmung von Kreisteilungsfehlern nach einem Rosettenverfahren. Z. Instr. kde, 74:307–316, 1966. (in German). 8. R. Probst. Self-calibration of divided circles on the basis of a prime factor algorithm. Measurement Science and Technology, 19(1):015101, 2008. 9. O. Schreiber. Untersuchung von Kreistheilungen mit zwei und vier Mikroskopen. Z. Instr. kde, 6:1–5, 1886. (in German). 10. K. Wendt, M. Franke, and F. H¨artig. Validation of CMM evaluation software using TraCIM. In Proc. Int. Conf. on Advanced Mathematical and Computational Tools in Metrology and Testing (AMCTM 2014), St. Petersburg, Russia, 2014.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 229–234)

Mathematical and statistical tools for online NMR spectroscopy in chemical processes S. Kern, S. Guhl, K. Meyer, L. Wander, A. Paul, W. Bremser and M. Maiwald* Bundesanstalt für Materialforschung und -prüfung (BAM), 12489 Berlin, Germany *E-mail: [email protected] www.bam.de Monitoring chemical reactions is the key to chemical process control. Today, mainly optical online methods are applied, which require excessive calibration effort. NMR spectroscopy has a high potential for direct loop process control while exhibiting short setup times. Compact NMR instruments make NMR spectroscopy accessible in industrial and harsh environments for advanced process monitoring and control, as demonstrated within the European Union’s Horizon 2020 project CONSENS. We present a range of approaches for the automated spectra analysis moving from conventional multivariate statistical approach, (i.e., Partial Least Squares Regression) to physically motivated spectral models (i.e., Indirect Hard Modelling and Quantum Mechanical calculations). By using the benefits of traditional qNMR experiments data analysis models can meet the demands of the PAT community (Process Analytical Technology) regarding low calibration effort/calibration free methods, fast adaptions for new reactants or derivatives and robust automation schemes. Keywords: Online NMR Spectroscopy; Process Control; Partial Least Squares Regression; Indirect Hard Modelling; Quantum Mechanics; First Principles.

1. Integrated Process Design — Need for Smart Field Devices Novel concepts in the field of process engineering and in particular process intensification are currently promoted for analysis and design of innovative equipment and processing methods [1, 2]. This leads to substantially improved sustainability, efficiency, environmental performance, and alternative energy conversion. 1.1. Continuous Pharmaceutical Reaction Step in a Modular Plant Compared to traditional batch processes, intensified continuous production gives admittance to new and difficult to produce compounds (see reaction Fig. 1 as an example), leads to better product uniformity, and drastically reduces the consumption of raw materials and energy. Flexible (modular) chemical plants can produce various products using the same equipment with short down-times 229

230

between campaigns, and quick introduction of new products to the market. Typically, such plants have smaller scale than plants for basic chemicals in batch production but still are capable to produce kilograms to tons of specialty products each day. Consequently, full automation is a prerequisite to realize such benefits of intensified continuous plants. In continuous flow processes, continuous, automated measurements and tight closed-loop control of the product quality are mandatory. If these are not available, there is a huge risk of producing large amounts of Outof-Spec (OOS) products. In pharmaceutical production, the common future vision is Continuous Manufacturing (CM), based on Real Time Release (RTR), i.e., a risk-based and integrated quality control in each process unit. This will allow for flexible hookup of smaller production facilities, production transfer towards fully automated facilities, less operator intervention, less down time, and end to end process understanding over product lifecycle, future knowledge, and faster product to market. It is also assumed to significantly reduce the quality control costs within a CM concept at the same time.

Fig. 1. Reaction scheme: FNB: 1- fluoro-2-nitrobenzene, Li-HMDS: Lithiumbis(trimethylsilyl) amide, NDPA: 2-nitrodiphenylamine. Aniline was also replaced by p-toluidine and p-fluoroaniline.

Figure 1 represents a given example of a pharmaceutical reaction step, within two aromats are coupled using the lithium base Li-HMDS. The reaction takes place in a 5 % (m/m) solution in tetrahydrofuran. Deviations from unknown starting material and reactant concentrations together with the precipitation of LiF will lead to severe fouling and blocking of the modules. Typically, metal organic reactants are difficult to analyze due to the sensitivity to air and moisture. Thus, this example reaction was chosen in CONSENS to develop and validate a compact NMR sensor to maintain an optimal stoichiometry during the full course of the continuous production. 2. Smart Compact NMR Spectroscopy in Process Control Monitoring specific information (such as physico-chemical properties, chemical reactions, etc.) is the key to chemical process control. The challenge within the project and its given lithiation reaction was to integrate a commercially available

231

low-field NMR spectrometer [1] from a laboratory application to the full requirements of an automated chemical production environment including robust evaluation of NMR spectral data.

Fig. 2. (a) Complete low-field NMR spectrum (43.5 MHz, single scan) and (b) aromatic region of automatically phased, baseline corrected, and shift corrected proton spectra for the lithiation reaction.

Fig. 3. Scheme of the validation set-up for monitoring of the continuous reaction unit with the compact NMR sensor (box). The lithiation reaction (Fig. 1) is continuously carried out in a thermostated 1/8” tubular reactor using syringe pumps. HF NMR spectroscopy (upper right) served as reference.

The NMR analyzer (Fig. 3, see box) is provided in an explosion proof housing of 57×57×85 cm module size and involves a compact 43.5 MHz NMR spectrometer together with an acquisition unit and a programmable logic controller for automated data preparation (phasing, baseline correction) as well as data evaluation (see section 3). Therefore, the aromatic region of the NMR spectra in Fig. 2 had to be chosen; representing higher order NMR spectra. In a first approach, a couple of semi batch reactions were performed for development of Partial Least Squares Regression (PLS-R) as well as Indirect Hard Modeling (IHM) models. Within these studies Li-HMDS was dosed stepwise to the reactants aniline and FNB in a batch reactor in order to produce spectral data

232

material along the reaction coordinate. The reaction was in parallel observed using 500 MHz high-field NMR spectroscopy as reference method. For validation purposes, the set-up depicted in Fig. 3 was used for monitoring of the continuous lithiation reaction in a thermostated 1/8” (3.175 mm OD) tubular reactor using syringe pumps. The set-up was matched to the reaction conditions of the actual plant. It was used to validate the PLS-R and IHM models as described in section 3 — again using high-field NMR spectroscopy as reference method. A considerable number of continuous experiments were performed for validation taking account for various reaction conditions by individually adjusting the flow rates of the reagents aniline, FNB, and Li-HMDS. 3. Data Analysis Methods Chemometrics for the derivation of empirical models, e.g., PLS-R or PCA (Principal Component Analysis) is available and state of the art in reaction and process monitoring. Automated applications along the life cycle are still very limited. Up to now the development of such models requires significant experimental work, i.e., producing data from several time-consuming calibration runs, ideally via experimental plans (DoE). 3.1. Physically Motivated Spectral Models The use of so-called First Principles methods along with reduction of the effort needed for these experiments is focus of ongoing research. Making use of novel sensors, like online NMR, in combination with flexible data analysis methods like IHM tremendously promote the use of novel process control concepts [1–3]. IHM model development consists of three steps: Firstly, pure component models are built upon NMR spectra of the reactants and products (Fig. 4a). Each pure component model (Hard Model) consists of a number of Lorentzian-Gaussian functions, representing the spectral peaks [3]. Within that Hard Model, the ratio of peak areas are fixed against each other. Secondly, an experimental NMR spectrum is acquired and prepared by phasing and baseline correction (Fig. 4b). Finally, this experimental spectrum is represented by the given pure component models (Component Fitting) from the beginning step by iterative fit routines aiming at minimized residues (Fig. 4c). Within this step, selected parameters of Lorentzian-Gaussian functions such as position, height, or width can be optimized. This allows for slight line shifts or other non-linear effects along the course of the reaction, which likely occur in NMR spectra of technical mixtures and make IHM the method of choice.

233

Figure 5 depicts how physically motivated spectral models based on quantum mechanical calculations can be used to derive the pure component models, shown for aniline. Therefore, line spectra from spin calculations were adopted by empirically fitting their line shape to the real pure component spectrum.

Fig. 4. Data analysis scheme for Indirect Hard Modeling (IHM) of the aromatic region (see Fig. 2) of the NMR spectrum. Spectra (a), (b) and (c) from top to bottom, respectively.

Fig. 5. Calculation of pure component model based on spin calculations (NMR Solutions, Kuopio).

4. Validation Results Figure 6 shows the amount of substance fractions observed with the LF NMR sensor and the IHM methods according to a Design of Experiments (DoE) over an observation period of 6.5 hours in comparison with HF NMR data. In some runs aniline was replaced by p-toluidine or p-fluoroaniline (Figs. 1, 3) in order to test the modular IHM approach. In all cases, the pure component models were exchangeable and worked together with the remaining models for FNB. In general, all results found by IHM are in good agreement with results from the PLS-R model as well as the HF NMR reference data.

234

Fig. 6. Amount of substance fractions observed with the LF NMR sensor of the reagents aniline and FNB and the product NDPA along an observation period of 6.5 hours together with HF NMR data. Grey areas represent points in time where pumps were not running due to cleaning or refilling.

The largest prediction uncertainties for IHM were found for aniline to be 5–7 % relative, i.e., 0.25–0.35 % absolute, or 20–30 mmol L–1 concentration deviation. As can be seen in Fig. 4a, the signals of aniline completely overlap with the further reagents causing such model deviations. IHM slightly overestimates aniline in the low concentration range during equivalent fitting of the three pure component models, thus, underestimating the product NDPA. Minimizing the residues presents a closing condition for the fitting process. Improving IHM for these unwanted issues is currently undertaken. Acknowledgments Funding of CONSENS by the European Union’s Horizon 2020 research and innovation programme under grant agreement N° 636942 as well as support of NMR Solutions, Kuopio, Finland, is gratefully acknowledged. References 1.

2.

3.

K. Meyer, S. Kern, N. Zientek, G. Guthausen and M. Maiwald, Process Control with Compact NMR, Trends in Analytical Chemistry 83, 39–52 (2016). M. Maiwald, P. Gräßer, L. Wander, N. Zientek, S. Guhl, K. Meyer and S. Kern, Strangers in the Night—Smart Process Sensors in Our Current Automation Landscape, Proceedings 1, 628 (2017). A. Michalik-Onichimowska, S. Kern, J. Riedel, U. Panne, R. King and M. Maiwald, “Click” analytics for “click” chemistry – A simple method for calibration-free evaluation of online NMR spectra, Journal of Magnetic Resonance 277, 154–161 (2017).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 235–240)

A new mathematical model to localize a multi-target modular probe for large volume-metrology applications D. Maisano* and L. Mastrogiacomo Dept. of Management and Production Engineering (DIGEP), Politecnico di Torino, Corso Duca degli Abruzzi 24, Turin, 10129, Italy *E-mail: [email protected] Recent studies show that the combined use of Large-Volume Metrology (LVM) systems can lead to a systematic reduction in measurement uncertainty and a better exploitation of the available equipment. This is actually possible using a recently developed modular probe, which is equipped with different typologies of targets and integrated inertial sensors. The goal of this paper is to present a new mathematical/statistical model for the real-time localization of this probe. This model is efficient, as it is based on a system of linearized equations, and effective, as the equations are weighed with respect to their uncertainty contribution. Keywords: Large-volume metrology; Multi-target probe; Real-time localization, Generalized least squares.

1. Introduction Typical industrial applications in the field of Large-Volume Metrology (LVM) concern dimensional verification and assembly of large-sized mechanical components [1]. LVM systems are usually equipped with sensors that perform local measurements of distances and/or angles [2]. Even though the existing measuring systems may differ in technology and metrological characteristics, two common features are: (i) the use of some targets to be localized, generally mounted on a hand-held probe to localize the points of interest or in direct contact with the measured object’s surface, and (ii) the fact that target localization is performed using the local measurements by sensors (e.g., through multilateration or multiangulation approaches). Recent studies show that the combined use of LVM systems can lead to a systematic reduction in measurement uncertainty and a better exploitation of the available equipment [3]. Unfortunately, the sensors of a specific LVM system are able to localize only specific targets and not necessarily those related to other systems. To overcome this obstacle, the authors have recently developed a new modular probe, equipped with targets related to different systems, and a tip in contact with the point of interest (P), which allows to localize P in a single turn [4, 5]. This probe has several innovative features: the number and typology of targets can be varied depending on the specific application, and the probe can integrate additional inertial sensors (i.e., two-axis inclinometer and compass), which are able to provide additional data. 235

236

The goal of this paper is to present a new mathematical/statistical model to localize the probe in measurements involving combinations of different LVM systems, i.e., systems equipped with sensors of different nature and metrological characteristics. In a nutshell, the model consists of a set of linearized equations that are weighted with respect to their uncertainty contribution. The remainder of this paper is organized into three sections. Sect. 2 summarizes the technical and functional characteristics of the probe. Sect. 3 illustrates the mathematical/statistical model for the probe localization. Sect. 4 summarizes the original contributions of this research, focusing on its practical implications, limitations, and future development. 2. Multi-target modular probe The probe has a modular structure. The main module, or primary module consists of a bar with a handle for the operator, two ends with several calibrated holes (in predefined positions), in which different types of secondary modules can be plugged in, and a power-supply and data-transmission system [4, 5]. There are different types of secondary modules: sphere mounts where to put spherically mounted retroreflectors (SMRs) for laser trackers; targets of different nature — such as those for rotary-laser automatic theodolites (R-LATs) or photogrammetric sensors; variable-length extensions, to be interposed between the primary module and the previous secondary modules; styli with a tip in contact with the point of interest. An important requirement is that these secondary modules are coupled on the primary module, quickly, precisely and with a certain repeatability. This requirement can be achieved by adopting different technical solutions, such as providing the calibrated holes and shafts with threads or adopting quick coupling systems with magnetic lock. The primary module has appropriate housings to lodge some integrated inertial sensors, such as two-axis inclinometer and compass, and is also equipped with a trigger for the acquisition of the point of interest: when the trigger is pressed, the probe tip is localized on the basis of the data collected by the probe targets/sensors at that time. Once the primary and secondary modules are assembled, the relative positions between the probe targets and tip can be measured using a standard coordinate measuring machine (CMM). At this stage, a local Cartesian coordinate system (oPxPyPzP) — with origin (oP) in the probe tip, and axes perpendicular to some reference planes on the surface of the primary module — can also be defined [4].

237

3. Model for probe localization In general, each i-th LVM system (Si) includes a number of distributed sensors, positioned around the measurement volume; we conventionally indicate the generic j-th sensor of Si — or, for simplicity, the ij-th sensor — as sij (e.g., si1, si2, …, sij, …). The probe includes a number of targets of different nature and a tip, in contact with the points of interest on the surface of the measured object. Tk conventionally denotes a generic k-th target mounted on the probe. Sensors can be classified in two typologies: distance sensors, which are able to measure their distance (dijk) from the k-th target, and angular sensors, which are able to measure the azimuth (ijk) and elevation (ijk) angle, which are both subtended by the k-th target. The subscript “ijk” refers to the local measurements (of distances or angles) by the ij-th sensor with respect to the k-th probe target. It is worth remarking that each ij-th sensor is not necessarily able to perform local measurements with respect to each k-th probe target, for two basic reasons:

 The communication range of the ij-th sensor should include the k-th target and there should be no interposed obstacle.  Even if a k-th target is included in the communication range of the ij-th sensor, local measurements can be performed only if they are compatible. In the case of compatibility between the ij-th sensor and the k-th target, we can define some (linearized) equations related to the local measurements: dist dist Aijk  X  Bijk  0 one eq. related to an ijth distance sensor and kth target

ang ang Aijk  X  Bijk  0 two eqs. related to an ijth angular sensor and kth target

, (1)

where X = [XP, YP, ZP, P, P,P]T is the (unknown) vector containing the spatial coordinates (XP, YP, ZP) of the centre of the probe tip (P) and the angles (P, P,P) of spatial orientation of the probe, referring to a global Cartesian coordinate system OXYZ. Matrices related to distance sensors are labelled with superscript “dist”, while those related to angular sensors with superscript “ang”. dist dist ang ang , Bijk , Aijk and Bijk contain: The matrices Aijk  the position/orientation parameters ( X 0ij , Y0ij , Z 0ij , ij, ij andij) related to

the ij-th sensor;  the communication range of the ij-th sensor should include the k-th target and there should be no interposed obstacle.  The distance (dijk) and/or angles (ijk,ijk) subtended by the k-th target, with respect to a local Cartesian coordinate system oijxijyijzij of the ij-th sensor. Since the “true” values of the above parameters are never known exactly, they can be replaced with appropriate estimates, i.e., Xˆ 0 , Yˆ0 , Zˆ 0 , ˆ ij , ˆij , ij

ij

ij

238

ˆij , resulting from initial calibration process(es), dˆijk resulting from distance

measurements, and ˆijk and ˆijk , resulting from angular measurements.

As already said, the probe can also be equipped with some integrated inertial sensors (two-axis inclinometer and compass) which are able to perform angular measurements for estimating the spatial orientation of the probe, through the following linearized equations: Aint  X  B int  0 three equations related to three angular measurements.

(2)

Matrices A and B contain local measurements of three angles (I, I,I) depicting the orientation of the integrated sensors with respect to a groundreferenced coordinate system (xIyIzI). The probe localization problem can therefore be formulated through the following linear model, which encapsulates the relationships in Eqs. (1) and (2): int

int

 B dist   A dist     ang  A  X  B   A   X   B ang   0 ,  int   int   B   A 

(3)

where blocks Adist, Aang, Bdist and Bang are defined as:

             B ang   A ang   B dist  ang ang dist  dist B ,   , , , B A A dist   Aijk   ijk    ijk   ijk                ijkI dist   ijkI ang  ijkI ang  ijkI dist

Idist and Iang being the sets of index-pair values (ijk) relating to the ij-th distance/angular sensors seeing the k-th target. We remark that all the equations of the system in Eq. (3) are referenced to a unique global Cartesian coordinate system, OXYZ. These equations therefore include the roto-translation transformations to switch from other reference systems (e.g., the local reference system related to each distributed sensor, that one related to the probe, or the ground-referenced system of the integrated probe sensors) to OXYZ. The six unknown parameters in X can be determined solving the system in Eq. (3), which is generally overdefined, i.e., there are more equations than unknown parameters: one for each combination of ij-th distance sensor and k-th target, two for each combination of ij-th angular sensor and k-th target, and three for the integrated sensors (i.e., two for the two-axis inclinometer and one for the compass). The equations of the system may differently contribute to the uncertainty in the probe localization. Four important factors affecting this uncertainty are:  Uncertainty in the position/orientation of distributed sensors ( Xˆ , Yˆ , Zˆ ,

ˆ ij , ˆij and ˆij ), resulting from initial calibration process(es);

0 ij

0 ij

0 ij

239

 Uncertainty in the local measurements ( dˆijk , ˆijk and ˆijk ) by the distributed sensors with respect to probe targets, which depends on their metrological characteristics;  Uncertainty in the relative position between the probe targets and the tip (P), which may depend on the accuracy of the manufacturing and calibration processes of the probe modules.  Uncertainty in the angular measurements ( ˆ I , ˆI and ˆ I ) by the probe’s integrated sensors, which depends on their metrological characteristics. Consequently, it would be appropriate to solve the system in Eq. (3), giving greater weight to the equations producing less uncertainty and vice versa. To this purpose, a practical method is that of Generalized Least Squares (GLS) [6], in which a weight matrix (W), which takes into account the uncertainty produced by the equations, is defined as:



W  J T   ξ J



1

,

(4)

where J is the Jacobian matrix containing the partial derivatives of the elements in the first member of Eq. (3) (i.e., A∙X – B) with respect to the parameters contained in the vector , i.e., the position/orientation of distributed sensors, the local measurements by the distributed sensors available, the angular measurements by the integrated sensors, and the relative position of the probe targets with respect to the tip. For details, see [5].  ξ is the covariance matrix of , which represents the variability of the parameters in . The parameters in  ξ can be determined in several ways: (i) from manuals

or technical documents relating to the distributed/integrated sensors in use, or (ii) estimated through ad hoc experimental tests. We remark that these parameters should reflect the measurement uncertainty of the elements of , in realistic working conditions — e.g., in the presence of vibrations, light/ temperature variations and other typical disturbance factors. By applying the GLS method to the system in Eq. (3), we obtain the final estimate of X as:





1 Xˆ  AT  W  A  AT  W  B .

(5)

For further details on the GLS method, see [6]. We remark that the metrological traceability of the probe localization is ensured by the initial calibration processes to determine the spatial position/ orientation of the distributed sensors and that one to determine the relative position of the probe targets. In fact, these processes are generally based on the use of physical artefacts (such as calibrated bars with multiple reference positions) or measuring instruments (such as CMMs), which are traceable to the measurement unit of length [1].

240

4. Conclusions This paper has described a novel mathematical/statistical model for the real-time localization of a modular and multi-target probe. The model is efficient, as it is based on a system of linearized equations, and effective, as the equations are weighed with respect to their uncertainty contribution, through the GLS method. For the model to be viable, some parameters relating to the sensors in use should be known in advance, e.g., uncertainties in the position/orientation or local measurements; this can be done through ad hoc experimental tests or using manuals or technical documentation of the measuring systems. The model is automatable and could be a key tool to promote the combined use of LVM systems. Regarding the future, we plan to extend the use of the probe from the measurement process to the distributed-sensor calibration process. Acknowledgements This research was supported by the project Co-LVM “Cooperative multi-sensor data fusion for enhancing Large-Volume Metrology applications” (E19J14000950003), financed by Fondazione CRT (Italy), under the initiative “La Ricerca dei Talenti”. References 1.

2.

3.

4. 5.

6.

Peggs, G.N., Maropoulos, P.G., Hughes, E.B., Forbes, A.B., Robson, S., Ziebart, M., Muralikrishnan, B. (2009) Recent developments in large-scale dimensional metrology. Proceedings of the Institution of Mechanical Engineers, Part B: Journal of Engineering Manufacture, 223(6), 571-595. Maropoulos, P.G., Muelaner, J.E., Summers, M.D., Martin, O.C. (2014). A new paradigm in large-scale assembly—research priorities in measurement assisted assembly. The International Journal of Advanced Manufacturing Technology, 70(1-4): 621-633. Franceschini, F., Galetto, M., Maisano, D., Mastrogiacomo, L. (2016). Combining multiple Large Volume Metrology systems: Competitive versus cooperative data fusion. Precision Engineering, 43: 514-524. Maisano, D., Mastrogiacomo, L. (2017) A novel multi-target modular probe for multiple Large-Volume Metrology systems, to appear in Precision Engineering. Maisano, D., Mastrogiacomo, L., Galetto, M., Franceschini, F. (2016) Dispositivo tastatore e procedimento di localizzazione di un dispositivo tastatore per la misura di oggetti di grandi dimensioni, Italian provisional patent application, serial number 102016000107650, filed on 25/10/2016. Kariya, T., Kurata, H. (2004) Generalized least squares, John Wiley & Sons, New York.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 241–248)

Soft sensors to measure somatic sensations and emotions of a humanoid robot Umberto Maniscalco and Ignazio Infantino Istituto di Calcolo e Reti ad Alte Prestazioni - C.N.R., Cognitive Robotics and Social Sensing Lab., Via Ugo La Malfa, 153, Palermo, Italy E-mail: [email protected], [email protected] www.icar.cnr.it, www.facebook.com/CRSSLAB/ www.facebook.com/ComunicazioneICAR/ The challenge faced in this work is to design and implement an artificial somatosensory system for a humanoid robot, able to produce somatic sensations and emotions starting from the measures achieved by the basic robot’s sensors. From a metrological point of view, it comes to obtain qualitative quantities (the sensations and the emotions) from a set of measurable physical ones (the sensors measures) in accordance to appropriate bio-inspired models. In such a process of measurement, semantic aspects are also involved in transforming low-level data into a richer information. The soft sensor paradigm was adopted to implement the artificial somatosensory system. The somatosensory system plays a crucial role in preserving the body from damage or accident influencing our behaviors. So, a humanoid robot owning an artificial somatosensory system would be particularly desirable. Keywords: Soft Sensors, Robotics, Cognitive, Somatosensory System.

1. Introduction The somatic sensations are bodily sensations, such as touch, pain, temperature and proprioception (i.e., the sense of the position of self and movement) 1 . The sensations play a crucial role in preserving the body from damage or accident, and naturally, influence our behaviors. Moreover, they permit us to perceive the environment. So, a humanoid robot owning an artificial somatosensory system would be particularly desirable. Indeed, also emotions play a significant role in our behaviors determining in how we relate to the world. In this work, we consider emotions at the same level of the sensations. Everyone has a subjective “measure” of sensations so, although each one of us can evaluate the intensity of the sensation he is feeling, this “measure” always remains a qualitative evaluation. The human 241

242

beings are certainly able to determine whether the pain is higher than another, but attributing to it a numeric value is much more complicated. Even though, different types of instrumentation have been developed to measure the sensations in recent years (i.e., the Sonic Palpometer to measure the intensity of the pain), the sensations remain mainly qualitative quantities. The challenge of this work was to design an artificial system for a humanoid robot, able to produce somatic sensations and emotions starting from the measures achieved by its basic sensors. From a metrological point of view, it comes to obtain qualitative quantities (the sensations and emotions) from a set of physical measurable quantities (the sensors measures) in according to appropriate bio-inspired models. In such a process of measurement, semantic aspects are also involved in transforming low-level data into meaningful information. In fact, simply measures are arranged to generate sensations that besides having a high semantic content, presume the ability of location and decoding. Current robots have many sensors measuring several variables such as motors temperatures or currents, battery conditions, the presence of objects and distances, bumps, touch and so on. For example, in our experimentations, we used the humanoid PEPPERa of SoftBank Robotics. The soft sensor paradigm allows the robot to get sensations and emotions from measures, finding innovative application of presented previous works such as 2–10 . Each soft sensor constituting the artificial somatosensory system replaces the function of the natural receptors, the nerve fibers transport function, and the somatosensory cortex elaboration. For each kind of sense or stimulus a biological model, or a model guaranteeing the safety of the robot has been adopted. In the section 2, will be reported some details about the soft sensor paradigm and how the measure determines correspondent sensations from the robot’s sensors. Section 3 will outline the whole design of a particular soft sensor and the overall functioning of the artificial somatosensory system. Finally, the last section discusses future works. 2. The Soft Sensor Paradigm “Soft sensor” is the name with which it is usually called a virtual measuring instrument. A soft sensor is essentially a software that measures or estimates a variable (also non-physical) using as input a set of variables, in a https://www.ald.softbankrobotics.com/en/cool-robots/pepper

243

some way related to the output, and a model to simulate the measuring instrument. The left side of Fig. 1 shows a soft sensor mechanism of measure. A set of available measurements constitutes the input of the soft sensor together with a “model” or a “knowledge base”. Then, an algorithm, which may also include a preprocessing step and a filtering step, uses a model or a knowledge base to estimate the desired output variable. Depending on whether the soft sensor uses a model or a base of knowledge, it can be distinguished into based on first-principles or data-driven. The first type is used when the process is well known, and it can be simulated as in some biological process as in Paulsson et al. 11 or as in some control process as in Lee and Zahrani 12 . The second type is designed starting from the experimental data, and their models are inferred by using machine-learning or, more generally, soft-computing techniques as in Liu 13 or Kadlec et al. 14 .

Fig. 1. Left side: the a schematic description of a soft sensor mechanism of measure. Right side: a generic soft sensor constituting the artificial somatosensory system.

In our framework, a special kind of soft sensor is proposed. In fact, it isn’t based on a real model, but it builds on a model inspired by a real model, in particular, inspired by a biological model. As shown in the right side of Fig. 1 the generic soft sensor involved in the artificial somatosensory system has a structure suitable to replicate the features of a particular somatosensory function. The information flows along two different paths as in the human. An ascending path conducts signals from sensors to cortex producing the “measure” of the somatic sensation or the emotions, and a downward path, from cortex to sensors operating the inhibition and/or the modulation action. The kernel of the soft sensor, represented by gears block, is designed from time to time, depending on the biological behavior that we

244

want to replicate. In the end, each “somatosensory” soft sensor constituting the artificial somatosensory system is a virtual instrument of measure that transforms the raw data archived from the robot’s basic sensors in highlevel semantic information by emulating a biological process. Moreover, each “somatosensory” soft sensor gets a measure of a non-physical quantity which otherwise could not be measured.

3. The Anxiety Soft Sensor and the Artificial Somatosensory System For compactness reasons, we report the design of the somatosensory soft sensor for the anxiety, that is just one of the various soft sensors that compose the whole system. The measures employed by the ascending path of this soft sensor are achieved by the sonar system of the robot that produces echo signals in the range [0.3m, 5m] when one or more objects are present in this area. The signal is interpreted, by the soft sensor, as a social distance that is a psychological distance, firstly described by Edward T. Hall introducing the proxemics science 15 . The proxemics that studies the human use of space and the effects on behavior, communication, and social interaction defines four different spaces. But, we consider only three of them and set their corresponding distances in the case of the robot. The Social Zone (over 1.6m), the Personal Zone (between 1.6m and 0.8m) and the Intimates Zone (under 0.8m). The models of measurements proposed and implemented as a soft sensor to evaluate the anxiety replicate the exponential trend and the character of saturation of this kind of emotion. Thus, the soft sensor realizes this bio-inspired characteristic implementing the model of charge and discharge of an RC (Resistor-Capacitor) circuit, well known in Electrotechnical. In a such a circuit, during the charge phase, the capacitor is charged up through the resistor until the voltage across it will be equal to the supply voltage. On the contrary, during the discharge phase, the current flows into the capacitor via the resistor and the voltage return to zero. The measured distance of something from the robot is considered as the supply voltage in an RC circuit, and its value can cause the growth or decrease of the anxiety function. Furthermore, to distinguish the different contribute given to anxiety emotion by the presence of objects in the different regions, the model uses two distinct time constants τ1 and τ2 with τ1 > τ2 . The anxiety function is computed in according to the equations

245

of Fig. 2. Equations one and three represent a charge phase and equations two and four represent a discharge phase. The left side of Fig. 2 shows the six possible change of states associated with the change of position of something in the three zones. State 1, for example, is reached when something goes from social area to personal area, state 5 is reached when something leaves the intimate zone to go in the social zone and so on. Each of these states is associated with the growth or decreasing of the anxiety function. Moreover, the two time-constants to make more or less rapid the rise or the decrease of the anxiety emotion depending on the appearance or disappearance of something in the Personal Zone or in the Intimates Zone. The right side of Fig. 2 reports the four equations that describe the trend of anxiety function for each state. At the states 1 and 4 is linked a slow (τ1 ) growth of the anxiety. At the state 3 is associated a fast (τ2 ) growth of the anxiety. States 2 and 5 cause a fast (τ2 ) decrease of the anxiety and finally ate the state 6 is linked a slow (τ6 ) decrease of the anxiety. The parameters inhib and mod represent the inhibition value that can be 0 or 1 and the modulation factor that can be a value in the range [0, 1]. A layer of reasoning where all different sensations and emotions are taking into account measures both these two parameters.

Fig. 2. The left side shows the six possible change of states. The right side reports the four equations that describe the trend of anxiety.

The solid blue line of Fig. 3 shows the trend of the anxiety computed according to the four previous equations linked to the six different states. The solid orange line represents the distance to which an object is detected by the robot’s sonar. The dotted horizontal lines represent the boundaries of the three zones. This figure makes evident how the anxiety function measured by the soft sensor has two different velocities of growing and decreasing depending on the presence of something into one of the three zones.

246

Fig. 3.

The trend of the anxiety computed according to the six different states.

The whole artificial somatosensory system is composed of several soft sensors similar to the one described above, and it can be thought as a complex system of virtual measure in which starting to raw data sensation and emotions are obtained by the use of bio-inspired models. The previously illustrated soft sensor uses only one robot’s sensor (the sonar) to generate anxiety function, but in the artificial somatosensory system are also involved soft sensors that use more than one robot’s sensor at the same time to produce the desired function. For example measure of angles, gyroscope and weight are combined to get proprioception information.

Fig. 4.

The layer structure of the artificial somatosensory system.

247

Figure 4 shown a layer structure to make it clear that starting from low-level data achieved directly from the robot’s sensors, the soft sensor paradigm increases the information content during the measurement process. This model of measure replies the three main aspects of a biological somatosensory system. In particular, the perceptive function, performed by the sensory receptors spread over all body is replied by the robot’s sensors and by the sampling and filtering layer. The transport function, operated by a set of fibers of different kinds each one dedicates to carry a specific kind of stimulus is obtained in the same way in the artificial somatosensory system because each soft sensor produces a distinct signal for each type of stimulus. The elaboration function fulfilled by the somatosensory cortex is obtained by the data fusion and modeling layer in which is implemented for each soft sensor the most fitting bio-inspired model. The functions of inhibition and modulation characteristic of the descending path are also replied by the inhibition layer and the modulation inhibition layer. 4. Conclusions and Future Work Concluding, this work shows how starting from basic sensors data, the soft sensor paradigm and a set of bio-inspired models can be evaluated several somatic sensations and emotions caused by sensing. The artificial somatosensory system introduced replicates all the functions of the biological one not identically but appropriately and speculatively for a humanoid robot by adopting from time to time the most appropriate model of measurement to be implemented in the soft sensor. We are working on a cognitive layer to put on top the structure of Fig. 4 to handle all sensations during a robot task and generate, also, in this case, using the soft sensor paradigm, just a couple of values representing the mood and the motivation of the robot. References 1. R. Nelson, The Somatosensory System: Deciphering the Brain’s Own Body Image, Frontiers in Neuroscience, Vol. 1 (CRC Press, 2001). 2. U. Maniscalco and R. Rizzo, A virtual layer of measure based on soft sensors, Journal of Ambient Intelligence and Humanized Computing 8, 1 (2016). 3. U. Maniscalco and G. Pilato, Multi soft-sensors data fusion in spatial forecasting of environmental parameters, Advanced Mathematical and Computational Tools in Metrology and Testing X 84, 252 (2012).

248

4. E. Cipolla, U. Maniscalco, R. Rizzo, D. Stabile and F. Vella, Analysis and visualization of meteorological emergencies, Journal of Ambient Intelligence and Humanized Computing 8, 1 (2016). 5. P. Ciarlini and U. Maniscalco, Mixture of soft sensors for monitoring air ambient parameters, in Proceedings of the XVIII IMEKO World Congress, (IMEKO, 2006). 6. U. Maniscalco and R. Rizzo, Adding a virtual layer in a sensor network to improve measurement reliability, Advanced Mathematical and Computational Tools in Metrology and Testing X 86, 260 (2015). 7. A. Augello, I. Infantino, U. Maniscalco, G. Pilato and F. Vella, The effects of soft somatosensory system on the execution of robotic tasks, in Robotic Computing (IRC), IEEE Int. Conf. on, (IEEE, 2017). 8. U. Maniscalco and I. Infantino, An artificial pain model for a humanoid robot, in International Conference on Intelligent Interactive Multimedia Systems and Services, (Springer, Cham, 2017). 9. A. Galip´ o, I. Infantino, U. Maniscalco and S. Gaglio, Artificial pleasure and pain antagonism mechanism in a social robot, in Int. Conf. on Intelligent Interactive Multimedia Systems and Services, (Springer, 2017). 10. D. La Guardia, A. Manfr´e, U. Maniscalco, S. Ottaviano, G. Pilato, F. Vella and M. Allegra, Improving spatial reasoning by interacting with a humanoid robot, Intelligent Interactive Multimedia Systems and Services 2017 76, p. 151 (2017). 11. D. Paulsson, R. Gustavsson and C.-F. Mandenius, A soft sensor for bioprocess control based on sequential filtering of metabolic heat signals, Sensors 14, p. 17864 (2014). 12. S. D. Lee and A. J. Zahrani, Employing first principles model-based soft sensors for superior process control and optimization, in IPTC 2013: International Petroleum Technology Conference, (International Petroleum Technology Conference, 2013). 13. J. Liu, Developing soft sensors based on data-driven approach, in Technologies and Applications of Artificial Intelligence (TAAI), 2010 International Conference on, (IEEE, Nov 2010). 14. P. Kadlec, B. Gabrys and S. Strandt, Data-driven soft sensors in the process industry, Computers and Chemical Engineering 33, 795 (2009). 15. E. T. Hall, The hidden dimension / Edward T. Hall, [1st ed.] edn. (Doubleday Garden City, N.Y, 1966).


Bayesian approach to estimation of impulse-radar signal parameters when applied for monitoring of human movements Paweł Mazurek and Roman Z. Morawski Institute of Radioelectronics and Multimedia Technology, Faculty of Electronics and Information Technology, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland E-mail: [email protected], [email protected] The research reported here is related to the impulse-radar technology when applied in care services for elderly persons. A method for estimation of impulse-radar signal parameters, based on the Bayes theorem and Markov-chain Monte-Carlo technique, is proposed. Initial results have shown that the proposed approach can be effectively used for estimation of the parameters of the echoes, and thus — for person localization. Keywords: Impulse Radar; Bayesian Inference; Markov Chain Monte Carlo; Healthcare.

1. Introduction
Quick ageing of the Western population increases the demand for research on new technologies that could be employed in care services for elderly persons. Among the emerging solutions, impulse-radar-based techniques of monitoring seem to be very promising [1]. Detection of the echoes reflected from a monitored person, followed by estimation of their parameters, is the basis for the monitoring of the person's movements. In this paper, a method for estimation of impulse-radar signal parameters, based on the Bayes theorem and the Markov-chain Monte-Carlo technique, is proposed.

2. Mathematical Model of Radar Data
Under the assumption that the emitted radar pulse is partially reflected from $K$ surfaces, located at different distances, the data $y = [y_1 \cdots y_N]^T$, representative of the received radar signal, may be modeled by means of the equation:

$$ y = X_K r_K + \varepsilon_K \qquad (1) $$

where $N$ is the number of data points, $X_K = [x_1 \cdots x_K]$ is a matrix containing $K$ copies of the emitted pulse $x_k = [x_{k,1} \cdots x_{k,N}]^T$ (called echoes hereinafter), each shifted by $n_k$ points ($n_K = [n_1 \cdots n_K]^T$), $r_K = [r_1 \cdots r_K]^T$ is a vector of echo magnitudes, and $\varepsilon_K = [\varepsilon_{K,1} \cdots \varepsilon_{K,N}]^T$ is a vector representative of the zero-mean white Gaussian noise with the variance $\sigma_K^2$. It is to be noted that $n_K$, $r_K$ and $\varepsilon_K$ are the realizations of random vectors.

3. Bayesian Inference
The aim of the Bayesian inference is the estimation of $K$ and of the parameters $\theta_K = [n_K^T \; r_K^T \; \sigma_K^2]^T$ via integration of the joint posterior probability density function $p(K, \theta_K \mid y)$. According to the Bayes theorem:

$$ p(K, \theta_K \mid y) \propto p(y \mid K, \theta_K)\, p(K, \theta_K) \qquad (2) $$

where $p(K, \theta_K \mid y)$ is the posterior distribution of the model parameters given the data, $p(y \mid K, \theta_K)$ is the likelihood of observing the data given the model parameters, and $p(K, \theta_K)$ is the joint prior distribution of the model parameters. The posterior distribution in Eq. (2) can be further decomposed into the product of distributions forming the following hierarchical structure [2]:

$$ p(K, n_K, r_K, \sigma_K^2, \Lambda, \delta^2 \mid y) \propto p(y \mid K, n_K, r_K, \sigma_K^2)\, p(K, n_K, r_K \mid \sigma_K^2, \Lambda, \delta^2)\, p(\sigma_K^2)\, p(\Lambda)\, p(\delta^2) \qquad (3) $$

where $\delta^2$ is a hyperparameter interpreted as the expected signal-to-noise ratio, and $\Lambda$ is a hyperparameter interpreted as the expected number of echoes [3]. In the above formula:

• The likelihood can be expressed as:

$$ p(y \mid K, n_K, r_K, \sigma_K^2) = \left(2\pi\sigma_K^2\right)^{-N/2} \exp\!\left[-\frac{1}{2\sigma_K^2}\,(y - X_K r_K)^T (y - X_K r_K)\right] \qquad (4) $$

• The conditional distribution of the model parameters $(K, n_K, r_K)$ given $\sigma_K^2$, $\Lambda$ and $\delta^2$ is:

$$ p(K, n_K, r_K \mid \sigma_K^2, \Lambda, \delta^2) = p(r_K \mid K, n_K, \sigma_K^2, \delta^2)\, p(n_K \mid K)\, p(K \mid \Lambda) = \left|2\pi\sigma_K^2 \Sigma_K\right|^{-1/2} \exp\!\left(-\frac{r_K^T \Sigma_K^{-1} r_K}{2\sigma_K^2}\right) \frac{1}{N^K}\, \frac{\Lambda^K / K!}{\sum_{n=0}^{N} \Lambda^n / n!} \qquad (5) $$

where $\Sigma_K^{-1} = \delta^{-2} X_K^T X_K$.

• The prior distribution of the number of echoes, i.e. $p(K \mid \Lambda)$, is the truncated Poisson distribution (the maximum number of echoes is $N$). The echo positions, conditional upon $K$, are assumed to be uniformly distributed in the interval of integer numbers $[1, N]$.

• The echo magnitudes, conditional upon $K$, $n_K$, $\sigma_K^2$ and $\delta^2$, are assumed to be zero-mean Gaussian random variables with the covariance matrix $\sigma_K^2 \Sigma_K$.

• The prior distribution for the noise variance $\sigma_K^2$ is the inverse gamma distribution with a shape parameter $\nu_0 \to 0$ and a scale parameter $\gamma_0 \to 0$ [3]. Finally, the prior distribution for the hyperparameter $\Lambda$ is the gamma distribution with a shape parameter $1/2 + \varepsilon_1$ and a rate parameter $\varepsilon_2$ ($\varepsilon_1, \varepsilon_2 \ll 1$), while the prior distribution for the hyperparameter $\delta^2$ is the inverse gamma distribution with a shape parameter $\alpha_{\delta^2} = 2$ and a scale parameter $\beta_{\delta^2} \to 0$.

The symbols N, G and IG will be used, hereinafter, to denote, respectively, the normal, gamma and inverse-gamma distributions. Given the above-defined distributions, conditional upon the hyperparameters $\Lambda$ and $\delta^2$, the echo magnitudes $r_K$ and the variance of the noise $\sigma_K^2$ can be integrated out from Eq. (3) because:

$$ r_K \sim \mathrm{N}\!\left(m_K,\; \sigma_K^2 M_K\right) \quad \text{and} \quad \sigma_K^2 \sim \mathrm{IG}\!\left(\frac{\nu_0 + N}{2},\; \frac{\gamma_0 + y^T P_K y}{2}\right) \qquad (6) $$

where $m_K = M_K X_K^T y$, $M_K = \left(\Sigma_K^{-1} + X_K^T X_K\right)^{-1}$, and $P_K = I - X_K M_K X_K^T$. Therefore, one can obtain the following conditional posterior distribution for the number of echoes $K$ and their positions $n_K$:

$$ p(K, n_K \mid \Lambda, \delta^2, y) \propto \left(\gamma_0 + y^T P_K y\right)^{-\frac{N + \nu_0}{2}} \frac{1}{K!} \left(\frac{\Lambda}{N\sqrt{\delta^2 + 1}}\right)^{K} \qquad (7) $$

The distribution given by Eq. (7) represents a mixture of distributions whose dimensionality is unknown, and, therefore, the sampling from that mixture cannot be done using a standard Metropolis-Hastings algorithm [4]; to overcome this difficulty, the inference is based on the reversible-jump Markov-chain Monte-Carlo technique [5].

4. Bayesian Computation
Following the methodology proposed by P. J. Green [5], the following elementary operations have been selected:

• the update of the parameters of all pulses when $K \neq 0$;
• the birth of a new pulse, i.e. the introduction of a new pulse with randomly chosen parameters (changing the dimension from $K$ to $K + 1$);
• the death of an existing pulse, i.e. the removal of a randomly chosen pulse (changing the dimension from $K + 1$ to $K$).

At each iteration, one of those operations is randomly selected with the probabilities: $b_K$ (for birth), $d_K$ (for death), and $u_K$ (for update); $b_K + d_K + u_K = 1$ for all $K = 0, \ldots, N$. After the execution of the selected operation, its outcomes are retained or not, according to an acceptance rule.

For $K = 0$, the death operation is impossible, i.e. $d_0 = 0$, while for $K = N$ the birth operation is impossible, i.e. $b_N = 0$; otherwise $b_K = c \min\{1, r_{\mathrm{move}}\}$ and $d_{K+1} = c \min\{1, r_{\mathrm{move}}^{-1}\}$, where $r_{\mathrm{move}} = p(K + 1 \mid \Lambda) / p(K \mid \Lambda)$, $p(K \mid \Lambda)$ is the prior probability of the dimension $K$, and $c$ is an empirically adjusted constant controlling the proportion of the numbers of operations. As pointed out in [5], such specification of the birth and death probabilities ensures that $b_K\, p(K \mid \Lambda) = d_{K+1}\, p(K + 1 \mid \Lambda)$.

Update operation
This operation comprises an update of the echo positions and of the nuisance parameters. The positions are sampled one-at-a-time using a mixture of Metropolis-Hastings steps [4] with the target distribution:

$$ p(n_{k,K} \mid \Lambda, \delta^2, n_{-k,K}, y) \propto \left(\gamma_0 + y^T P_K y\right)^{-\frac{N + \nu_0}{2}} \qquad (8) $$

where $n_{-k,K} = [n_{1,K} \cdots n_{k-1,K}\; n_{k+1,K} \cdots n_{K,K}]^T$, and the proposal distribution:

$$ q(n^{*}_{k,K} \mid n_{k,K}) \sim \mathrm{N}\!\left(n_{k,K},\; \sigma_n^2\right) \qquad (9) $$

i.e. the normal distribution with mean $n_{k,K}$ and with arbitrarily chosen standard deviation $\sigma_n$. For this step, the acceptance probability is:

$$ \alpha_{\mathrm{update}} = \min\!\left\{1,\; \left(\frac{\gamma_0 + y^T P_K y}{\gamma_0 + y^T P^{*}_K y}\right)^{\frac{N + \nu_0}{2}} \frac{q(n_{k,K} \mid n^{*}_{k,K})}{q(n^{*}_{k,K} \mid n_{k,K})}\right\} \qquad (10) $$

with $P^{*}_K$, $M^{*}_K$ and $\Sigma^{*}_K$ similar to $P_K$, $M_K$ and $\Sigma_K$, but with $n_K$ replaced by $n^{*}_K = [n_{1,K} \cdots n_{k-1,K}\; n^{*}_{k,K}\; n_{k+1,K} \cdots n_{K,K}]^T$.

The distributions used for updating the nuisance parameters ($\sigma_K^2$ and $r_K$) and the hyperparameters ($\delta^2$ and $\Lambda$) are the so-called full conditional distributions [6]. The distributions for the nuisance parameters are:

$$ p(r_K \mid \delta^2, K, n_K, \sigma_K^2, y) \sim \mathrm{N}\!\left(m_K,\; \sigma_K^2 M_K\right) \qquad (11) $$

$$ p(\sigma_K^2 \mid \delta^2, K, n_K, r_K, y) \sim \mathrm{IG}\!\left(\frac{\nu_0 + N}{2},\; \frac{\gamma_0 + y^T P_K y}{2}\right) \qquad (12) $$

while the distributions for the hyperparameters are:

$$ p(\delta^2 \mid K, n_K, r_K, \sigma_K^2, y) \sim \mathrm{IG}\!\left(\alpha_{\delta^2} + \frac{K}{2},\; \frac{r_K^T X_K^T X_K r_K}{2\sigma_K^2} + \beta_{\delta^2}\right) \qquad (13) $$

$$ p(\Lambda \mid K) \propto \frac{\Lambda^K / K!}{\sum_{n=0}^{N} \Lambda^n / n!}\; \mathrm{G}\!\left(1/2 + \varepsilon_1,\; \varepsilon_2\right) \qquad (14) $$

Drawing samples from the distributions described by Eqs. (11)-(13) is done using built-in procedures of the MATLAB Statistics Toolbox; sampling of the hyperparameter $\Lambda$ (cf. Eq. (14)) is done according to a Metropolis-Hastings step with the proposal distribution $\mathrm{G}(1/2 + K + \varepsilon_1,\; 1 + \varepsilon_2)$.

Birth and death operations
The birth operation consists in proposing a new echo position $n^{*}$ and accepting the proposition with the probability $\alpha_{\mathrm{birth}}$ (cf. Eq. (15)). Upon acceptance, a new state of the Markov chain becomes $(\Lambda, \delta^2, K + 1, n_{K+1}, r_{K+1}, \sigma_{K+1}^2)$ with new values of the nuisance parameters and hyperparameters drawn from the distributions described by Eqs. (11)-(14); otherwise, the chain remains at the previous state. The death operation consists in randomly choosing one of the existing $K + 1$ echoes for removal, and accepting the proposition with the probability $\alpha_{\mathrm{death}}$ (cf. Eq. (15)). Upon acceptance, a new state of the Markov chain becomes $(\Lambda, \delta^2, K, n_K, r_K, \sigma_K^2)$ with new values of the nuisance parameters and of the hyperparameters, drawn from the distributions described by Eqs. (11)-(14); otherwise, the chain remains at the previous state. The acceptance probabilities for the birth and death operations are:

$$ \alpha_{\mathrm{birth}} = \min\{1, r_{\mathrm{birth}}\} \quad \text{and} \quad \alpha_{\mathrm{death}} = \min\{1, r_{\mathrm{birth}}^{-1}\} \qquad (15) $$

with the acceptance ratio [2, 5]:

$$ r_{\mathrm{birth}} = \left(\frac{\gamma_0 + y^T P_K y}{\gamma_0 + y^T P_{K+1} y}\right)^{\frac{N + \nu_0}{2}} \frac{\Lambda}{(K + 1)\sqrt{1 + \delta^2}} \qquad (16) $$

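To make the preceding recipe more concrete, the sketch below shows, in Python rather than the authors' MATLAB environment, how the collapsed posterior of Eq. (7) and the one-at-a-time position update of Eqs. (8)-(10) might be evaluated for a fixed number of echoes. The pulse template, the signal length, the starting positions and all hyperparameter values are illustrative assumptions, not values taken from the paper, and the birth/death moves are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- illustrative signal model, Eq. (1); pulse shape and sizes are assumptions ---
N = 512
pulse = np.exp(-0.5 * (np.arange(-8, 9) / 2.0) ** 2)   # assumed echo template

def design_matrix(n_pos):
    """X_K: one column per echo, the template shifted to position n_k."""
    X = np.zeros((N, len(n_pos)))
    for k, n in enumerate(n_pos):
        idx = n + np.arange(-8, 9)
        ok = (idx >= 0) & (idx < N)
        X[idx[ok], k] = pulse[ok]
    return X

def log_post_K_n(y, n_pos, delta2, Lambda, nu0=0.0, gamma0=0.0):
    """Collapsed log-posterior of Eq. (7), up to an additive constant."""
    K = len(n_pos)
    X = design_matrix(n_pos)
    Sigma_inv = X.T @ X / delta2                   # Sigma_K^{-1} = delta^{-2} X^T X
    M = np.linalg.inv(Sigma_inv + X.T @ X)         # M_K
    b = X.T @ y
    quad = y @ y - b @ (M @ b)                     # y^T P_K y without forming P_K
    lp = -0.5 * (N + nu0) * np.log(gamma0 + quad)
    lp += K * np.log(Lambda / (N * np.sqrt(delta2 + 1.0)))
    lp -= sum(np.log(np.arange(1, K + 1)))         # -log K!
    return lp

# --- simulate data and run a plain Metropolis-Hastings update of the positions ---
true_pos, true_r, sigma = np.array([120, 300]), np.array([1.0, 0.6]), 0.1
y = design_matrix(true_pos) @ true_r + sigma * rng.normal(size=N)

n_pos, delta2, Lambda, sigma_n = np.array([110, 305]), 50.0, 2.0, 5.0
for it in range(2000):
    k = rng.integers(len(n_pos))                   # positions updated one at a time
    prop = n_pos.copy()
    prop[k] = int(round(rng.normal(n_pos[k], sigma_n)))   # proposal of Eq. (9)
    if 0 <= prop[k] < N:
        # symmetric proposal, so the q-ratio of Eq. (10) cancels here
        log_a = log_post_K_n(y, prop, delta2, Lambda) - log_post_K_n(y, n_pos, delta2, Lambda)
        if np.log(rng.uniform()) < log_a:
            n_pos = prop
print("estimated echo positions:", np.sort(n_pos))
```

With the pulse template known, the sketch recovers the two assumed echo positions after a few hundred iterations, which is the same behaviour reported below for the simplified, fixed-dimension localization procedure.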
5. Results and Discussion
In Fig. 1, an exemplary real-world data sequence, obtained by means of the impulse-radar sensor described in [1], is presented. Before its preprocessing, the clutter has been removed and the sequence has been filtered [1]. The sequence has been processed with the algorithm described in Section 4 with the number of iterations set to $3 \times 10^6$. In Fig. 2 the samples of the model order $K$ and the estimate of the posterior distribution $p(K \mid y)$, calculated on the basis of the samples obtained after the burn-in period of $1.2 \times 10^6$ iterations, are presented. It may be observed that in the considered experiment the estimated maximum a posteriori (MAP) number of echoes has assumed the value $K_{\mathrm{MAP}} = 164$. In this case, one cannot determine the exact number of echoes, because the body of a monitored person and the objects present in the observed area have complex structures and reflect multiple echoes. Moreover, it has to be noted that the noise-like shape of the emitted pulse may provoke the overfitting of the data.


Fig. 1. The real-world data sequence and the echo template used in the experiment.

Fig. 2. The number of echoes K during the preprocessing of the data sequence (left) and the estimate of the posterior distribution p  K | y  , calculated excluding the burn-in samples (right).

In Fig. 3, the result of the Bayesian approach to estimation of the impulse-radar signal parameters, i.e. the reconstructed signal based on the MAP values of the estimates of the posterior distributions, is presented. It can be noticed that the reconstructed sequence closely resembles the initial data sequence (cf. Fig. 1); however, it has to be stressed that, because the posterior distribution is invariant with respect to permutations in the labeling of the components of $n_K$ and $r_K$, the so-called label switching problem arises, making the interpretation of the results more difficult [3, 7]. In the experiments reported in this paper, the following relabeling approach has been adopted: on having $R$ realizations of the

Fig. 3. Sequence reconstructed on the basis of the MAP values of the posterior distributions.


vector of position estimates $n_{K_{\mathrm{MAP}}}$, the following sequence of operations has been performed for realizations $r = 2, \ldots, R$:
• the determination of the histogram of values of each component of the vector $n_{K_{\mathrm{MAP}}}$, on the basis of the preceding $r - 1$ realizations of this vector;
• the relabeling (rearranging) of the components of the realization $r$ to match the modes of the determined histograms;
• the assignment of the unmatched labels to the components whose values in the realization $r$ appeared most frequently in the histograms corresponding to those unmatched labels.
Although computationally expensive, this approach has yielded satisfactory results, unlike the simple sorting of the position estimates used, e.g., in [2]. When the person is to be localized, it is important to note that it is not necessary to get information on all of the echoes reflected from all the objects present in the observed area; in fact, only the strongest echo is sufficient to properly localize the person. The localization procedure can be significantly simplified by limiting and fixing the number of echoes being selected. Then, during the computation, the birth and death operations are omitted: in each iteration only the update operation is performed according to Eqs. (8)-(13). (As the number of echoes is fixed, the hyperparameter $\Lambda$ (Eq. (14)) is not updated.) In Fig. 4, the first 1000 samples (out of 10000) of the echo position and magnitude are shown, while the fitted echo is shown in Fig. 5.

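A compact sketch of the histogram-based relabeling described above is given below. The array names and the bin count are hypothetical, and the matching of components to histogram modes is done here with the Hungarian algorithm (scipy's linear_sum_assignment) as a stand-in for the authors' frequency-based assignment of unmatched labels, so this is an approximation of the procedure rather than the authors' code.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def relabel(samples, n_bins=100):
    """samples: (R, K) array of position estimates; returns a relabeled copy.

    For each realization r >= 2, the components are permuted so that they best
    match the modes of the histograms built from the preceding realizations.
    """
    samples = np.asarray(samples, dtype=float)
    R, K = samples.shape
    out = samples.copy()
    edges = np.linspace(samples.min(), samples.max(), n_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    for r in range(1, R):
        # histogram mode of each component over the preceding r realizations
        modes = np.empty(K)
        for k in range(K):
            counts, _ = np.histogram(out[:r, k], bins=edges)
            modes[k] = centers[np.argmax(counts)]
        # permute the components of realization r to match the modes
        cost = np.abs(out[r][None, :] - modes[:, None])   # cost[label, component]
        _, comps = linear_sum_assignment(cost)
        out[r] = out[r, comps]
    return out
```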
Fig. 4. Samples drawn from the distribution of the echoes positions (left) and magnitudes (right).

Fig. 5. Reconstructed data sequence consisting of the single strongest echo.


It may be noticed that, after the simplification of the localization procedure, the Markov chain reaches the high-probability region of the distribution significantly faster (after ca. 400 iterations); moreover, the identified echo fits the signal well and correctly indicates the position of the monitored person.

6. Conclusions Initial results have shown that the presented approach can be effectively used for estimation of the parameters of the echoes, but the computational complexity of this operation is high — mainly due to the unknown number of echoes and the length of the processed data sequences. The complexity (and, consequently, the time of data processing) can be considerably reduced by fixing the number of echoes searched for: to properly localize a person one needs only the position of the strongest echo. Further work will be focused on the assessment of the uncertainty of estimation of movement trajectories of a monitored person.

Acknowledgments This work has been financially supported by the EEA Grants — Norway Grants financing the project PL12-0001, and by the Institute of Radioelectronics and Multimedia Technology, Faculty of Electronics and Information Technology, Warsaw University of Technology.

References
1. J. Wagner, P. Mazurek, A. Miękina, R. Z. Morawski, F. F. Jacobsen, T. Therkildsen Sudmann, I. Træland Børsheim, K. Øvsthus and T. Ciamulski, Comparison of two techniques for monitoring of human movements, Measurement, 111 (2017).
2. C. Andrieu, É. Barat and A. Doucet, Bayesian Deconvolution of Noisy Filtered Point Processes, IEEE T Signal Proces, 49, 1 (2001).
3. S. Richardson and P. J. Green, On Bayesian Analysis of Mixtures with an Unknown Number of Components, J R Stat Soc B, 59, 4 (1997).
4. W. K. Hastings, Monte Carlo sampling methods using Markov chains and their applications, Biometrika, 57, 1 (1970).
5. P. J. Green, Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination, Biometrika, 82, 4 (1995).
6. W. R. Gilks, S. Richardson and D. J. Spiegelhalter, Markov Chain Monte Carlo in Practice (Chapman & Hall, London, 1996).
7. M. Stephens, Dealing with label switching in mixture models, J R Stat Soc B, 62, 4 (2000).


Challenging calculations in practical, traceable contact thermometry J. V. Pearce* and R. L. Rusby National Physical Laboratory, Hampton Road, Teddington, TW11 0LW, UK *E-mail: [email protected] Almost every technological process depends in some way on temperature measurement and control; for example, reliable electricity generation, intercontinental flights, and food processing. They all depend on a sophisticated measurement infrastructure that allows temperature measurements to be traced back to the SI unit of temperature, the kelvin, via the International Temperature Scale of 1990 (ITS-90). As temperature cannot be measured directly, practical thermometers measure some temperature-dependent property such as electrical resistance or a thermoelectric voltage, and must be calibrated. Both the thermometers and the calibration artefacts exhibit surprisingly rich physics, which is, in many cases, at the limit of current knowledge and capabilities. We discuss four examples: calculation of phase diagrams of binary alloys in the limit of low solute concentration to quantify the effect of impurities in temperature fixed points; calculation of the effect of impurities and crystal defects on the resistivity of platinum wires of resistance thermometers; calculation of Seebeck coefficients of metals to improve characterization of thermocouple behaviour; calculation of the vapour pressure of noble metals and their oxides to improve characterization of thermocouple calibration drift. This paper discusses the state of the art in these topics, as well as their background, how they relate to real-world problems, and how the scientific community may be able to help. Keywords: Contact thermometry, thermocouples, resistance thermometers, ITS-90, Seebeck coefficient, traceability, liquidus slopes.

1. Introduction Almost every technological process depends in some way on temperature measurement and control; for example, reliable electricity generation, intercontinental flights, and food processing1. All depend on a sophisticated measurement infrastructure that allows temperature measurements to be traced back to the SI unit of temperature, the kelvin, via the International Temperature Scale of 1990 (ITS-90). As temperature cannot be measured directly, practical thermometers measure some temperature-dependent property such as electrical resistance or a thermoelectric voltage. A practical temperature scale has two components: defined temperature values associated with a set of highly reproducible ‘fixed points’, which are states


of matter such as phase transitions (freezing, melting or triple points of pure substances), and specified interpolating instruments with defined interpolating or extrapolating equations. Both the thermometers and the fixed points exhibit surprisingly rich physics, which is, in many cases, at the limit of current knowledge and capabilities. Examples include the calculation of phase diagrams of binary alloys in the limit of low solute concentration to quantify the effect of impurities in temperature fixed points; calculation of the effect of impurities and crystal defects on the resistivity of platinum wires of resistance thermometers; calculation of Seebeck coefficients of metals to improve characterization of thermocouple behaviour; calculation of the vapour pressure of noble metals and their oxides to improve characterization of thermocouple calibration drift. This paper outlines the state of the art in each of these topics, and offers some suggestions for future work which would benefit from the attention of practitioners in the field of mathematical and computational tools in metrology. 2. Binary alloy phase diagrams The ITS-90 relies on a set of defined temperature values, determined a-priori2, which are associated with a set of highly reproducible states of matter like phase transitions (freezing, melting or triple points of pure substances)3. An important limitation of this scheme is the influence of impurities on the melting and freezing of metal fixed points4 such as Ga, In, Sn, Zn, Al, and Ag: as a rule of thumb, one part per million of impurity corresponds to about 1 mK elevation or depression of the freezing temperature. To establish the effect of a given population of impurities dissolved in an ITS-90 metal on the freezing/melting temperature, and hence the likely error in thermometer calibration5, it is of great interest to determine the liquidus slope (rate of change of freezing/melting temperature with impurity concentration in the limit of low impurity concentration) for all likely impurities in the ITS-90 metals. To address this, an extensive set of thermodynamic calculations have been performed using the MTDATA software6, which, together with an extensive literature survey and other measurements, has resulted in the most comprehensive catalogue of liquidus slopes in the limit of low impurity concentration to date5,7. This software works by minimizing the Gibbs energy of a chemical system with respect to the proportions of individual species that could possibly form. This allows the calculation of the equilibrium state at a fixed temperature and pressure, as well as the overall composition. The most stable state is the one with the lowest Gibbs energy. A key requirement is the ability to handle impurities. This means the incorporation of additional terms representing them, and the bulk materials


must be included in the model of the free energy. The calculated liquidus slopes are estimated to have an uncertainty of about 30 %. The key database used is Version 5.0 of the SGTE solutions database (SGSOL). There are 78 elements included in this database5. There are about 350 binary alloy systems which have been assessed over a wide range of temperature and pressure. In terms of ITS-90 fixed points, the relevant solvents that have been fully assessed amount to: Hg(4), Ga (12), In (15), Sn (18), Zn (16), Al (37), Ag (24). Additional databases which are relevant include MTAL (allowing calculations in the 7 component system comprising Al, Fe, Mg, Mn, Si, Cu, Zn); MTSOLDERS (containing critically assessed data for Ag, Al, Au, Bi, Cu, Ge, In, Pb, Sb, Si, Sn, Zn and nearly all binary systems); SOLDERS (containing critically assessed data from the European project COST5318 for more than 50 binary systems involving Ag, Au, Bi, Cu, In, Ni, Pb, Pd, Sb, Sn, Zn). To include the effect of oxygen, the MTOX database is required; this has been critically assessed and can handle binary and ternary oxides involving 38 elements.
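As a toy illustration of how such liquidus slopes are used in practice, the freezing-point shift of a fixed point can be estimated, to first order, by summing the contributions of the individual impurities (the rule of thumb that one part per million corresponds to roughly 1 mK). The sketch below is not MTDATA; the slopes and impurity concentrations are invented placeholders, not assessed values from the SGSOL database.

```python
# First-order estimate of the freezing-point shift of an ITS-90 fixed point
# caused by dissolved impurities: dT = sum_i (liquidus slope_i) * (mole fraction_i).
# Slopes and impurity levels below are placeholders, not assessed values.

liquidus_slope_mK_per_ppm = {   # hypothetical slopes, in mK per ppm (mole) of impurity
    "Fe": -1.4,
    "Cu": -0.6,
    "Si": -1.0,
}
impurity_ppm = {"Fe": 0.3, "Cu": 0.8, "Si": 0.1}   # assumed impurity content of the ingot

delta_T_mK = sum(liquidus_slope_mK_per_ppm[el] * impurity_ppm[el] for el in impurity_ppm)
print(f"estimated freezing-point shift: {delta_T_mK:.2f} mK")
```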

Fig. 1. Liquidus slope, as a function of impurity atomic number, in the limit of low impurity concentration where Al is the solvent. From Ref. 5.

Figure 1 shows all of the known liquidus slopes — in the limit of low impurity concentration — as a function of impurity atomic number for the Al solvent, taken from a wide range of sources5. MTDATA calculations are over plotted to show the level of agreement, and the extent to which the database is populated. There remain significant gaps in the databases which need to be filled. 3. Matthiessen’s rule The most important defined interpolating instrument of the ITS-90 is the Standard Platinum Resistance Thermometer (SPRT). This remarkable device has a reproducibility of the order of millikelvins over a temperature range which can span from a few kelvin up to around 962 °C. In an ideal metal conductor, electrical


resistance is caused by the scattering of electrons by thermally agitated atoms, which leads to an electrical resistance increasing with temperature, and is the basis of metallic resistance thermometry9,10. It follows that anything which influences the resistivity of platinum has a potentially important effect on the performance of the thermometer. Impurities and crystal defects have significant effects on the resistivity of the platinum wires in SPRTs used to interpolate temperature values between the fixed points. They give rise to differences between the characteristics of SPRTs which are not completely accommodated in the functions used for the interpolations. Although there seem to be no strong local resistance anomalies in the platinum (as a function of temperature), these differences are such as to cause significant ‘non-uniqueness’ in the ITS-90 between different SPRTs, and are appreciable components of the uncertainty in SPRT calibrations. The total resistivity of a crystalline metal is assumed to be the sum of the ideal resistivity of the pure metal and the resistivity due to the presence of impurities and structural imperfections in the crystal. Thus, scattering of electrons by phonons or other electrons, which is intrinsic to the platinum and temperature dependent, acts independently of scattering by impurities and defects, which is sample-dependent. In the approximation of Matthiessen’s rule11, scattering by impurities and defects is assumed to be independent of temperature, so the added resistivity is a constant. Unfortunately Matthiessen’s rule is not obeyed well enough for the present purpose, and the differences between individual SPRTs and the ITS-90 reference function over wide ranges require three or more parameters to be determined. Indeed, at low temperatures, below about 30 K, the ideal resistivity of platinum is small, and the impurity resistivity is both a significant fraction of the total and strongly temperature-dependent, probably as a result of coupling between the electrons and the magnetic moments of impurities such as iron or manganese. At high temperatures contamination by impurities can lead to irreversible changes in the resistance-temperature dependence of SPRTs, and these changes are the main cause of long-term drift. Annealing of SPRTs plays an important role in controlling the influence of these phenomena, but the procedures are somewhat empirical and there is an urgent need to better understand the relationship between lattice defects, impurities, and resistivity as a function of temperature. 4. Seebeck coefficient Thermocouples are the most widely used type of temperature sensor in industry. While not as reproducible as SPRTs (hundreds of millikelvins compared with a


few millikelvins), they are far more robust and can be used to much higher temperatures. They consist of two thermoelements (wires) connected at one end to form the measurement junction. An electromotive force (emf) is generated across the two thermoelements in response to a temperature gradient along them. The rate of change of emf with temperature for a pair of wires is given by the combined Seebeck coefficient. Despite their simplicity, thermocouples exhibit some very rich physics. Currently, thermocouples must be calibrated with respect to known temperatures, and the temperature is related to the emf by an empirical model parameterized by the calibration. By using an ab initio model of the temperature dependence of the thermopower, the thermocouple could in principle be used as a primary thermometer (i.e. not needing a calibration). Even a modestly accurate model would be useful in developing new types and evaluating the effect of changing compositions (due to e.g. high temperature or ionizing radiation). A description of calculation techniques is beyond the scope of this document, but a key component in characterizing thermoelectric effects is the electronic density of states (DoS). Conventional analytical expressions for the DoS are inadequate for accurate models. More advanced techniques such as plane-wave density functional theory (DFT) codes are needed. The open-source code ABINIT12 was recently employed13 to calculate the DoS for the common thermocouple metals Pt, Au and Pd, and incorporated in a simple model to calculate the thermopower. Figure 2 shows the calculated DoS for the three metals, grouped by relevance to thermocouple pairs, and the resulting thermopower. The predictions were in good qualitative agreement with the measurements, and describe well certain key features, e.g. pronounced changes in the rate of change of thermopower as a function of temperature, but are not in close enough agreement to be used on a quantitative basis. It seems likely that the calculation of the DoS still makes too many assumptions, e.g. insufficient evaluation of electron orbital shielding effects and the assumption of single crystal behaviour. Furthermore, most thermocouple materials of interest are alloys, which introduces another set of complications. Nonetheless further development of electronic DoS calculation techniques is of great interest and has a clear practical application. Specific materials that need to be studied for contemporary thermocouples are Pt, Rh, Pt-Rh alloys (Pt-6%Rh, Pt-10%Rh, Pt-13%Rh, Pt-30%Rh, Pt-40%Rh), Pd, Au, and the base metal thermocouple alloys 90%Ni/10%Cr, 95%Ni/2%Al/2%Mn/1%Si (Type K thermocouple) and 84.1%Ni/14.4%Cr/1.4%Si/0.1%Mg, 95.5%Ni/4.4%Si/0.1%Mg (Type N thermocouple); all fractions are in terms of weight.
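The link between the Seebeck coefficient and the quantity a thermocouple actually delivers can be made concrete with a small numerical sketch: the emf is the integral of the combined Seebeck coefficient over temperature. The quadratic S_AB(T) used below is an invented placeholder, not the ABINIT-derived thermopower of any real wire pair, so the numbers only illustrate the calculation, not a calibration.

```python
import numpy as np

# emf of a thermocouple as the integral of the combined Seebeck coefficient:
# E(T_meas) = integral from T_ref to T_meas of S_AB(T) dT.

def seebeck_AB(t_celsius):
    """Combined Seebeck coefficient of a hypothetical wire pair, in microvolt/K."""
    return 6.0 + 0.02 * t_celsius - 5e-6 * t_celsius**2   # made-up placeholder

def emf_microvolt(t_meas, t_ref=0.0, n=2000):
    T = np.linspace(t_ref, t_meas, n)
    S = seebeck_AB(T)
    return float(np.sum(0.5 * (S[1:] + S[:-1]) * np.diff(T)))   # trapezoidal rule

for t in (100.0, 500.0, 1000.0):
    print(f"T = {t:6.1f} degC  ->  E ~ {emf_microvolt(t) / 1000:.3f} mV")
```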


Fig. 2. Left: Electronic density of states as a function of energy for Pt and Pd. Right: Calculated thermopower as a function of temperature, compared with the measured values. From Ref. 13.

5. Vapour pressure of noble metals and their oxides Barring the presence of significant amounts of impurities, an important cause of thermoelectric inhomogeneity and hence calibration drift of platinum-rhodium thermocouples at high temperatures is the transport of the oxides of Pt and Rh, which causes local changes in the wire composition. To accurately model the effect of this vapour transport, it is necessary to have a good knowledge of the vapour pressure of Pt, Rh, and more importantly their oxides, namely PtO2 and RhO2, whose vapour pressures are at least an order of magnitude higher. Determination of the vapour pressures of noble metals and their oxides is extremely difficult, and to the authors’ knowledge, the only documented determination of these vapour pressures over the temperature range applicable to Pt-Rh thermocouples is that of Alcock and Hooper14 which describes an experimental determination and provides an expression for the vapour pressures as a function of temperature (Figure 3). A model has recently been developed that made use of this data to identify the optimum Pt-Rh thermocouple which, at a given temperature, has the ratio of oxide vapour pressures exactly equal to the ratio of molar amount of species in the wire15; in other words, the Pt-Rh composition of the wire for which the ratio of Pt and Rh atoms leaving the surface is exactly the same as the ratio of Pt and Rh in the vapour phase — so that evaporation from the wire causes no change in the local composition. The optimal composition as a function of temperature determined in this way is shown in Figure 3; the uncertainty, in terms of Rh


wt%, amounts to about 20%, which is somewhat large in terms of the resulting drift rates. The availability of lower uncertainty determinations of the vapour pressures would clearly benefit the study. A further limitation of this approach is the reliance on one data set (performed many years ago). Clearly more data would be extremely valuable. Since the vapour pressures are so difficult to measure, the possibility of performing accurate calculations of the vapour pressures is intriguing, and would be invaluable in furthering our understanding of thermocouple stability and drift mechanisms.

Fig. 3. Left: Vapour pressure of Pt and Rh oxide as a function of temperature. Dashed lines show the uncertainty (coverage level of 95%). Right: Optimal Rh content of the wire as a function of temperature. Dashed lines show the uncertainty (coverage level of 95%). From Ref. 15.

Conclusions
Contemporary developments on contact thermometry have been presented which are aimed at improving the realization and dissemination of the ITS-90. There are some key outstanding computational needs:
• More extensive calculations of the liquidus slopes of binary metal alloys involving one of the ITS-90 metals, in the limit of low concentration.
• A better understanding of the effect of impurities and crystal defects on the resistivity of platinum, to characterize their effects on SPRTs.
• Improved calculations of the electronic density of states of metals commonly used for thermocouples, e.g. Pt, Rh, Pt-Rh alloys, Pd, Au, and nickel alloys.
• Calculation of the vapour pressure of Pt and Rh and their oxides at temperatures up to about 1800 °C to complement the limited set of experimental results and to enable realistic calculations of drift effects of Pt-Rh thermocouples.


Acknowledgments
Some of this work was funded by the European Metrology Programme for Innovation and Research (EMPIR) industry project ‘EMPRESS’ (Enhancing process efficiency through improved temperature measurement). The EMPIR initiative is co-funded by the European Union’s Horizon 2020 research and innovation programme and the EMPIR Participating States.

References
1. J.V. Pearce, Extra points for thermometry, Nature Physics 15, 104 (2017)
2. M.R. Moldover, W.L. Tew, H.W. Yoon, Advances in thermometry, Nature Physics 12, 7-11 (2016)
3. B. Fellmuth et al., The kelvin redefinition and its mise en pratique, Phil. Trans. R. Soc. A 374, 20150037 (2016)
4. B. Fellmuth et al., Guide to the Realization of the ITS-90: Fixed Points: Influence of Impurities: bipm.org/en/committees/cc/cct/guide-its90.html
5. J.V. Pearce, J.A. Gisby, P.P.M. Steur, Liquidus slopes for impurities in ITS-90 fixed points, Metrologia 53, 1101 (2016)
6. R.H. Davies et al., MTDATA – thermodynamic and phase equilibrium software from the National Physical Laboratory, CALPHAD 26, 229 (2002)
7. J.V. Pearce, Distribution coefficients of impurities in metals, Int. J. Thermophys. 35(3), 628-635 (2014)
8. A.T. Dinsdale et al., Atlas of phase diagrams for lead-free soldering, European Report COST531 Lead-Free Solders (volume 1) (2008)
9. R.J. Berry, Relationship between the real and ideal resistivity of platinum, Can. J. Phys. 41, 946-982 (1963)
10. A.I. Pokhodun et al., Guide to the Realization of the ITS-90: Platinum Resistance Thermometry: bipm.org/en/committees/cc/cct/guide-its90.html
11. J.S. Dugdale, The Electrical Properties of Metals and Alloys, Edward Arnold Limited, London (1977), ISBN 0713125233
12. X. Gonze et al., First-principles computation of materials properties: The ABINIT software project, Comput. Mater. Sci. 25, 478 (2002)
13. J.V. Pearce, Towards an ab-initio calculation of elemental thermocouple output, Int. J. Metrology and Quality Engineering 7, 202 (2016)
14. C.B. Alcock and G.W. Hooper, Proc. Roy. Soc. A, 254 (1279), 551 (1960)
15. J.V. Pearce, Optimising Pt-Rh thermocouple wire composition to minimize composition change due to evaporation of oxides, Johnson Matthey Technology Review 60(4), 238 (2016)


Wald optimal two-sample test for right-censored data* Petr Philonenko and Sergey Postovalov Theoretical & Applied Informatics Department, Novosibirsk State Technical University, Novosibirsk, Russia E-mail: [email protected], [email protected] www.en.nstu.ru The two-sample problem with censored data is considered in this paper. Using the maximin Wald model of game theory and applying the Monte Carlo method for the calculation of statistical test power, we compared the robustness of two-sample tests under various types of alternative hypotheses. Recommendations for the application of two-sample tests to particular types of alternative hypotheses are given. We also propose a new two-sample MIN3 test for randomly right-censored data that is optimal according to the Wald test for decision-making under risk and uncertainty. Keywords: Monte Carlo method; Survival analysis; Two-sample problem; Right-censored data; MIN3 test; Test power.

1. Introduction In hypothesis testing it often happens that one two-sample test is preferable under alternative hypothesis A and less preferable under alternative hypothesis B than another two-sample test. Hence, the most powerful two-sample test does not exist in the general case. However, an analysis of two-sample test power using the Wald test for decision-making under risk and uncertainty can determine which test is preferable when the alternative hypothesis is not known (“alternative uncertainty”). We have constructed 9 types of alternative hypotheses and compared 10 two-sample tests [1]. Using the test statistics of the Bagdonavičius-Nikulin tests for the multiple crossing [2] and monotonic ratio [3] models and of the weighted Kaplan-Meier two-sample test [4], we have proposed a new two-sample test, MIN3, that is optimal according to the Wald test for decision-making under risk and uncertainty. The test

* This work is supported by the Russian Ministry of Education and Science as a part of the state task (project 1.1009.2017/4.6).

power is simulated by the Monte Carlo method for various distributions of survival and censoring times and for censoring rates up to 50%.

2. Formulation of the Problem

Suppose that we have two samples, $X_1$ and $X_2$, of continuous random variables $\tau_1$ and $\tau_2$ with survival distributions $S_1(t)$ and $S_2(t)$. Any observation is $t_{ij} = \min(T_{ij}, C_{ij})$, where $T_{ij}$ and $C_{ij}$ are the failure and censoring times for the $j$-th object of the $i$-th group. $T_{ij}$ and $C_{ij}$ are i.i.d. with CDFs $F_i(t)$ and $F_i^{C}(t)$, respectively. The survival curve is the probability of survival in the time interval $(0, t)$,

$$ S_i(t) = P\{\tau_i > t\} = 1 - F_i(t), $$

and the null hypothesis is

$$ H_0 : S_1(t) = S_2(t), \quad \forall t \in \mathbb{R}. $$

3. Statistical Methods
3.1. Two-Sample Tests
In the paper we apply various two-sample tests for right-censored data: Gehan's generalized Wilcoxon test (denoted by G) [5], Peto and Peto's generalized Wilcoxon test (P) [6], the logrank test (LG) [7], the Cox-Mantel test (CM) [8], the Q-test (Q) [9], the maximum value test (MAX) [10], the Bagdonavičius-Nikulin tests for the generalized Cox [11], multiple [2] and single [3] crossing models (BN1, BN2, BN3, respectively), the weighted Kaplan-Meier test (WKM) [4] and the proposed MIN3 test. The MIN3 test statistic is

$$ \mathrm{MIN3} = \min\{ pv_{\mathrm{BN2}},\; pv_{\mathrm{BN3}},\; pv_{\mathrm{WKM}} \}, $$

where $pv_S$ is the p-value of the two-sample test with test statistic $S$.
3.2. The Wald Test for Decision-Making Under Risk and Uncertainty
Let us have $m$ alternative hypotheses and $k$ statistical tests. Which test is preferable to use? It is evident that the best way is to use the test with maximal power. However, one test can have maximal power against one alternative and small power against another. Since we do not know the true hypothesis, we can regard the true hypothesis as a state of the environment. Let $r$ be the strategy of applying the $r$-th statistical test under the alternative hypothesis $H$. Then, for every strategy $r$, a utility function $U(r \mid H)$ (abbreviated further as $U(r)$) can be defined that can be taken equal to the test power against


hypothesis $H$. There are a number of decision-making criteria under risk and uncertainty for choosing the best strategy. One of them is the Wald test [12]. It is the test of a “careful observer”: the Wald test maximizes the utility function under the assumption that the environment is unfavorable for the observer. The decision rule is as follows:

$$ W = \max_{i=1,\ldots,k} \; \min_{j=1,\ldots,m} U(r_i \mid H_j). \qquad (1) $$

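A minimal sketch of this maximin selection, together with the MIN3 statistic of Section 3.1, is given below. The power values in the matrix are made-up placeholders for illustration only; they are not the simulated powers reported in Tables 2-4.

```python
import numpy as np

# MIN3 statistic: the smallest of the three p-values (Section 3.1)
def min3_statistic(pv_bn2, pv_bn3, pv_wkm):
    return min(pv_bn2, pv_bn3, pv_wkm)

# Wald maximin rule of Eq. (1): pick the test whose worst-case power is largest.
tests = ["G", "LG", "BN2", "MIN3"]
# rows: tests, columns: alternative hypotheses; illustrative power values only
U = np.array([
    [0.32, 0.05, 0.06, 0.09],
    [0.18, 0.10, 0.06, 0.16],
    [0.16, 0.09, 0.12, 0.22],
    [0.19, 0.09, 0.12, 0.18],
])
wald_scores = U.min(axis=1)                 # worst-case power of each test
best = tests[int(np.argmax(wald_scores))]
print("W scores:", dict(zip(tests, wald_scores.round(2))), "-> Wald-optimal test:", best)
```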
The Wald test gives us a rule for choosing a robust test.

4. Simulation
In this section, we compare the Wald test scores (1) for the various types of alternative hypotheses represented in Table 1. There are types with 0, 1 and 2 points of intersection between the survival functions S1(t) and S2(t), and with differences or intersections in the early, middle or late time. Every type contains three alternative hypotheses with various distributions of the failure time F(t). So, we have 27 alternative hypotheses denoted by Hi (3 alternatives in each of the 9 types). The distance between the survival functions S1(t) and S2(t) is approximately 0.1 in the L1 metric. Using the Monte Carlo method we calculated the test power and computed the corresponding Wald test scores (1). The results of the simulation are presented in Tables 2-4. The number of replications for the Monte Carlo method is 150000. The probability of a type I error is 0.05. The sample sizes are n1 = n2 = 200. The distribution laws of the censoring times FC(t) are Weibull, Gamma and Exponential, with the censoring rate in the range from 10% to 50%. The tables show the minimal test power for every type of alternative hypothesis. In the case that the survival curves do not have an intersection, we have three types of alternative hypotheses. If the difference between the survival curves is in the early time, the most preferable test is MIN3. If the difference between the survival curves is in the middle time, the most preferable tests are Gehan's generalized Wilcoxon test and Peto and Peto's generalized Wilcoxon test. And if the difference between the survival curves is in the late time, the most preferable tests are the logrank test and the Cox-Mantel test. In the case that the survival curves have one intersection in the early time, we can use the logrank and Cox-Mantel tests, all Bagdonavičius-Nikulin tests BN1-BN3, and the MIN3 test. If the intersection is in the middle or late time, the most preferable test is the BN3 test.

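The Monte Carlo power estimate used here can be sketched as follows. This is a simplified illustration, not the authors' software: a single log-rank test from the lifelines package stands in for the full battery of tests, the failure and censoring distributions are assumed for the example rather than taken from Table 1, and the replication count is scaled down from the 150000 replications used in the paper.

```python
import numpy as np
from lifelines.statistics import logrank_test

rng = np.random.default_rng(1)

def simulate_power(n=200, n_rep=1000, alpha=0.05, censor_scale=3.0):
    """Monte Carlo estimate of the power of the log-rank test for one
    assumed pair of survival distributions under random right censoring."""
    rejections = 0
    for _ in range(n_rep):
        # assumed alternative: exponential vs. (scaled) Weibull failure times
        t1 = rng.exponential(1.0, n)
        t2 = 0.9 * rng.weibull(1.1, n)
        # independent exponential censoring times (controls the censoring rate)
        c1, c2 = rng.exponential(censor_scale, n), rng.exponential(censor_scale, n)
        obs1, obs2 = np.minimum(t1, c1), np.minimum(t2, c2)   # t_ij = min(T_ij, C_ij)
        d1, d2 = (t1 <= c1), (t2 <= c2)                       # event indicators
        res = logrank_test(obs1, obs2, event_observed_A=d1, event_observed_B=d2)
        rejections += (res.p_value < alpha)
    return rejections / n_rep

print("estimated power of the log-rank test:", simulate_power())
```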
Table 1. The types of alternative hypotheses.

Hi  | Type of alternative hypothesis                            | S1                     | S2                     | Intersections | L1(S1, S2)
H01 | Difference in the early time without an intersection      | Exp(0, 1)              | Exp(0.1, 1)            | –             | 0.099
H02 |                                                           | We(0, 1.1, 2.4)        | LgN(0, 0.370)          | –             | 0.096
H03 |                                                           | LgN(0.01, 0.913)       | Exp(0, 0.742)          | –             | 0.109
H04 | Difference in the middle time without an intersection     | G(0, 1.060, 1.160)     | Exp(0, 0.863)          | –             | 0.075
H05 |                                                           | Exp(0, 1.3)            | We(0, 0.9, 1.1)        | –             | 0.100
H06 |                                                           | Exp(0, 1)              | We(0.09, 1.1, 1.07)    | –             | 0.162
H07 | Difference in the late time without an intersection       | Exp(0, 1.3)            | G(0, 0.806, 1.064)     | –             | 0.089
H08 |                                                           | We(0.5, 1, 1.2)        | Exp(0.567, 1)          | –             | 0.116
H09 |                                                           | We(0.118, 1.1, 1.735)  | LgN(0.01, 0.6)         | –             | 0.107
H11 | One point of intersection in the early time               | Exp(0, 1)              | Exp(0.05, 1.159)       | 0.363         | 0.100
H12 |                                                           | G(0, 1.273, 1.475)     | G(0.159, 1.300, 1.273) | 0.763         | 0.078
H13 |                                                           | We(0.02, 1, 1.1)       | Exp(0, 0.909)          | 0.571         | 0.125
H14 | One point of intersection in the middle time              | We(0, 0.980, 0.905)    | G(0, 0.972, 0.974)     | 0.611         | 0.081
H15 |                                                           | Exp(0, 1)              | We(0.071, 0.906, 1.059)| 0.843         | 0.097
H16 |                                                           | G(0.01, 1, 1.15)       | Exp(0, 0.833)          | 1.040         | 0.068
H17 | One point of intersection in the late time                | We(0, 0.968, 1.214)    | Exp(0, 1.107)          | 1.346         | 0.099
H18 |                                                           | G(0, 1.1, 1.040)       | G(0, 0.9, 1.302)       | 1.878         | 0.132
H19 |                                                           | We(0.5, 1.1, 1.1)      | Exp(0.471, 1)          | 3.626         | 0.097
H21 | Two points of intersections in the early and middle time  | LgN(0, 0.948)          | We(0.173, 1.325, 0.911)| 0.243, 0.655  | 0.071
H22 |                                                           | Exp(0.5, 1.047)        | LgN(0.141, 0.596)      | 0.814, 1.038  | 0.079
H23 |                                                           | We(0.5, 1, 1.2)        | Exp(0.530, 1)          | 0.577, 1.327  | 0.105
H24 | Two points of intersections in the early and late time    | LgN(0, 0.916)          | G(0.01, 1.213, 1.192)  | 0.683, 3.074  | 0.095
H25 |                                                           | LgN(0, 0.817)          | Exp(0.185, 0.818)      | 0.232, 0.831  | 0.068
H26 |                                                           | We(0.01, 1.697, 1.846) | LgN(0.293, 0.569)      | 1.018, 2.381  | 0.131
H27 | Two points of intersections in the middle and late time   | We(0, 1.355, 1.018)    | LgN(0.000, 0.867)      | 1.254, 3.265  | 0.099
H28 |                                                           | G(0, 1.134, 1.231)     | LgN(0, 0.876)          | 0.793, 2.994  | 0.088
H29 |                                                           | Exp(0, 0.744)          | LgN(0, 0.866)          | 1.321, 3.463  | 0.104


Table 2. The minimal values of the test power on the alternative hypotheses with 10% and 20% censoring rates.

1 Intersection of survival functions S1(t) and S2(t) in

2 intersections of survival functions S1(t) and S2(t) in

early and late times

early and middle times

middle and late times

0,054

middle time

0,178

I late time

G P LG CM Q MAX BN1 BN2 BN3 WKM MIN3 W score

0,089

early time

0,315

middle time

G P LG CM Q MAX BN1 BN2 BN3 WKM MIN3 W score

late time

early time

Test

0,050

0,050

0,059

Minimum

Difference between S1(t) and S2(t) in

10% censoring rate 0,163

0,049

0,049

0,315

0,093

0,175

0,057

0,155

0,051

0,050

0,050

0,060

0,050

0,142

0,191

0,101

0,072

0,052

0,055

0,050

0,058

0,050

0,050

0,141

0,192

0,101

0,071

0,053

0,055

0,049

0,060

0,050

0,049

0,259

0,178

0,144

0,063

0,146

0,058

0,049

0,060

0,058

0,049

0,272

0,182

0,157

0,064

0,135

0,064

0,050

0,063

0,056

0,050

0,190

0,160

0,146

0,070

0,207

0,128

0,054

0,066

0,050

0,050

0,304

0,132

0,128

0,067

0,171

0,104

0,124

0,087

0,221

0,067

0,291

0,162

0,158

0,074

0,227

0,139

0,049

0,060

0,059

0,049

0,323

0,091

0,182

0,054

0,164

0,050

0,050

0,050

0,059

0,050

0,389

0,165

0,157

0,068

0,199

0,112

0,103

0,093

0,178

0,068

0,389

0,192

0,182

0,074

0,227

0,139

0,124

0,093

0,221

0,068

0,345

0,083

0,188

0,051

0,049

0,064

0,049

20% censoring rate 0,051

0,180

0,050

0,318

0,086

0,178

0,052

0,165

0,049

0,051

0,049

0,060

0,049

0,127

0,170

0,109

0,067

0,064

0,051

0,050

0,065

0,050

0,050

0,129

0,170

0,107

0,069

0,066

0,050

0,050

0,065

0,050

0,050

0,262

0,138

0,149

0,061

0,150

0,058

0,049

0,062

0,063

0,049

0,296

0,142

0,162

0,062

0,153

0,064

0,052

0,065

0,063

0,052

0,246

0,155

0,148

0,069

0,202

0,117

0,050

0,063

0,054

0,050

0,336

0,130

0,127

0,068

0,169

0,101

0,115

0,083

0,219

0,068

0,350

0,155

0,160

0,073

0,223

0,119

0,049

0,061

0,075

0,049

0,346

0,082

0,188

0,050

0,180

0,050

0,049

0,050

0,065

0,049

0,423

0,164

0,161

0,067

0,202

0,097

0,094

0,081

0,184

0,067

0,423

0,170

0,188

0,073

0,223

0,119

0,115

0,083

0,219

0,068

Table 3. The minimal values of the test power on the alternative hypotheses with 30% and 40% censoring rates.

1 Intersection of survival functions S1(t) and S2(t) in

2 intersections of survival functions S1(t) and S2(t) in

early and late times

early and middle times

middle and late times

0,190

0,049

0,200

0,050

0,051

0,051

0,075

0,049

0,346

0,082

0,181

0,051

0,171

0,050

0,050

0,049

0,062

0,049

0,139

0,125

0,115

0,063

0,083

0,049

0,050

0,066

0,050

0,049

0,137

0,126

0,118

0,066

0,083

0,049

0,050

0,067

0,050

0,049

0,278

0,106

0,147

0,057

0,160

0,058

0,048

0,062

0,065

0,048

0,346

0,106

0,170

0,060

0,170

0,058

0,054

0,063

0,073

0,054

0,320

0,145

0,148

0,069

0,201

0,098

0,051

0,062

0,070

0,051 0,069

middle time

middle time

0,077

late time

late time

0,402

early time

early time

Test

Minimum

Difference between S1(t) and S2(t) in

30% censoring rate G P LG CM Q MAX BN1 BN2 BN3 WKM MIN3 W score

0,370

0,123

0,127

0,069

0,171

0,089

0,109

0,085

0,221

0,421

0,131

0,160

0,074

0,218

0,100

0,052

0,062

0,103

0,052

0,391

0,075

0,194

0,050

0,199

0,051

0,048

0,052

0,078

0,048

0,468

0,127

0,168

0,067

0,207

0,085

0,088

0,074

0,188

0,067

0,468

0,145

0,194

0,074

0,218

0,100

0,109

0,085

0,221

0,069

40% censoring rate G P LG CM Q MAX BN1 BN2 BN3 WKM MIN3 W score

0,472

0,071

0,203

0,048

0,219

0,053

0,052

0,052

0,094

0,048

0,376

0,075

0,180

0,050

0,177

0,049

0,050

0,050

0,070

0,049

0,160

0,095

0,129

0,058

0,100

0,049

0,051

0,056

0,050

0,049

0,163

0,095

0,124

0,058

0,099

0,050

0,051

0,055

0,050

0,050

0,291

0,086

0,154

0,053

0,168

0,057

0,051

0,055

0,072

0,051

0,413

0,087

0,177

0,058

0,186

0,055

0,058

0,063

0,089

0,055

0,401

0,092

0,150

0,071

0,195

0,086

0,049

0,064

0,100

0,049

0,429

0,112

0,131

0,072

0,174

0,080

0,100

0,076

0,237

0,072

0,452

0,086

0,157

0,075

0,211

0,086

0,051

0,064

0,136

0,051

0,468

0,070

0,200

0,049

0,223

0,053

0,051

0,053

0,095

0,049

0,528

0,100

0,170

0,070

0,215

0,075

0,081

0,068

0,215

0,068

0,528

0,112

0,203

0,075

0,223

0,086

0,100

0,076

0,237

0,072

Table 4. The minimal values of the test power on the alternative hypotheses with 50% censoring rates.

early and late times

early and middle times

middle and late times

0,207

0,052

0,243

0,056

0,054

0,051

0,131

0,051

0,427

0,069

0,180

0,049

0,189

0,052

0,050

0,051

0,083

0,049

0,209

0,077

0,132

0,051

0,115

0,050

0,049

0,050

0,051

0,049

0,212

0,077

0,133

0,052

0,114

0,049

0,050

0,051

0,052

0,049

0,303

0,073

0,156

0,051

0,169

0,051

0,052

0,052

0,070

0,051

0,478

0,075

0,180

0,057

0,205

0,056

0,059

0,060

0,117

0,056 0,053

middle time

0,066

I late time

0,531

Test

early time

middle time

Minimum

2 intersections of survival functions S1(t) and S2(t) in

late time

1 Intersection of survival functions S1(t) and S2(t) in

early time

Difference between S1(t) and S2(t) in

50% censoring rate G P LG CM Q MAX BN1 BN2 BN3 WKM MIN3 W score

0,434

0,069

0,148

0,072

0,196

0,076

0,053

0,064

0,147

0,490

0,093

0,134

0,079

0,173

0,073

0,097

0,070

0,275

0,070

0,489

0,066

0,156

0,078

0,209

0,078

0,063

0,065

0,180

0,063

0,535

0,066

0,206

0,051

0,242

0,057

0,053

0,051

0,131

0,051

0,597

0,083

0,180

0,074

0,227

0,071

0,077

0,063

0,256

0,063

0,597

0,093

0,207

0,079

0,243

0,078

0,097

0,070

0,275

0,070

In the case that the survival curves have two intersections, the most preferable test is the BN2 test, and also the MIN3 test if the intersections are in the early and middle times. If we do not know the type of the alternative hypothesis (“alternative uncertainty”), the most preferable tests are the BN2 and MIN3 tests. However, BN2 is worse than the MIN3 test if the survival curves do not have an intersection. The censoring rate influences the power of the tests; however, the Wald-optimal tests remain the same, BN2 and MIN3. 5. Conclusions Thus, in this paper types of alternative hypotheses have been presented for 0, 1 and 2 points of intersection between the underlying survival functions. Every type contains three alternative hypotheses with various distributions of failure time. The simulation results have shown that the power of the proposed two-sample MIN3 test is the best according to the maximin Wald test for


decision-making under risk and uncertainty. Hence, the MIN3 test is a robust two-sample test under uncertainty about the alternative hypothesis. This makes it possible to recommend the MIN3 test for solving two-sample problems with right-censored data.

References

1. P. Philonenko, S. N. Postovalov, Test power in two-sample problem testing as the utility function in the theory of decision making under risk and uncertainty, in Actual problems of electronic instrument engineering (APEIE-2016): XIII International Scientific-Technical Conference, Novosibirsk, Russia, October 03-06, 2016, v. 1, p. 2, pp. 369-373, 2016.
2. V. B. Bagdonavičius, M. Nikulin, On goodness-of-fit tests for homogeneity and proportional hazards, Applied Stochastic Models in Business and Industry, vol. 22, no. 1, pp. 607-619, 2006.
3. V. B. Bagdonavičius, J. Kruopis, M. Nikulin, Nonparametric tests for censored data, John Wiley & Sons, Inc., New York, 2010, 233 pp.
4. M. S. Pepe and T. R. Fleming, Weighted Kaplan-Meier statistics: A class of distance tests for censored survival data, Biometrics 45, pp. 497-507, 1989.
5. E. A. Gehan, A generalized Wilcoxon test for comparing arbitrarily singly-censored samples, Biometrika, V. 52, N 1/2, pp. 203-223, 1965.
6. R. Peto, J. Peto, Asymptotically efficient rank invariant test procedures, J. Royal Statist. Soc. Ser. A (General), V. 135, N 2, pp. 185-207, 1972.
7. E. T. Lee, J. W. Wang, Statistical Methods for Survival Data Analysis, N.Y.: Wiley, 2003 (Wiley Series in Probability and Statistics).
8. N. Mantel, Evaluation of Survival Data and Two New Rank Order Statistics Arising in Its Consideration, Cancer Chemotherapy Reports, 50, pp. 163-170, 1966.
9. Ruvie Lou Maria C. Martinez, Joshua D. Naranjo, A pretest for choosing between logrank and Wilcoxon tests in the two-sample problem, International Journal of Statistics, vol. LXVIII, n. 2, pp. 111-125, 2010.
10. P. Philonenko, S. Postovalov, A new two-sample test for choosing between log-rank and Wilcoxon tests with right-censored data, Journal of Statistical Computation and Simulation, 85:14, pp. 2761-2770, 2015.
11. V. B. Bagdonavičius, R. J. Levuliene, M. S. Nikulin, et al., Tests for equality of survival distributions against non-location alternatives, Lifetime Data Analysis, vol. 10, no. 4, pp. 445-460, 2004.
12. A. Wald, Statistical decision functions which minimize the maximum risk, The Annals of Mathematics, 46(2), pp. 265-280, 1945.


Measurement A. Possolo Statistical Engineering Division, National Institute of Standards and Technology, Gaithersburg, MD 20899-8980, USA E-mail: [email protected]

In an article added recently to the Stanford Encyclopedia of Philosophy, Eran Tal notes that “there is little consensus among philosophers as to how to define measurement, what sorts of things are measurable, or which conditions make measurement possible” [1]. The meaning and scope of measurement have been discussed frequently within physical metrology, by David Hand [2], René Dybkaer [3], Rod White [4], Eran Tal [5], Luca Mari et al. [6], Giovanni Rossi [7], Eran Tal [8], and Luca Mari et al. [9], among others. The same issues have also been examined within the social sciences [10], in particular in education [11, 12], psychology [13–15], and sociology [16, 17]. Defining measurement involves identifying its essential characteristics, and specifying the traits that distinguish it from similar activities that ought not to be construed as measurement. Delineating its scope involves determining the classes of properties whose values are measurable. None of the logical, terminological, historical, or customary-use considerations that might give preeminence to a particular definition of measurement seem to be decisive, for otherwise a consensus about it would have formed already. For example, a recent survey of its membership, conducted by the ISO/REMCO committee on reference materials, revealed that the metrological community represented in this committee is just about evenly split on whether the assignment of value to qualitative (nominal, or categorical) properties should, or should not be called “measurement.” Defining the meaning of measurement, and circumscribing its scope, therefore involve arbitrary choices, which concern matters of taste, matters of precedence, and matters of convenience. This contribution discusses these choices, and advocates for an inclusive and broad understanding of measurement. Keywords: measurement; uncertainty; property; quantitative; qualitative.



1. Introduction “The second type of measurement, by latitude of forms, describes processes in which accidental forms or qualities are intensified or diminished in terms of the distribution of natural qualities such as heat or whiteness or moral qualities such as love, grace, sin, will, or desire.” — Jung (2016, Richard Kilvington) [18] This contribution focuses on measurement as studied and practiced in national metrology institutes (NMIs) worldwide, and in other institutions in the public and private sectors that are similarly concerned with measurement. The need for measurement arises in physical science and technology, including medicine, engineering, manufacturing, agriculture, and commerce, among others, and motivates the research and measurement services provided by metrology institutes. The National Institute of Standards and Technology (NIST), an agency of the U.S. Department of Commerce, is the NMI of the United States of America. The specific issues that surround measurement in the social sciences, including in economics (utility, or risk), education (efficiency of teaching methods, or reading level) and in psychology (intelligence, or depression), will not be discussed here, because metrology institutes generally are not engaged in these areas, even though there are notable exceptions, including NIST research in human factors relating to interactions between humans and machines (for example, voting machines), and metrological studies of human sensory perception, and of man as measuring instrument, at the Research Institutes of Sweden (SP) [19]. The practice of science involves consideration of both quantitative and qualitative properties (variously also referred to as attributes, characteristics, etc.). The mass of a white powder in a plastic bag that was left on a seat of a bus is a quantitative property. The chemical nature of this powder (soy protein, baking soda, cocaine, etc.) is a qualitative property. The hardness of a gem determines whether it may be better suited for mounting on a ring or on an earring: a garnet that scratches quartz and is scratched by topaz is said to have hardness between 7 and 8 in the Mohs scale [20]. On the one hand, since a number is used to characterize it, it is tempting to say that hardness gauged in this way is a quantitative property, even if the sense in which it may be quantitative is rather different from how mass is quantitative. On the other hand, Mohs hardness 7 could be construed as the label of a class that comprises minerals of the same


hardness as quartz: from this viewpoint, Mohs hardness seems closer to a quality than to a quantity. The Beaufort wind force scale seems similar in nature to the Mohs scale of hardness for minerals. As originally defined it comprised thirteen grades that referred to the effects that winds have on ships’ sails, not to wind-speed explicitly: from sufficing to propel a frigate, to being stronger than canvas sails could withstand. The scale comprises ordered classes of wind conditions that may be described using either numbers or words: for example, 6 or “strong breeze”, 8 or “gale”. There is also a correspondence between these labels and ranges of windspeed measured 10 m above the surface of land or water — strong breezes 38 km/h to 49 km/h, and gales 62 km/h to 74 km/h, according to the Royal Meteorological Society (www.rmets.org/weather-and-climate/observing/beaufort-scale). There is even an empirical relation to convert a numerical value $B$ on the Beaufort scale into an estimate of windspeed as $0.836\,B^{3/2}$, expressed in meters per second [21]. There are also recognized differences between quantities, in particular between those whose values are naturally bounded (the amount fraction of gold in an alloy, which ranges from 0 % to 100 %, or the kinetic energy of a bullet, which is non-negative but otherwise unconstrained) and those that are not (an electrical charge, whose absolute value may be arbitrarily large, and either positive or negative), and also between those that are customarily expressed using a linear scale (mass), and those others for which a logarithmic scale is used most often (acoustic power, earthquake magnitude, optical absorbance). Mosteller and Tukey [22] offer “first-aid” transformations that may be applied to observations made in any of these kinds of scales, to make them amenable to conventional statistical analysis, implicitly lending support to the liberal view that there is too much ado about these distinctions between types or scales of measurement, and that any boundaries between them are, at best, diffuse. The question we propose to address is whether, and under what conditions, it may be appropriate to call “measurement” an assignment of value to a property, yet without precluding or discouraging the use of other terms when greater specificity may be warranted. The NIST Measurement Services Council has recently affirmed the view implicit in the NIST Quality Manual for Measurement Services (NIST-QM-I, Version 10, 27-Dec-2016, www.nist.gov/qualitysystem/) that the


concept of measurement need not be restricted to the assignment of value to quantitative properties, but may also be used in relation to qualitative properties, and to properties that somehow are in-between qualitative and quantitative. However, this does not preclude other, more specialized terms from being employed to describe assignments of value that are covered by the wider meaning of measurement, for reasons of either style or substance. For example, it is equally legitimate to say that one has measured the identity of the nucleobase at a particular location in a gene, as to say that one has identified the white powder in the aforementioned plastic bag as being makeup clay.

Definition
The NIST-QM-I defines measurement as: Experimental or computational process that, by comparison with a standard, produces an estimate of the true value of a property of a material or virtual object or collection of objects, or of a process, event, or series of events, together with an evaluation of the uncertainty associated with that estimate, and intended for use in support of decision-making.
NIST Technical Note 1900 (NIST Simple Guide) supplements this definition with the following clarification [23, Note 2.1]: The property intended to be measured (measurand) may be qualitative (for example, the identity of the nucleobase at a particular location of a strand of DNA), or quantitative (for example, the mass concentration of 25-hydroxyvitamin D3 in NIST SRM 972a, Level 1, whose certified value is 28.8 ng/mL). The measurand may also be an ordinal property (for example, the Rockwell C hardness of a material), or a function whose values may be quantitative (for example, relating the response of a force transducer to an applied force) or qualitative (for example, the provenance of a glass fragment determined in a forensic investigation).
This definition stands in marked contrast with its counterpart in the International Vocabulary of Metrology (VIM, 2.1) [24]: “process of experimentally obtaining one or more quantity values that can reasonably be


attributed to a quantity.” The on-line version of the VIM, which is available at http://jcgm.bipm.org/vim/en/, adds the following notes, among others: “measurement does not apply to nominal properties”; “measurement implies comparison of quantities or counting of entities.” The definition in the NIST-QM-I also may be used to differentiate measurement from other activities that appear similar to but that do not qualify as measurement. For example, the assignment of a numerical ZIP code to a region, as used by the U.S. Postal Service, or of an area code in the U.S. telephone system, are not measurements because such assignments neither involve comparison against a standard, nor is qualification with an evaluation of uncertainty appropriate for them. This more inclusive definition of measurement calls for a correspondingly broader definition of measurement uncertainty. The NIST Simple Guide [23, §3] already includes a definition that is fit for this purpose, as follows: measurement uncertainty is the doubt about the true value of the measurand that remains after making a measurement (cf. [25]). The NIST Simple Guide then adds that measurement uncertainty is described fully and quantitatively by a probability distribution on the set of values of the measurand. At a minimum, it may be described summarily and approximately by a quantitative indication of the dispersion (or scatter) of such distribution. This definition of measurement uncertainty distinguishes what measurement uncertainty is from how it may be represented, and ensures that measurement uncertainty is a particular kind of uncertainty as this concept is generally understood. 2. Discussion A recently added article to the Stanford Encyclopedia of Philosophy notes that “there is little consensus among philosophers as to how to define measurement, what sorts of things are measurable, or which conditions make measurement possible” [1]. Mari et al. [9] point out that “measurement is, and has been for some time, an integral component of a wide range of human activities, and is commonly afforded privileged status as a trustworthy source of knowledge. [. . . ] however, as the activities that demand precise and trustworthy information have diversified, the scope of activities conducted under the banner of measurement has broadened, and it is not always obvious what all these


ways of measuring have in common with one another.” The definition in the VIM is consistent with the consensus view formed during the late nineteenth and early twentieth centuries, and formalized by von Helmholtz [26], who “defined measurement as the procedure by which one finds the denominate number that expresses the value of a magnitude, where a “denominate number” is a number together with a unit, e.g., 5 meters, and a magnitude is a quality of objects that is amenable to ordering from smaller to greater, e.g., length” [1]. In that same article and elsewhere, Tal [1, 27] supports a model-based account of measurement, according to which “measurement consists of two levels: (i) a concrete process involving interactions between an object of interest, an instrument, and the environment; and (ii) a theoretical and/or statistical model of that process, where ‘model’ denotes an abstract and local representation constructed from simplifying assumptions. The central goal of measurement according to this view is to assign values to one or more parameters of interest in the model in a manner that satisfies certain epistemic desiderata, in particular coherence and consistency” [1]. Mari et al. [9] argue persuasively that a property being quantitative is neither necessary nor sufficient for it to qualify as object of measurement; or, in other words, that “not all quantitative evaluations are measurements and not all measurements are quantitative evaluations.” Finally, they conclude that “the structure of measurement is generally independent of the possible quantitative structure of the property under consideration: the stance that only quantities are measurable is based on historical convention rather than logical necessity.” White’s [4] arguments in favor of a broad, inclusive understanding of measurement are particularly compelling, and it would be redundant at best to attempt to rephrase them here since they are articulated to perfection in the original. Possolo and Iyer [28] echo the definition in NIST-QM-I, and point out the support that such broad understanding of measurement has variously received also from Nicholas and White [29], Dybkaer [3], and Mari et al. [6, 9]. None of the logical, terminological, historical, or customary-use considerations that might give preeminence to a particular definition of measurement are decisive, for otherwise the metrological community would not be as evenly split on the issue as a recent survey revealed that it is. This survey was a committee internal balloting conducted by the ISO/REMCO committee on reference materials.


Circumscribing the meaning and scope of measurement therefore involves making arbitrary choices, which concern (a) matters of taste, (b) matters of precedence, and (c) matters of convenience, which we consider next, in turn. 2.1. Matters of Taste Matters of taste relate to subjective preferences and predispositions formed differently by different individuals or communities, as a result of innate inclinations, cultural background, education, and professional interactions. These preferences include the relative emphasis that may be attributed to the various constitutive elements of measurement. The broader view adopted by NIST-QM-I implicitly gives the greater weight: (i) to assignment of value (regardless of the kind of value); (ii) to this assignment necessarily involving comparison with a standard (thus achieving traceability, which in turn requires valid calibrations); (iii) to there being a quantitative evaluation of uncertainty associated with the measured value; and (iv) to the purpose of measurement (supporting decision-making). It should be noted that it is possible to evaluate and express quantitatively the uncertainty associated with the value assigned to a qualitative property. Example E6 (DNA Sequencing) of the NIST Simple Guide refers to a technique used for sequencing a strand of DNA that, at each location along the strand, computes the probability of the nucleobase there being adenine (A), cytosine (C), guanine (G), or thymine (T), and then assigns the nucleobase that has the highest probability to that location. The corresponding (discrete) probability distribution over the set {A, C, G, T} is a representation of uncertainty that, if required, may be summarized using its information-theoretic entropy, say. Example E29 (Forensic Glass Fragments) of the NIST Simple Guide uses the same representation to express uncertainty about the provenance of a glass fragment collected in a forensic investigation. Example E19 (Thrombolysis) describes an analysis of results of a classification of binary outcomes (death or survival) according to whether a thrombolytic agent was administered early, at the cardiac patient’s home, or late, only upon arrival at the hospital. These qualitative attributes are summarized in an odds ratio of survival, qualified with a conventional evaluation of uncertainty [23]. The model-based approach to measurement mentioned above, which


underlies the GUM [30] and the NIST Simple Guide, does not intrinsically emphasize the quantitative nature of the measurand, but processual and quality characteristics instead. 2.2. Matters of Precedence Matters of precedence include historical considerations and scientific traditions, with many continuing to defer to von Helmholtz’s [26] understanding. However, authoritative references regulating the delivery of NIST measurement services, foremost among them NIST-QM-I, as well as peer-reviewed literature upholding the same views, are compelling in their own right, and inclusive and liberating in their broad understanding of measurement. The fact that NIST measurement services already comprise products (for example: Standard Reference Materials 2374, 2391c, 2392, 2394; and Reference Materials 8375, 8391, 8392, 8393, 8398 — https://www.nist.gov/ srm) that include the assignment of value to qualitative properties, and indeed support further, similar assignments made by the users of our services and by the consumers of our products, also bolsters an inclusive understanding of measurement. NIST’s increasing engagement in measurement (sensu lato) research in molecular biology, cellular biology, and the forensic sciences, among several other breakthrough areas, also calls for broadening the definition of measurement that we have adopted. Furthermore, the fact that assignments of value in these areas, as well as assignments of value to the identity of substances in analytical chemistry, both for organic and inorganic materials, typically involve measurements of quantitative properties as contributing elements, suggests that the final assignment of value to a derivative qualitative property should be regarded as a measurement, too. Precedence includes also the recognition that, in the history of science, concepts that once had an intuitively obvious meaning underwent extensions that not only ran counter to the wisdom accepted at the time, but that also were shocking or even scandalous, yet eventually gained wide acceptance. One striking example is the meaning of number, which in Classical Greece encompassed only the integers and ratios of integers. The disciples of Pythagoras discovered that the ratio of the length of the diagonal of a square to its side-length cannot be expressed as a ratio of two integers. This discovery caused a major crisis among the Pythagoreans because, in


their view, numbers were the essence of all things, and here was a quantity that could not be expressed using a number as conceived by them. A similarly upsetting extension occurred even closer to our subject, and also closer to our time, concerning the meaning of length and area. These intuitive concepts that stood for millennia eventually were extended into the mathematical concept of measure, which greatly widens their scope, but also introduces counter-intuitive surprises. This extension occurred near the end of the nineteenth century and at the beginning of the twentieth, as a product of the creative imagination of Camille Jordan [31], Émile Borel [32], and Henri Lebesgue [33], motivated by practical needs of mathematical analysis [34, 35]. The measure (in the sense of Lebesgue) of the unit square in the Euclidean plane is 1, the same as its area in the conventional sense. However, and on the one hand, the set of points in the unit square whose coordinates are rational numbers has measure zero, even though they are everywhere and densely so, in the sense that there are infinitely many of them in any neighborhood, no matter how small, of every point in the square. Neither is such thinness an exclusive property of sets, like the one just mentioned, that comprise only a countable infinity of points, for there are uncountable subsets of the unit square that also have measure zero. On the other hand, there exists a subset of the unit square that has the same “area” (Lebesgue measure) as the unit square itself, yet is so rarefied that for each point in this set there is an infinitely long straight line on the plane whose intersection with the unit square is that point only [36].

2.3. Matters of Convenience Matters of convenience include the desirability of being able to call measurements all instances of value assignment (that satisfy all the defining elements of measurement), which issue from measurement activities at NIST. For otherwise, having to explain that, in addition to measurement (sensu stricto), NIST also does identifications, examinations, inspections, qualitative analyses, and classifications, may blur the definition of our mission and lead to confusion with the missions of other agencies of the U.S. government, both potentially detrimental to the perception of NIST by the public and by other agencies and branches of the government.


3. Epilogue
Both philosophers of science and scientists hold divided opinions about the defining traits of measurement, hence about what is measurable, and the corresponding fault lines separate not only different disciplines, but also different groups within disciplines. Neither etymology nor historical precedent alone suffice to circumscribe the scope of measurement, or to delineate its essential, defining traits. The definition of measurement indeed requires that arbitrary choices be made on matters of taste, matters of precedence, and matters of convenience. Since neither logic nor law dictate how these choices should be made, only some voluntary, cooperative agreement may eventually produce a consensus about the meaning of measurement. To the many compelling arguments already amply discussed in the literature, we have added the fact that the research being pursued, and the measurement services being offered by metrological institutes worldwide, have been steadily widening, thus lending support to the view that the traditional concept of measurement should be widened accordingly. We have also attempted to ease the reluctance of those who subscribe to the nineteenth century concept of measurement, by pointing out that the very notions of length and area have, since about the same time, been generalized in ways that are far more radical and laden with surprises than what here we advocate for measurement in general. The scope of measurement ultimately hinges on the definition of its essential traits, which also serve to distinguish measurement from other activities that are similar to it but that should not be construed as measurements. We have given primacy to measurement necessarily involving: (i) comparison with a standard (typically an internationally agreed upon reference that realizes a unit of measurement); (ii) quantitative evaluation of measurement uncertainty; and (iii) support for decision-making. If for no other reason than by exclusion, we do not regard the nature of the property to which a value is to be assigned, as a defining trait of measurement, and therefore favor an inclusive, liberal understanding of its scope.
References
[1] E. Tal, Measurement in science, in The Stanford Encyclopedia of Philosophy, ed. E. N. Zalta (The Metaphysics Research Lab, Center for the Study of Language and Information (CSLI), Stanford University, 2015) Summer edn.


[2] D. J. Hand, Measurement theory and practice — The world through quantification (Arnold, London, UK, 2004). [3] R. Dybkaer, Definitions of ‘measurement’, Accreditation and Quality Assurance 16, 479 (2011), 10.1007/s00769-011-0808-8. [4] R. White, The meaning of measurement in metrology, Accreditation and Quality Assurance 16, 31 (2011). [5] E. Tal, How accurate is the standard second?, Philosophy of Science 78, 1082 (December 2011). [6] L. Mari, P. Carbone and D. Petri, Measurement fundamentals: a pragmatic view, IEEE Transactions on Instrumentation and Measurement 61, 2107 (August 2012). [7] G. B. Rossi, Toward an interdisciplinary probabilistic theory of measurement, IEEE Transactions on Instrumentation and Measurement 61, 2095 (August 2012). [8] E. Tal, Making time: A study in the epistemology of measurement, The British Journal for the Philosophy of Science 67, 297 (2016). [9] L. Mari, A. Maul, D. T. Irribarra and M. Wilson, Quantities, quantification, and the necessary and sufficient conditions for measurement, Measurement 100, 115 (2017). [10] National Research Council, The Importance of Common Metrics for Advancing Social Science Theory and Research: A Workshop Summary (The National Academies Press, Washington, DC, 2011), Rose Maria Li, Rapporteur. Committee on Advancing Social Science Theory: The Importance of Common Metrics. Committee on Social Science Evidence for Use. Division of Behavioral and Social Sciences and Education. [11] M. Wilson, Measuring progressions: Assessment structures underlying a learning progression, Journal of Research in Science Teaching 46, 716 (2009). [12] S. Warnock, N. Rouse, C. Finnin, F. Linnehan and D. Dryer, Measuring quality, evaluating curricular change: A 7-year assessment of undergraduate business student writing, Journal of Business and Technical Communication 31, 135 (2017). [13] B. D. Haig and D. Borsboom, On the conceptual foundations of psychological measurement, Measurement: Interdisciplinary Research and Perspectives 6, 1 (2008). [14] E. Angner, Is it possible to measure happiness? The argument from measurability, European Journal for Philosophy of Science 3, 221 (May 2013). [15] A. Maul, D. T. Irribarra and M. Wilson, On the philosophical foundations of psychological measurement, Measurement 79, 311 (2016). [16] S. A. Stouffer, Measurement in sociology, American Sociological Review 18, 591 (1953). [17] R. J. Smith and P. Atkinson, Method and measurement in sociology, fifty years on, International Journal of Social Research Methodology 19, 99 (2016). [18] E. Jung, Richard Kilvington, in The Stanford Encyclopedia of Philosophy, ed. E. N. Zalta (The Metaphysics Research Lab, Center for the Study of

Language and Information, Stanford University, Stanford, California, 2016) Winter edn.
[19] L. Pendrill and N. Petersson, Metrology of human-based and other qualitative measurements, Measurement Science and Technology 27, 094003 (2016).
[20] C. Klein and B. Dutrow, Manual of Mineral Science, 23rd edn. (John Wiley & Sons, Hoboken, NJ).
[21] G. Simpson, The Beaufort Scale of Wind-Force: Report of the Director of the Meteorological Office upon an inquiry into the relation between the estimates of wind-force according to Admiral Beaufort's scale and the velocities recorded by anemometers belonging to the Office (1906), Official No. 180, 54 pp. London: HMSO.
[22] F. Mosteller and J. W. Tukey, Data Analysis and Regression (Addison-Wesley Publishing Company, Reading, Massachusetts, 1977).
[23] A. Possolo, Simple Guide for Evaluating and Expressing the Uncertainty of NIST Measurement Results (National Institute of Standards and Technology, Gaithersburg, MD, 2015), NIST Technical Note 1900.
[24] Joint Committee for Guides in Metrology, International vocabulary of metrology — Basic and general concepts and associated terms (VIM), 3rd edn. (International Bureau of Weights and Measures (BIPM), Sèvres, France, 2012), BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, JCGM 200:2012 (2008 version with minor corrections).
[25] S. Bell, A Beginner's Guide to Uncertainty of Measurement, Measurement Good Practice Guide, Vol. 11 (Issue 2) (National Physical Laboratory, Teddington, Middlesex, United Kingdom, 1999), Amendments March 2001.
[26] H. von Helmholtz, Counting and Measuring (D. Van Nostrand, New Jersey, 1887), C. L. Bryan (translator, 1930).
[27] E. Tal, The epistemology of measurement: A model-based account, Ph.D. Thesis, Department of Philosophy, University of Toronto (Toronto, Canada, 2012).
[28] A. Possolo and H. K. Iyer, Concepts and tools for the evaluation of measurement uncertainty, Review of Scientific Instruments 88, 011301 (2017).
[29] J. V. Nicholas and D. R. White, Traceable Temperatures, second edn. (John Wiley & Sons, Chichester, England, 2001).
[30] Joint Committee for Guides in Metrology, Evaluation of measurement data — Guide to the expression of uncertainty in measurement (International Bureau of Weights and Measures (BIPM), Sèvres, France, 2008), BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, JCGM 100:2008, GUM 1995 with minor corrections.
[31] C. Jordan, Remarques sur les intégrales définies, Journal de Mathématiques Pures et Appliquées 8, 69 (1892).
[32] É. Borel, Leçons sur la Théorie des Fonctions (Gauthier-Villars, Paris, 1898).
[33] H. Lebesgue, Intégrale, longueur, aire, Annali di Matematica Pura ed Applicata, Serie III 7, 231 (June 1902).
[34] J. J. Benedetto, Real Variable and Integration with Historical Notes (B. G. Teubner, Stuttgart).
[35] S. Abbott, Understanding Analysis, second edn. (Springer, New York, NY, 2015).
[36] O. Nikodym, Sur la mesure des ensembles plans dont tous les points sont rectilinéairement accessibles, Fundamenta Mathematicae 10, 116 (1927).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 286–293)

Sensitivity analysis of a wind measurement filtering technique
Thomas Rieutord* and Lucie Rottner
CNRM, Météo-France, Toulouse, France
* E-mail: thomas.rieutord@meteo.fr
A recent filtering method was designed for wind measurements in the atmospheric boundary layer. The method mixes non-linear filtering techniques with a Lagrangian representation of the fluid. It depends on a set of parameters, such as the instrument error or the number of particles. A sensitivity analysis is presented to assess the influence of such parameters. It is shown that only 3 parameters are relevant and a simple strategy to tune them is given.

1. Introduction
Turbulence is a random process. This nature induces several problems when it comes to measuring it. Indeed, the purpose of a measurement is to inform about the true state of the measured medium, but what is the true state of a random medium? Moreover, measurement imperfections add random errors. How can one determine which variability should be kept and which should be removed? This is the purpose of filtering, applied here to wind measurements. The first section is dedicated to the filtering method implemented and to the testing framework. The second section is devoted to sensitivity analysis: Sobol indices are introduced and commented on, together with complementary results. Finally, the tuning strategy concludes the paper.

2. The filtering method
2.1. Problem statement
Let $V^r_{z,t}$ be the real wind. It is a stochastic process indexed by the time counter $t$ and the position counter $z$. The evolution of this stochastic process is described by an imperfect model $f_{z,t}$, which induces an error $\epsilon^r_{z,t}$. Let $V^o_{z,t}$ be the observed wind. It is also a stochastic process, linked to the real wind by the observation operator $h_{z,t}$. The measurement error is denoted $\epsilon^o_{z,t}$. The system to solve is the following, with unknown $V^r_{z,t}$:

$$\begin{cases} V^r_{z,t+1} = f_{z,t}\left(V^r_{z,t}\right) + \epsilon^r_{z,t} \\ V^o_{z,t} = h_{z,t}\left(V^r_{z,t}\right) + \epsilon^o_{z,t} \end{cases} \qquad (1)$$



The tested filtering method is based on a Lagrangian representation of the fluid. Instead of a stochastic process of the form $V_{z,t}$, we have a position–speed couple of stochastic processes $(X_t, V_t)$ linked by an Ito integral:

$$\begin{cases} dX_t = V_t\, dt \\ V_t = V^r_{X_t,t} \end{cases} \qquad (2)$$

The difficulty when changing Eulerian quantities into Lagrangian ones is to deal with boundary conditions. The space is divided into boxes $B(z)$ of vertical extent $\Delta z$. At each time step, four steps modify the probability law of these processes:
(1) Mutation step: use the model to forecast the wind and position at the next time step. $\mathcal{L}(V_t, X_t)$ is such that $V_t = V^r_{X_t,t}$.
(2) Conditioning step: ensure the Lagrangian position is inside a space subdivision. $\mathcal{L}(\tilde{V}_t, \tilde{X}_t) = \mathcal{L}(V_t \mid X_t \in B(z))$.
(3) Selection step: correct the Lagrangian speed in accordance with the observation. $\mathcal{L}(\hat{V}_t, \hat{X}_t) = \mathcal{L}(V_t \mid X_t \in B(z), V^o_{z,t})$.
(4) Estimation step: calculate output estimations and update the quantities needed in the Lagrangian model. $V^e(z,t) = E\left[V_t \mid X_t \in B(z), V^o_{z,t}\right]$.

Fig. 1. Four steps of the algorithm repeated at each time step.

After the selection step, the full probability law $\mathcal{L}(\hat{V}_t, \hat{X}_t) \xrightarrow[\Delta z \to 0]{} \mathcal{L}(V^r_{z,t} \mid V^o_{z,t})$ is known. The wind value provided as output is the expectation of this law:

$$V^e(z,t) \underset{\Delta z \to 0}{\simeq} E\left[V^r_{z,t} \mid V^o_{z,t}\right] \qquad (3)$$

It can be seen as the average of all realizations of wind that gave such observations.

2.2. Particle approximation
The probability laws $\mathcal{L}(V_t, X_t)$, $\mathcal{L}(\tilde{V}_t, \tilde{X}_t)$ and $\mathcal{L}(\hat{V}_t, \hat{X}_t)$ are approximated with Monte Carlo methods. A set of $N$ particles $(V^i_t, X^i_t)_{i \in [\![1,N]\!]}$ samples the


law of the couple of stochastic processes $(V_t, X_t)$. The instrument (LIDAR*) measures vertical velocities only. As a consequence, particles evolve in a 1-dimensional space. The four steps of figure 1 are applied independently to each particle. To perform the mutation step, a stochastic Lagrangian model of turbulence 2,3,14 is used. The estimation of the wind is thus obtained by taking the average of all particles at the same vertical level. Denoting $B(z, \hat{X}_t) = \{i,\ \hat{X}^i_t \in B(z)\}$ the set of particles at level $z$ after selection,

$$V^e(z,t) = \frac{1}{|B(z, \hat{X}_t)|} \sum_{i \in B(z, \hat{X}_t)} \hat{V}^i_t \qquad (4)$$

2.3. Filtering by genetic selection
The noise removal is done at the selection step. It is called selection because particles are selected according to their likelihood:

$$G^{obs}(i) = \exp\left(-\frac{\left(\tilde{V}^i_t - V^o(z,t)\right)^2}{2\left(\sigma^{obs}\right)^2}\right)$$

Such a form of likelihood assumes the measurement error $\epsilon^o_{z,t}$ is Gaussian, centred and of standard deviation $\sigma^{obs}$. This assumption is relevant for LIDAR measurements 6. From this likelihood, the selection is two-fold:
(1) Acceptance-rejection: particles with small likelihood are rejected randomly.
(a) $u$ is a random number taken uniformly between 0 and 1.
(b) Particles rejected: $I_{rej} = \{i \in B(z, \tilde{X}_t),\ G^{obs}(i)/\max(G^{obs}) < u\}$
(2) Importance resampling: rejected particles are replaced by others taken randomly, according to their likelihood. For each $i_{rej}$ in $I_{rej}$,
(a) $i_{new}$ is a particle taken randomly with $G^{obs}(i)$ as empirical probability density function.
(b) Reset the rejected particle to the value of $i_{new}$: $(\hat{X}^{i_{rej}}_t, \hat{V}^{i_{rej}}_t) = (\tilde{X}^{i_{new}}_t, \tilde{V}^{i_{new}}_t)$.
End for
Such an algorithm has been qualified in the literature 1,4,5. It has already been applied to punctual wind measurements 2 and then extended to LIDAR 15,19. A sketch of this selection step is given below.

* Light Detection and Ranging
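The selection step above can be illustrated with a short script. The following Python sketch only reproduces the acceptance-rejection and importance-resampling logic under the Gaussian likelihood; the particle arrays, the value of sigma_obs and the observed value are hypothetical inputs for illustration, not the authors' implementation.

import numpy as np

def selection_step(x_tilde, v_tilde, v_obs, sigma_obs, rng):
    """One genetic selection step for the particles of a single box B(z).

    x_tilde, v_tilde : particle positions and speeds after conditioning
    v_obs            : observed wind in the box
    sigma_obs        : observation noise standard deviation given to the filter
    """
    # Gaussian likelihood of each particle given the observation
    g_obs = np.exp(-(v_tilde - v_obs) ** 2 / (2.0 * sigma_obs ** 2))

    # (1) Acceptance-rejection: reject particles whose normalized
    #     likelihood falls below a single uniform draw u
    u = rng.uniform()
    rejected = np.where(g_obs / g_obs.max() < u)[0]

    # (2) Importance resampling: replace each rejected particle by one
    #     drawn with probability proportional to its likelihood
    weights = g_obs / g_obs.sum()
    x_hat, v_hat = x_tilde.copy(), v_tilde.copy()
    for i_rej in rejected:
        i_new = rng.choice(len(v_tilde), p=weights)
        x_hat[i_rej], v_hat[i_rej] = x_tilde[i_new], v_tilde[i_new]

    # Estimation step: box-averaged wind estimate, eq. (4)
    return x_hat, v_hat, v_hat.mean()

# Example with hypothetical values
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 25.0, size=200)     # positions in a 25 m box (assumed)
v = rng.normal(0.0, 1.0, size=200)       # particle speeds (m/s, assumed)
_, _, v_estimate = selection_step(x, v, v_obs=0.3, sigma_obs=0.5, rng=rng)
print(v_estimate)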


2.4. Testing the method
In order to test the method, a reference wind is taken (good quality measurements from the BLLAST research project 11). The reference wind has two functions:
• Simulating the observations, so that the observation noise is known and controlled.
• Assessing the quality of the retrieval, by computing the root-mean-squared error.
To assess the quality of the retrieval, several scores are used: the root-mean-squared error (RMSE) of the wind, $r_V$; the wind spectrum slope, $b$; and the execution time, $T_{exe}$. The wind RMSE and the execution time are expected to be as low as possible. The wind spectrum slope is expected to match the theoretical value of $-5/3$ in inertial turbulence 7,13. The resulting system is represented by the diagram in Fig. 2:

Fig. 2. Diagram of the testing experiment.

Throughout the four steps, several parameters are set to a default value. They are summarized in Table 1. Qualitatively, it is known that they affect the scores. To set them correctly, a quantitative knowledge of their influence is needed. This is the purpose of sensitivity analysis.

3. Sobol indices
3.1. Definition
The system in figure 2 is seen as a function of $p = 9$ inputs: $Y = f(X)$ with $X = (C_0, C_1, \ell, N, \sigma^{add}, \sigma^{obs}, \sigma^X, \sigma^V, \tau)$ and $Y = r_V$ or $Y = b$ or $Y = T_{exe}$.


Table 1. Setting parameters studied in the sensitivity analysis.

Notation    Description                                        Place in the system
C_0         Kolmogorov constant                                Mutation
C_1         Fluctuation coefficient                            Mutation
ℓ           Spatial interaction length                         Estimation
N           Number of particles                                All steps
σ^add       True observation noise                             Observation simulation
σ^obs       Given observation noise                            Selection
σ^X         Discretization error in the Lagrangian model       Mutation
σ^V         Default standard deviation of wind speed           Conditioning
τ           Integration time                                   Score computation

Sobol indices are sensitivity scores that represent the proportion of variance attributed to a group of inputs 10,18. The number of parameters in the group is called the order of the Sobol index. Simple Sobol indices account only for the influence of the group alone. Total Sobol indices include the influence of all the parameters in the group, at any order 9. Only first order simple (eq. 5), first order total (eq. 6) and second order simple (eq. 7) Sobol indices have been calculated in this work.

$$S_i = \frac{V\left(E\left[Y \mid X_i\right]\right)}{V(Y)} \qquad (5)$$

$$S_i^T = 1 - \frac{V\left(E\left[Y \mid X_{[\![1,p]\!] \setminus i}\right]\right)}{V(Y)} \qquad (6)$$

$$S_{i,j} = \frac{V\left(E\left[Y \mid X_i, X_j\right]\right)}{V(Y)} - S_i - S_j \qquad (7)$$
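To make eq. (5) concrete, the following Python sketch estimates first-order indices with a basic pick-and-freeze (Saltelli-type) Monte Carlo estimator on a toy function; the test function, sample size and uniform input ranges are illustrative assumptions, not the surrogate model or the exact estimators used in this work.

import numpy as np

def first_order_sobol(model, n_inputs, n_samples, rng):
    """Estimate first-order Sobol indices S_i = V(E[Y|X_i]) / V(Y)
    with a pick-and-freeze estimator (inputs assumed uniform on [0, 1])."""
    a = rng.uniform(size=(n_samples, n_inputs))
    b = rng.uniform(size=(n_samples, n_inputs))
    y_a, y_b = model(a), model(b)
    var_y = np.var(np.concatenate([y_a, y_b]), ddof=1)
    indices = np.empty(n_inputs)
    for i in range(n_inputs):
        ab_i = b.copy()
        ab_i[:, i] = a[:, i]          # freeze input i at the values of sample A
        y_ab = model(ab_i)
        # Saltelli-style estimator of V(E[Y|X_i])
        indices[i] = np.mean(y_a * (y_ab - y_b)) / var_y
    return indices

# Toy model with one dominant input (illustrative only)
def toy_model(x):
    return 4.0 * x[:, 0] + 0.5 * x[:, 1] + 0.1 * x[:, 0] * x[:, 2]

rng = np.random.default_rng(1)
print(first_order_sobol(toy_model, n_inputs=3, n_samples=20000, rng=rng))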

To bypass the computational cost, a surrogate model was used 12. Given 2000 evaluations of the function $f$ in $Y = f(X)$, it is approximated by a faster function $\tilde{f}$ obtained by simple kriging. The faster function $\tilde{f}$ is used instead of $f$ in Saltelli's estimators (first 17 and second 16 order).

3.2. Results
Figure 3 shows the estimations of the Sobol indices for the two outputs $b$ and $r_V$. For the execution time, $N$ is the only influential parameter (graph not shown). For the wind error, $r_V$, the influential inputs are $\sigma^{add}$, $N$, $C_0$


and $\sigma^{obs}$. For the wind spectrum slope, $b$, the influential inputs are $\sigma^{add}$ and $\sigma^{obs}$. Overall, the important parameters are $\sigma^{add}$, $\sigma^{obs}$ and $N$. Their influence is explored by additional experiments.

Fig. 3. Interaction graphs: second order (edge shades), first order simple (blue inner ring width) and total (green outer ring width) Sobol indices for the wind spectrum slope (a) and the wind error (b).

3.3. Additional experiments
The wind spectrum slope is equal to the theoretical value of $-5/3$ when $\sigma^{obs} = \sigma^{add}$, as shown in figure 4. In practice, $\sigma^{add}$ is not perfectly known and $\sigma^{obs}$ must be provided to the filter. The wind spectrum slope can therefore be used to correctly set $\sigma^{obs}$. Figure 5 depicts the error on the wind against $N$ and $\sigma^{obs}$. One can see a decrease with $N$ whatever the value of $\sigma^{obs}$. One can also notice that the error reaches a minimum when $\sigma^{obs} = 0.5\ \mathrm{m\,s^{-1}}$, which is the default value of $\sigma^{add}$. When $\sigma^{obs} = \sigma^{add}$, the response surface of $r_V$ (not shown) matches the surface

$$r_V = K \frac{\sigma^{obs}}{\sqrt{N}} \qquad (8)$$

with $K = 2.33$. It gives a simple expression of the error left after filtering. A similar regression for the execution time $T_{exe}$ reveals a relation of the form

$$T_{exe} = a N^h \qquad (9)$$

with $h = 1.75$ and $a = 8 \cdot 10^{-4}$ s. Doubling the number of particles costs 3.36 times more execution time in the tested range.
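Relations (8) and (9) can be turned into a quick sizing rule for N. The sketch below simply evaluates both empirical fits; the target error value is an arbitrary example, and the constants are those quoted above, valid only in the tested range.

import math

K, H, A = 2.33, 1.75, 8e-4       # constants from eqs. (8) and (9)
sigma_obs = 0.5                  # m/s, observation noise given to the filter

def retrieval_error(n_particles):
    """Residual wind RMSE predicted by eq. (8)."""
    return K * sigma_obs / math.sqrt(n_particles)

def execution_time(n_particles):
    """Execution time predicted by eq. (9), in seconds."""
    return A * n_particles ** H

target_rmse = 0.05               # m/s, example requirement (assumption)
n_needed = math.ceil((K * sigma_obs / target_rmse) ** 2)
print(n_needed, retrieval_error(n_needed), execution_time(n_needed))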


Fig. 4. Evolution of $b$ when only $\sigma^{add}$ and $\sigma^{obs}$ vary. The sampling grid has 20 values of $\sigma^{add}$ and 20 values of $\sigma^{obs}$ (400 points in total). The red plane is at the level $b = -5/3$ (theoretical expected value).

Fig. 5. Evolution of $r_V$ when only $N$ and $\sigma^{obs}$ vary. The minimum is reached when $\sigma^{obs}$ is at the default value of $\sigma^{add}$.

4. Conclusions
The filtering method has been embedded in a broader framework in order to be tested. The sensitivity analysis highlights the following tuning strategy:
(1) Set $N$ to a low value, such that $T_{exe}$ is small.
(2) For $\sigma^{obs}$ ranging around the a priori accuracy of the instrument, calculate the wind spectrum slope $b$.
(3) Set $\sigma^{obs}$ to the value which gives $b$ closest to $-5/3$. $\sigma^{obs}$ is then almost equal to $\sigma^{add}$.
(4) Set $N$ to the maximum affordable value. The error on the wind retrieval is now at its minimum, estimated by $K \sigma^{obs}/\sqrt{N}$ with $K = 2.33$.
Such a tuning strategy helps to improve the accuracy of the filtering method. The three outputs were considered separately in this work, but there exist Sobol indices for multidimensional outputs 8 that might be worth trying. An interesting prospect might be to redo the sensitivity analysis with a different reference and see whether it changes the results.

References
1. M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, 50(2):174–188, 2002.
2. C. Baehr. Nonlinear filtering for observations on a random vector field along

a random path. Application to atmospheric turbulent velocities. ESAIM. Mathematical Modelling and Numerical Analysis, 44(5):921, 2010.
3. S. Das and P. Durbin. A Lagrangian stochastic model for dispersion in stratified turbulence. Physics of Fluids (1994-present), 17(2):025109, 2005.
4. P. Del Moral. Feynman-Kac Formulae. Springer, 2004.
5. A. Doucet, S. Godsill, and C. Andrieu. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10(3):197–208, 2000.
6. R. Frehlich and M. Yadlowsky. Performance of mean-frequency estimators for Doppler radar and lidar. Journal of Atmospheric and Oceanic Technology, 11(5):1217–1230, 1994.
7. U. Frisch. Turbulence: the legacy of A. N. Kolmogorov. Cambridge University Press, 1995.
8. F. Gamboa, A. Janon, T. Klein, A. Lagnoux, et al. Sensitivity analysis for multidimensional and functional outputs. Electronic Journal of Statistics, 8(1):575–603, 2014.
9. T. Homma and A. Saltelli. Importance measures in global sensitivity analysis of nonlinear models. Reliability Engineering & System Safety, 52(1):1–17, 1996.
10. B. Iooss and P. Lemaître. A review on global sensitivity analysis methods. In Uncertainty Management in Simulation-Optimization of Complex Systems, pages 101–122. Springer, 2015.
11. M. Lothon, F. Lohou, D. Pino, F. Couvreux, E. Pardyjak, J. Reuder, J. Vilà-Guerau de Arellano, P. Durand, O. Hartogensis, D. Legain, et al. The BLLAST field experiment: Boundary-layer late afternoon and sunset turbulence. Atmospheric Chemistry and Physics, 14(20):10931–10960, 2014.
12. A. Marrel, B. Iooss, B. Laurent, and O. Roustant. Calculations of Sobol indices for the Gaussian process metamodel. Reliability Engineering & System Safety, 94(3):742–751, 2009.
13. A. S. Monin and A. M. Yaglom. On the laws of small-scale turbulent flow of liquids and gases. Russian Mathematical Surveys, 18(5):89–109, 1963.
14. S. Pope. Lagrangian PDF methods for turbulent flows. Annual Review of Fluid Mechanics, 26(1):23–63, 1994.
15. L. Rottner. Reconstruction de l'atmosphère turbulente à partir d'un lidar Doppler 3D et étude du couplage avec Meso-NH. PhD thesis, Université Paul Sabatier-Toulouse III, 2015.
16. A. Saltelli. Making best use of model evaluations to compute sensitivity indices. Computer Physics Communications, 145(2):280–297, 2002.
17. A. Saltelli, P. Annoni, I. Azzini, F. Campolongo, M. Ratto, and S. Tarantola. Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index. Computer Physics Communications, 181(2):259–270, 2010.
18. I. M. Sobol. Sensitivity estimates for nonlinear mathematical models. Mathematical Modelling and Computational Experiments, 1(4):407–414, 1993.
19. F. Suzat, C. Baehr, and A. Dabas. A fast atmospheric turbulent parameters estimation using particle filtering. Application to LIDAR observations. In Journal of Physics: Conference Series, volume 318. IOP Publishing, 2011.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 294–306)

The simulation of Coriolis flowmeter tube movements excited by fluid flow and exterior harmonic force*
V. A. Romanov
Department of Technical Mechanics, South Ural State University, Chelyabinsk, 454080, Russian Federation
E-mail: romanovva@susu.ru http://susu.ru
V. P. Beskachko
Department of Computer Modeling and Nanotechnology, South Ural State University, Chelyabinsk, 454080, Russian Federation
E-mail: beskachkovp@susu.ru http://susu.ru
In connection with the issues of increasing the accuracy and reliability of measurements of mass flowrates of fluids by Coriolis flowmeters, 1-D and 3-D models of the interaction between the sensitive element of the instrument (the measuring tube) and the fluid flow are considered. The main attention is paid to the study of the dissipative properties of the system by means of both numerical and full-scale experiments. A method for estimating the contributions of the structural and fluid parts of the system to the total losses of mechanical energy is proposed. It is shown that these two types of losses are commensurable in magnitude. The contribution of the liquid to the energy dissipation increases with increasing oscillation amplitude and with rising fluid flowrate. Attempts to interpret the dissipative properties of the CFM using finite element 3D modeling implemented in the Ansys products led only to a qualitative agreement with the experimental data.
Keywords: Mass flow measurements; Coriolis flowmeter; Fluid–structure interaction; Dissipative properties; Computer simulation; Numerical model.

* South Ural State University is grateful for financial support of the Ministry of Education and Science of the Russian Federation (grant No 9.9676.2017/8.9).

1. Introduction
Coriolis flowmeter (CFM) is one of the most advanced devices for measuring the mass flowrate of fluids used in a wide range of applications [1]. Its principle of

operation is based on observing the characteristics of forced vibrations of a tube carrying a fluid flow. The theoretical basis of the measurement method belongs to the field of mechanics concerned with problems of hydroelasticity, where analytical solutions are the exception rather than the rule. At present, these problems are known as problems of Fluid Structure Interaction (FSI) [2-4]. Since the beginning of the large-scale distribution of CFMs in the 1990s, many models of their operation have been created, differing in depth and accuracy [5-7]. The most popular now are the 1D models, in which the measuring tube is considered as an Euler-Bernoulli or Timoshenko beam, and the liquid as a homogeneous, inextensible, massive filament moving along the axis of the tube at a constant speed. There are many results obtained with the help of 1D models, justified by experience and useful for CFM design [1,15]. The next steps in improving CFMs will be based on a more detailed description of the fluid flow than is possible in 1D models. This requires the creation of two relatively independent 3D models: the elastic tube model using computational structural mechanics (CSM), and the fluid flow model using computational fluid dynamics (CFD) methods. A non-trivial task is to organize the coupling interaction between these models. The first attempt of this kind was made in [8] for a CFM with the simplest geometry, a straight tube. It was shown that the vibrations of the tube lead to a distortion of the flow velocity profile, which is especially noticeable at low Reynolds numbers and can cause a systematic error of the instrument. This possibility was confirmed by subsequent studies of 3D models [9-13], as well as experimentally [14]. The phase behavior, used in the CFM as a parameter for estimating the flowrate of the flowing liquid in the near-resonance zone of the vibrational system, is extremely sensitive to any changes in the vibration regime. The degree of sensitivity depends to a large extent on the level and nature of the dissipative properties of the vibrating system. The adequacy of the modeling of the dissipative properties therefore largely determines the quality of the CFM dynamic model. In the present work, using an existing serial sample of a Coriolis sensor as an example, we discuss the possibility of calculating the dissipative properties of the oscillating system "elastic tube — liquid flowing through it".

2. Forced bending vibrations of a tube with a liquid flow
The effect of the flow of liquid on the steady-state forced vibrations of the elastic tube can be seen already in the simple 1D beam model of the CFM constructed by us. The flow gives two additional terms in Eq. (1), which was obtained in [15]:


a second term proportional to the first power of the velocity and a third term proportional to the square of the velocity

$$\left(m_T + m_L\right)\frac{\partial^2 y}{\partial t^2} + 2 m_L V \frac{\partial^2 y}{\partial z\,\partial t} + m_L V^2 \frac{\partial^2 y}{\partial z^2} + EJ \frac{\partial^4 y}{\partial z^4} = f(z,t), \qquad (1)$$

where, respectively: $y$ and $z$ are the transverse displacement and the coordinate along the axis of the tube; $m_T$ and $m_L$ are the mass per unit length of the pipe and of the liquid; $E$ and $J$ are Young's modulus and the axial moment of inertia of the tube cross section; $f(z,t)$ is the distribution of external forces perpendicular to the axis of the tube. The motion of a fluid with velocity $V$ is taken into account by the second and third terms, representing the Coriolis and centrifugal inertia forces, respectively. In order to simplify the analysis of the steady-state forced vibrations at the operating frequency of the CFM, we assume, firstly, that the shape of the resonant oscillations coincides with the resonant eigenmode (the hypothesis of H. Wydler). This is justified for a vibrating system with a sufficiently low damping level or with proportional damping. Then an arbitrary tube deflection can be represented in the form

$$y(z,t) = \sum_k u_k(z)\, q_k(t) \qquad (2)$$

where $u_k(z)$ is the $k$-th eigenform of the tube movements at $V = 0$ and $q_k(t)$ is the $k$-th main coordinate (decomposition coefficient). Secondly, in (2) we confine ourselves to the two most important terms of the series, corresponding to the resonant eigenmode of the vibrations at $V = 0$ (the eigenmode of excitation) and the shape of vibrations initiated by the Coriolis forces (the information eigenmode):

$$y(z,t) = u_1(z)\, q_1(t) + u_2(z)\, q_2(t). \qquad (3)$$

The validity of this assumption is largely determined by the nature and level of damping. Substitution of relation (3) into equation (1) makes it possible to obtain the system of equations customary for the theory of small vibrations:

Aq  B q  C q  0

where

A   aij  , aij  0 i  j , akk  M k ,

(4)

(5)

B   bij  , bij  2  mL  V  D1ij ,

 EJ D3ij  C   cij  , cij  mL  D 2ij  V 2  .  mL D 2ij  

297

(6) (7)

The quantities appearing in the expressions for the inertia matrices (A), damping (B) and elasticity (C) are defined as

$$M_k = \left(m_T + m_L\right)\int_0^L u_k(z)\, u_k(z)\, dz, \qquad
D1_{ij} = \int_0^L u_i(z)\, u_j'(z)\, dz,$$
$$D2_{ij} = \int_0^L u_i(z)\, u_j''(z)\, dz, \qquad
D3_{ij} = \int_0^L u_i(z)\, u_j^{IV}(z)\, dz. \qquad (8)$$
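As an illustration of how the matrices in (5)-(8) can be assembled numerically, the sketch below evaluates the integrals by quadrature for two assumed mode shapes of a simply supported straight tube; the mode shapes, dimensions and material data are placeholders chosen only to make the example self-contained, not the parameters of the Flomac S015 sensor.

import numpy as np

# Illustrative parameters (not those of the real sensor)
L, EJ = 0.5, 120.0            # tube length (m), bending stiffness (N m^2)
m_T, m_L, V = 0.8, 0.2, 2.0   # masses per unit length (kg/m), flow speed (m/s)

z = np.linspace(0.0, L, 2001)
k_mode = [np.pi / L, 2.0 * np.pi / L]

# Assumed mode shapes and their derivatives (simply supported tube)
u  = [np.sin(k * z) for k in k_mode]            # u_k(z)
u1 = [k * np.cos(k * z) for k in k_mode]        # u_k'(z)
u2 = [-k**2 * np.sin(k * z) for k in k_mode]    # u_k''(z)
u4 = [k**4 * np.sin(k * z) for k in k_mode]     # u_k''''(z)

M  = [(m_T + m_L) * np.trapz(u[k] * u[k], z) for k in range(2)]
D1 = np.array([[np.trapz(u[i] * u1[j], z) for j in range(2)] for i in range(2)])
D2 = np.array([[np.trapz(u[i] * u2[j], z) for j in range(2)] for i in range(2)])
D3 = np.array([[np.trapz(u[i] * u4[j], z) for j in range(2)] for i in range(2)])

A = np.diag(M)                     # eq. (5)
B = 2.0 * m_L * V * D1             # eq. (6)
C = EJ * D3 - m_L * V**2 * D2      # eq. (7), written out

# Partial frequencies of the two subsystems, eq. (11)
p_bar = np.sqrt(np.diag(C) / np.diag(A))
print(p_bar / (2 * np.pi))         # in Hz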

The presence of non-zero off-diagonal elements in the matrices B and C indicates a connection between the eigenmode of excitation and the information eigenmode. In equations (4), the right-hand side (the driving force) is omitted, which allows us to interpret them as the equations for the "natural vibrations" of a tube in the presence of a moving fluid in it. The term in (4) which is responsible for the dissipative properties of the CFM has the same form as for the model of a system with linear viscous friction. However, unlike the academic case, the elements of the matrices B and C depend on the fluid velocity. This predetermines the dependence on the fluid flow both of the "eigenmode shapes" and of the dissipative parameters of the CFM. Using the transformations carried out in [16] in the analogous case, one can obtain the ratio (β) of the amplitudes of the generalized coordinates of the exciting ($u_{11}$) and information ($u_{21}$) shapes of vibration, which is valid for resonance at the frequency $p_1$ of the CFM:

$$\beta = \frac{u_{21}}{u_{11}} = -\frac{c_{11} - p_1^2 a_{11}}{c_{12} - p_1^2 a_{12}} = \sqrt{\frac{a_{11}}{a_{22}}}\;\frac{1 - \sqrt{1 + \sigma^2}}{\sigma}. \qquad (9)$$

Eigen frequencies $p_k$ of a coupled system with two degrees of freedom are:

$$p_{1,2}^2 = \frac{\bar{p}_1^2 + \bar{p}_2^2}{2} \mp \frac{\bar{p}_2^2 - \bar{p}_1^2}{2}\sqrt{1 + \sigma^2}, \qquad (10)$$

where $\bar{p}_k$ are the frequencies of the partial subsystems

$$\bar{p}_k^2 = \frac{c_{kk}}{a_{kk}} = \frac{m_L D2_{kk}}{M_k}\left(\frac{EJ\, D3_{kk}}{m_L D2_{kk}} - V^2\right). \qquad (11)$$

The connection between equations (4) can be characterized by the coupling coefficient

$$\gamma^2 = \frac{c_{12}^2}{c_{11} c_{22}} = \frac{\left[D2_{12}\left(\dfrac{EJ\, D3_{12}}{m_L D2_{12}} - V^2\right)\right]^2}{D2_{11}\left(\dfrac{EJ\, D3_{11}}{m_L D2_{11}} - V^2\right)\; D2_{22}\left(\dfrac{EJ\, D3_{22}}{m_L D2_{22}} - V^2\right)}, \qquad (12)$$

or by the coefficient of association

$$\sigma = \frac{2\, \bar{p}_1 \bar{p}_2}{\bar{p}_2^2 - \bar{p}_1^2}\,\gamma. \qquad (13)$$
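A small numerical illustration of relations (10)-(13): given assumed partial frequencies and an assumed coupling coefficient (the values below are arbitrary, not those of a real sensor), one can compute the association coefficient, the coupled eigenfrequencies and the amplitude ratio of eq. (9).

import math

# Assumed partial frequencies of the exciting and information subsystems (Hz)
p1_bar, p2_bar = 100.0, 180.0
gamma = 0.05                      # assumed coupling coefficient, eq. (12)
a11_over_a22 = 1.0                # assumed ratio of modal inertias

# Coefficient of association, eq. (13)
sigma = 2.0 * p1_bar * p2_bar * gamma / (p2_bar**2 - p1_bar**2)

# Coupled eigenfrequencies, eq. (10)
half_sum = 0.5 * (p1_bar**2 + p2_bar**2)
half_dif = 0.5 * (p2_bar**2 - p1_bar**2)
p1 = math.sqrt(half_sum - half_dif * math.sqrt(1.0 + sigma**2))
p2 = math.sqrt(half_sum + half_dif * math.sqrt(1.0 + sigma**2))

# Amplitude ratio of the information to the exciting coordinate, eq. (9)
beta = math.sqrt(a11_over_a22) * (1.0 - math.sqrt(1.0 + sigma**2)) / sigma

print(sigma, p1, p2, beta)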

A convenient geometric interpretation of the relations (10)–(12) is shown in Fig. 1. With its help, it is easy to trace the influence of the coupling between the partial systems (coefficients $\gamma$ or $\sigma$), caused by the flow of fluid, on the changes in the resonance frequencies of the composite vibration system.

Fig. 1. Graphical representation of the ratio of the eigen frequencies and partial frequencies of the coupled system.


The level of the information signal in the two Coriolis sensor designs is greater in the one where the coefficient of association $\sigma$ is higher. If we agree with [15] and assume that the shift of the resonance frequencies under the action of the flow can be neglected, then for the sensor Flomac S015 (Fig. 2) it is possible to increase the coefficient of association by 1.515 times (Table 1) by changing the angle $\theta$ between the ascending and descending branches of the tube from 88° to 120°, as a result of the decreasing difference between the frequencies of the exciting ($\bar{p}_1$) and information ($\bar{p}_2$) partial subsystems. In this case, the increase in the amplitude of the generalized coordinate of the information partial subsystem, with the same amplitude of the generalized coordinate of the exciting subsystem, will be 33.9% (Table 1: $\theta$ = 88° is the initial existing version of the geometry of the tube, $\theta$ = 120° is the version of the geometry of the tube with an increased amplitude of the oscillations of the information partial subsystem).

Fig. 2. Geometry of the elastic element of CFM Flomac S015.

Table 1. Dependence of the vibration parameters of the exciting and information subsystems on the design parameter θ.

θ, degrees     p̄₁/p̄₂     σ/σ₀      β/β₀
0              0.369      1         1
88             0.562      1.923     1.75
120            0.676      2.914     2.343

3. Experimental determination of the dissipative properties of the Coriolis sensor The nature of the damping is of fundamental importance for the behavior of the oscillatory system in the resonance regime. A laboratory stand was created to


determine the damping characteristics of CFM samples. Records of CFM damped oscillations were made for various levels of water flowrate. Steady-state resonant vibrations with amplitudes close to the operational ones were excited in the CFM Flomac S015. Then the excitation was turned off, and the displacement readings of the sensor were recorded at a sampling frequency exceeding the vibration frequency by about 90 times. The processing of the records of the damped vibrations was reduced to building an envelope curve of the maximum deviations, which was used both to determine the logarithmic decrement δ and to analyze the degree of nonlinearity of the damping. It has been found that the value δ does not depend on the vibration amplitude only in the case of dry tubes. When the tubes were filled with liquid, even at V = 0 the amplitude decrease does not follow the law of geometric progression, which points to a nonlinear nature of the damping (Fig. 3).

Fig. 3. Experimental data on the amplitude dependence of the CFM vibration decrements.
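The processing described above can be sketched as follows: extract the successive peaks of the damped record, build the envelope, and evaluate an amplitude-dependent logarithmic decrement from neighbouring peaks. The synthetic signal and its damping law below are invented purely to make the snippet runnable; they do not reproduce the measured Flomac S015 records.

import numpy as np

fs, f0 = 9000.0, 100.0                      # sampling and vibration frequency (Hz, assumed)
t = np.arange(0.0, 2.0, 1.0 / fs)

# Synthetic damped record with a weakly amplitude-dependent decay (illustration)
amp = 1.0 * np.exp(-2.0 * t) * (1.0 + 0.1 * np.exp(-4.0 * t))
x = amp * np.cos(2.0 * np.pi * f0 * t)

# Envelope: one maximum deviation per vibration period
period = int(round(fs / f0))
peaks = np.array([np.abs(x[i:i + period]).max()
                  for i in range(0, len(x) - period, period)])

# Amplitude-dependent logarithmic decrement from neighbouring peaks
delta = np.log(peaks[:-1] / peaks[1:])
amplitude = peaks[:-1]
for a, d in list(zip(amplitude, delta))[:5]:
    print(f"A = {a:.3f}  delta = {d:.4f}")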

The nonlinear component of the dissipative forces was separated according to an algorithm in which the generalized frictional force was assumed to be proportional to the $n$-th power of the velocity [18]:

$$Q^* = b\, |\dot{q}|^{\,n-1}\, \dot{q} \qquad (14)$$

Thus, the cyclic energy losses $\Delta W$ corresponding to different friction models (different $n = 0, 1, 2, \ldots$) are proportional to $A^{n+1}$: $\Delta W \sim A$ for the Coulomb dry friction model ($n = 0$), $\Delta W \sim A^2$ for linear viscous friction ($n = 1$), $\Delta W \sim A^3$ for quadratic friction ($n = 2$), etc. Taking into account the linear independence of the functions $A^n$, we can present the experimentally observed cyclic losses in the form of the following series in terms of powers of the small parameter $A$:

$$\Delta W = \Delta W_C + \Delta W_L + \Delta W_Q + \ldots = \alpha_0 A + \alpha_1 A^2 + \alpha_2 A^3 + \ldots,$$

which makes it possible to identify the nature of the damping in the vibrating system and to estimate the contribution made to the irreversible losses by each possible friction mechanism (dry, linear, quadratic, etc.) through the expansion coefficients $\alpha_0$, $\alpha_1$, $\alpha_2$, respectively. The direct application of this algorithm was hindered by the presence of an irregular component in the experimental dependences of $\delta(A)$, which we attribute to the existence of parasitic low-frequency noise in the signal (see, for example, Fig. 4a). Therefore, we first applied smoothing of the function $\delta(A)$ on a certain interval of the amplitude variation. Fig. 4b shows the result of the algorithm. It can be seen that the dissipation of mechanical energy in the considered CFM consists of linear losses in the tube material and quadratic losses in the fluid in approximately equal proportions, which depend appreciably on the vibration amplitude and the flowrate of the liquid. This figure corresponds to a water flowrate of 0.544 kg/s and a starting vibration amplitude of about 1 mm.
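The decomposition of the cyclic losses into dry, linear and quadratic contributions amounts to a linear least-squares fit of ΔW(A) in the basis (A, A², A³). The sketch below fits synthetic loss data generated with assumed coefficients; the numbers are placeholders, not the measured values behind Fig. 4.

import numpy as np

# Synthetic cyclic losses with assumed dry/linear/quadratic coefficients
rng = np.random.default_rng(2)
A = np.linspace(0.6, 1.0, 40)                      # relative vibration amplitude
alpha_true = np.array([0.02, 0.30, 0.25])          # alpha_0, alpha_1, alpha_2 (assumed)
dW = alpha_true[0] * A + alpha_true[1] * A**2 + alpha_true[2] * A**3
dW += rng.normal(0.0, 0.005, A.size)               # measurement scatter

# Least-squares fit in the basis (A, A^2, A^3)
basis = np.column_stack([A, A**2, A**3])
alpha, *_ = np.linalg.lstsq(basis, dW, rcond=None)

# Relative contribution of each friction mechanism at every amplitude
contributions = basis * alpha
relative = contributions / contributions.sum(axis=1, keepdims=True)
print("fitted alpha:", alpha)
print("relative contributions at A = 1:", relative[-1])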

[Figure 4: panel (a) shows the logarithmic decrement versus the vibration amplitude (relative value); panel (b) shows the relative contributions of Coulomb, linear and quadratic damping to the dissipated energy versus the vibration amplitude (relative value).]

Fig. 4. The results of processing the vibration patterns of damped oscillations: a) a characteristic form of the dependence of the decrement on the amplitudes of damped oscillations; b) a characteristic form of the dependence of the relative contribution to the energy dissipation of the linear and quadratic components of viscous friction on the amplitudes of oscillations.


4. Calculation of a flowmeter sample using the finite element method
In 1D models it is impossible to adequately describe the losses of mechanical energy in the liquid due to internal friction, since they lack the very concepts of shear flow and velocity gradient. At the same time, the data presented in Fig. 3 clearly show that these losses are present even if the liquid does not move along the tube ($V = 0$). The result just presented in Fig. 4 indicates that the losses in the liquid cannot be neglected in general. All this supports the opinion expressed in the Introduction about the need to develop 3D models that allow a more realistic description of the fluid flow in the CFM tube. A finite element model was created for the Flomac S015 sensor (Fig. 5) in the Ansys Products application software package.

Fig. 5. Fragments of the finite-element decomposition of the geometric model.

The FSI solution technology in the Ansys package provides for joint calculations with two relatively independent parts of the finite element model: the Transient Structural module of ANSYS Mechanical is used to calculate the elastic tube, while the velocity and pressure distribution fields in the fluid are calculated with the CFD module Fluent. The interaction of these relatively independent parts of the finite element model was organized through the Two-Way FSI technology integrated in the Ansys package. When exchanging data, the solvers are synchronized through a sequence of predefined synchronization points (SPs). At each of these SPs, each solver collects the necessary data from the other solver. Iterations are repeated until the maximum number of repetitions is reached or until the data transmitted between the solvers converge. The theoretical basis of these calculations is described in detail, for example, in [10-14]. All the work stages of building the finite element model, its loading and its use were performed in the ANSYS Workbench interface, which provided the


interaction of the Geometry (SpaceClaim, Design Modeler), Mesh, Transient Structural, Fluent, System Coupling, and Results (CFD-Post) modules. To quantify the dissipative properties, the sensor model underwent three stages of computation. First, in several steps, we calculated the steady motion of the liquid at a preset flowrate (Fig. 6). Then, a harmonic driving force with a frequency close to the resonant one was applied for several vibration periods at the point of the exciting electromagnet of the flowmeter. After the driving force ceased, the process of damped vibrations was recorded.


Fig. 6. Stages of excitation, establishment and attenuation of oscillations in the 3D model.

12H18N10T steel was used as the tube material. A single-phase liquid with the physical characteristics of water was adopted for the fluid properties. An inviscid flow model was used. With such a trivial flow model, the solver settings for the spatial sampling of the velocity, pressure and momentum gradient had almost no influence on the vibration damping rate. The convergence criteria of the iterative processes, both for the solvers and for the System Coupling module, were tightened by an order of magnitude relative to the default values offered by the software package. The rate of attenuation turned out to be most sensitive to the integration time step Δt (Fig. 7). Fig. 7 shows an initial fragment of the damped vibrations obtained by calculations with different integration time steps Δt. The dependence of the logarithmic decrement of the oscillations on the step Δt, for an oscillation amplitude of the excitation point of about 1 mm and a sensor resonance frequency of about 100 Hz, is shown in Fig. 8a.


(Fig. 7 axes: displacement, mm, versus time, s; the three curves correspond to different integration time steps.)

Fig. 7. Initial fragment of damped vibrations obtained by calculations with different sampling intervals.

A decrease in the time step from 0.3 ms (33 computational points per vibration period) to 0.1 ms (100 computational points per vibration period) is accompanied by a halving of the decrement, and the behaviour is not yet asymptotic. Even at Δt = 0.1 ms, the decrement predicted by the model is more than an order of magnitude larger than the value observed experimentally (see Fig. 3). Approximately the same result was obtained in earlier studies of 3D models performed using computational tools other than Ansys. The frequency of the damped oscillations depends on Δt to a lesser extent (Fig. 7), but such a dependence does exist (Fig. 8b).
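The logarithmic decrements quoted here can be extracted from a computed (or measured) displacement record by comparing successive peaks. The short Python sketch below does this for a synthetic 100 Hz damped signal; the signal parameters are assumed and only illustrate the processing that yields amplitude-dependent decrements such as those in Fig. 4a.

```python
import numpy as np

# Estimate the logarithmic decrement from a sampled damped-oscillation record
# by comparing successive positive peaks (illustrative; signal is synthetic).
dt = 1e-4                                   # sampling interval, s (assumed)
t = np.arange(0, 0.2, dt)
signal = 1e-3 * np.exp(-3.0 * t) * np.cos(2 * np.pi * 100 * t)  # synthetic 100 Hz

# indices of local maxima (positive peaks of the oscillation)
idx = np.where((signal[1:-1] > signal[:-2]) & (signal[1:-1] > signal[2:]))[0] + 1
peaks = signal[idx]

# decrement between consecutive peaks; pairing it with the mean amplitude
# gives the amplitude dependence of the decrement
decrements = np.log(peaks[:-1] / peaks[1:])
amplitudes = 0.5 * (peaks[:-1] + peaks[1:])
for a, d in zip(amplitudes, decrements):
    print(f"amplitude {a:.3e} m  ->  decrement {d:.4f}")
```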


Fig. 8. Dependence of the logarithmic decrement of vibrations with the amplitude of 1 mm on the choice of the integration step over time.

5. Conclusion
Experimental and numerical attempts were made to determine the dissipative characteristics of the coupled nonlinear vibrating system "elastic tube — liquid flowing through it". Based on the processing of the data from the performed


experiments, we established a dependence of the dissipative properties of the considered vibrating system both on the vibration amplitude and on the flowrate of the flowing liquid. This dependence is reproduced at a qualitative level by the 3D model. We did not manage to reach quantitative agreement with the experimental data, despite the numerous numerical experiments performed with various settings of the Ansys package. It turned out that the integration time step has the greatest influence on the estimates of the dissipative properties. Decreasing it improves the agreement with the experimental data (and increases the amount of computation). A decrease of the time step to a value at which the computational costs still remain acceptable (down to 1/50 of the vibration period) leads to predictions for the vibration decrements that exceed the experimental ones by approximately two orders of magnitude. Thus, quantitative estimates of the dissipative properties of systems such as CFMs require a transition to a new level of accuracy in describing the fluid-structure interaction.
Acknowledgments

South Ural State University is grateful to the Ministry of Education and Science of the Russian Federation for financial support (grant No. 9.9676.2017/8.9).
References

1. T. Wang, R. Baker, Coriolis flowmeters: a review of developments over the past 20 years and an assessment of the state of the art and likely future directions, Flow Measurement and Instrumentation 40, 99 (2014).
2. A. S. Vol'mir, Shells in Gas and Liquid Flows. Problems of Hydroelasticity (Nauka, Moscow, 1979) (in Russian).
3. M. P. Païdoussis, S. J. Price, E. de Langre, Fluid-Structure Interactions. Cross-flow-induced instabilities (Cambridge University Press, UK, 2011).
4. M. P. Païdoussis, Fluid-structure interactions. Slender structures and axial flow (Academic Press, 2016).
5. M. Anklin, W. Drahm, A. Rieder, Coriolis mass flowmeters: Overview of the current state of the art and latest research, Flow Measurement and Instrumentation 17, 317 (2006).
6. G. Sultan, J. Hemp, Modelling of the Coriolis mass flowmeter, Journal of Sound and Vibration 132(3), 473 (1989).
7. H. Raszillier, F. Durst, Coriolis-effect in mass flow metering, Archive of Applied Mechanics 61, 192 (1991).
8. G. Bobovnik, N. Mole, J. Kutin, B. Stok, I. Bajsić, Coupled finite-volume/finite-element modelling of the straight-tube Coriolis flowmeter, Journal of Fluids and Structures 20, 785 (2005).
9. N. Mole, G. Bobovnik, J. Kutin, B. Stok, I. Bajsić, An improved three-dimensional coupled fluid–structure model for Coriolis flowmeters, Journal of Fluids and Structures 24, 559 (2008).
10. V. Kumar, M. Anklin, B. Schwenter, Fluid-structure interaction (FSI) simulations on the sensitivity of Coriolis flow meter under low Reynolds number flows, in Proc. Flow Measurement Conference (FLOMEKO 2010) (Taipei, Taiwan, 2010).
11. V. Kumar, M. Anklin, Numerical simulations of Coriolis flow meters for low Reynolds number flows, Mapan 26, 225 (2011).
12. R. Luo, J. Wu, S. Wan, Numerical study on the effect of low Reynolds number flows in straight tube Coriolis flowmeters, in Metrology for Green Growth (XX IMEKO World Congress) (Busan, Republic of Korea, 2012).
13. R. Luo, J. Wu, Fluid-structure coupling analysis and simulation of viscosity effect on Coriolis mass flowmeter, in 5th Asia Pacific Congress on Computational Mechanics & 4th International Symposium on Computational Mechanics (APCOM & ISCM 2013) (Singapore, 2013).
14. C. Huber, M. Nuber, M. Anklin, Effect of Reynolds number in Coriolis flow measurement, in European Flow Measurement Workshop (Lisbon, Portugal, 2014).
15. M. A. Mironov, P. A. Pyatakov, A. A. Andreev, Forced flexural vibrations of a pipe with a liquid flow, Acoustical Physics 56, 739 (2010) (in Russian).
16. S. P. Strelkov, An introduction to the vibration theory (Lan', St. Petersburg, 2005) (in Russian).
17. L. I. Mandelstam, Lectures on Vibration Theory (Nauka, Moscow, 1972) (in Russian).
18. Y. G. Panovko, Introduction to mechanical oscillations theory (Nauka, St. Petersburg, 1989) (in Russian).
19. Virtual Prototyping of a Coriolis Effect Mass Flow Meter, http://www.ansys.com/other/virtual-prototypes/virtual-prototyping-of-a-coriolis-effect-mass-flow-meter

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 307–314)

Indirect light intensity distribution measurement using image merging I. L. Sayanca*, K. Trampert and C. Neumann Karlsruhe Institute of Technology, Light Technology Institute, Engesserstraße 13, 76131 Karlsruhe, Germany *E-mail: [email protected] In this work, we propose a method to indirectly measure the complete angle-dependent light intensity distribution (LID) of lamps and luminaires, using a goniometer, a white screen and a luminance camera. The goniometer permits an accurate variation of two angles in spherical coordinates by moving the Device Under Test (DUT). We take several camera images of the reflexion of the DUT on the screen with different viewing angles, correct the perspective distortion of each image using planar homography and finally merge the set of images using the geometrical information of the system. We designed and measured a special DUT, which produces known LIDs and photometrical values, and compared the results with conventional scanning methods, in order to validate the proposed method and to estimate their measurement uncertainty. Using the suggested process, higher resolutions and lower measurement times can be obtained.

1. Introduction
At night, the sensitivity of the eye cells decreases considerably, so that artificial lighting systems are needed to perform basic tasks such as walking, reading or driving. For this purpose, new lighting systems are constantly developed for applications such as interior, street and automotive lighting. Modern smart house lighting systems and multi-beam LED car headlights are capable of producing a great variety of light distributions with high resolutions. All these applications have to comply with certain laws and standards regarding light distributions and maximum values before they can be implemented in real products. To assure that fixed limits are respected, reliable measurement techniques are indispensable. The state of the art is to measure light intensity distributions (LID) using scanning systems, like a goniophotometer. If the LIDs to be captured are complex and irregular, high measurement resolutions are necessary, which result in long measurement times. In this context, we suggest a method to measure the light intensity distribution of lamps and luminaires using a luminance camera, a goniometer and a white screen. This measurement process allows both shorter


measurement times and higher angular resolutions in comparison to conventional scanning measurement techniques. With the measured LIDs, the conformity with regulating laws and standards can be tested and complex light distributions simulated. Our method can be applied on an industrial scale, where measurement times are very important, and for big luminaires, like automotive headlights.
2. Basic Concepts of Light Measurement
Light is the electromagnetic radiation visible to the human eye. Therefore, photometry is the measurement of light. Photometric quantities can be calculated using the spectral luminous efficiency function V(λ), which describes the wavelength-dependent spectral response of the human eye at daylight. There are four fundamental photometric quantities: the luminous flux Φ is the total light emitted by a light source and is measured in lumen [lm]. The light intensity I is the amount of light emitted per solid angle, expressed in candela [cd]. The illuminance E is the amount of light falling on a surface and is measured in lux [lx]. The last quantity is the luminance L, which refers to the amount of light coming from a surface element in a specific direction, in [cd/m²]. A light intensity distribution (LID) I(θ, φ) is an angle-dependent light intensity measurement in the far field using a goniometer and a photometer. By moving the photometer with the help of the goniometer and choosing an adequate coordinate system, we are able to measure light at different angles (see Fig. 1a). The detector has to scan each point of the distribution separately. These measured points are then interpolated to obtain smooth LIDs. However, to measure very sharp LIDs, high angular resolutions are necessary, which are extremely time consuming [1 and 2]. An alternative to conventional scanning methods is to use an Imaging Luminance Measurement Device (ILMD) instead of a photometer. An ILMD is an optically filtered CCD camera with a precision objective lens. But to measure LIDs with an ILMD, a special test setup is necessary: a white or a transparent screen has to be illuminated with the DUT and an image of this screen needs to be captured by the camera (see Fig. 1b). The LID is measured indirectly by measuring the screen instead of the DUT. In comparison to a scanning method, a large number of measurement points can be captured simultaneously in a single image, considerably reducing measurement times [1]. Similar measurement methods using a single camera image are proposed in [3] and [4]. However, this method is only applicable to small LEDs and lamps, because they emit light over a small solid angle.


Fig. 1: a) Coordinate system with its origin at the DUT, with a schematic ray at the photometer's position, and b) test setup showing the relative positions of DUT, ILMD and screen.

The measurement area that the ILMD can measure with a single image is defined by the lens's field of view, the screen size and the distance between camera and screen. If the LID to be measured is bigger than this measurement area, it is necessary to combine several screen images with different viewing angles between the DUT and the screen. In this paper, we propose a method using the described test setup in combination with a goniometer to scan the full solid angle. The measured set of images then has to be merged into one single image, so that the complete LID can be obtained. Note that most common DUTs, such as automotive headlights and luminaires, require image merging.
3. Machine Vision System
The key component of the proposed experimental setup for LID measurements is an automotive goniometer. It consists of an optical table attached to two moving mechanical axes, which allows an accurate variation of the two spherical angles. We fix the DUT on the optical table, adjusting its photometric centre to the origin of the goniometer's coordinate system by using a cross laser. A far-field measurement means that the distance between the DUT and the detector is large enough to neglect the spatial extension of the DUT and to approximate it as a point source located at its photometric centre. This premise is necessary to correctly measure LIDs. Therefore, we place a white screen at 10 m distance along the optical axis of the system. Due to physical constraints, the luminance camera is fixed on the goniometer at the height of the optical axis but laterally displaced, looking at the screen with a different perspective in comparison to the DUT. For validation purposes, a photometer is also included above the screen (see Fig. 1b). In order to measure luminance values on the screen with the ILMD, the screen has to reflect light homogeneously in all directions, i.e. it must be a Lambertian diffuser. This means that the luminance is constant and independent of the viewing angle. The


spectral reflectance of the screen has to be wavelength independent, meaning that all wavelengths are reflected equally. In practice, we have to deal with deviations. A stray light reduction in the measurement room is also necessary when using an ILMD. Stray light means all light included in the measurement although it does not radiate directly from the light source but from room reflexions. To minimize stray light, we designed an aperture system in the room using light-absorbing materials. This way, stray light values of 1.5 % can be achieved.
3.1. Camera Measurements
Due to the laterally displaced position of the camera, the first step after taking an image of the screen using the ILMD is to correct the perspective distortion of the images by using planar homography. After applying this correction, the region of interest, i.e. the screen area containing the measurement data, can be separated from the rest of the image. Then it is possible to transform each pixel position on the image into the spherical coordinates θ and φ of the corresponding point on the screen, using the relations (1) and (2) from [3], which express the tangent of each angle in terms of the pixel position, the horizontal and vertical camera magnifications, and the distance between screen and DUT.

Here the horizontal and vertical camera magnifications are defined as the ratios between the size of the screen in metres and the size of the captured image in pixels. OA is the optical axis of the system, and the remaining geometrical quantity is the distance between screen and DUT (see Fig. 2). Once the geometrical information of the system is known, the next step is to convert the captured luminance values, first into illuminance and then into light intensity values. Taking into account the Lambertian reflectance of the screen, which reflects the incident light into a hemispherical solid angle, the luminance measured at each screen point is converted into illuminance in Eq. (3), where the mean value of the measured spectral reflectance enters as a scaling factor. The illuminance is then converted into light intensity in Eq. (4) using the distance from the DUT to the measured point. Since conventional methods measure the LID on a spherical surface, whereas our method measures a ray projection onto the screen's plane surface (see Fig. 1), Eq. (4) also contains the corresponding plane-to-sphere transformation [1, 3, 4].

When illuminating the screen using a light source with almost constant light intensity values in the far field, all measured pixels should ideally contain equal light intensities. But in practice we obtain an image with spatial differences due to inhomogeneities on the screen and stray light effects. We use this measured


image to calculate a pixel-dependent correction factor, used in (4), to correct spatial dependencies of the screen reflectance.

Fig. 2: Relationship between the pixel, spherical and Cartesian coordinates for a schematic ray.
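For orientation, the per-pixel processing chain described above can be sketched as follows, using the standard Lambertian and inverse-square relations. The screen size, reflectance and luminance values below are assumed, and the sketch omits the pixel-dependent reflectance correction, so it is not claimed to match Eqs. (1)-(4) term for term.

```python
import numpy as np

# Sketch of the per-pixel conversion from a luminance image of the screen to a
# light intensity distribution (Lambertian screen, inverse-square law).
D = 10.0           # DUT-to-screen distance along the optical axis, m
rho = 0.9          # assumed mean screen reflectance
W, H = 2.0, 1.5    # assumed screen size, m
nx, ny = 400, 300  # image size in pixels after perspective correction

# screen coordinates of each pixel relative to the optical axis
xs = (np.arange(nx) - nx / 2) * (W / nx)
ys = (np.arange(ny) - ny / 2) * (H / ny)
X, Y = np.meshgrid(xs, ys)

L_img = np.full((ny, nx), 5.0)            # measured luminance, cd/m^2 (dummy)

theta = np.arctan2(X, D)                  # horizontal viewing angle of the ray
phi = np.arctan2(Y, D)                    # vertical viewing angle of the ray
E = np.pi * L_img / rho                   # Lambertian screen: E = pi * L / rho
d_prime = np.sqrt(D**2 + X**2 + Y**2)     # DUT-to-pixel distance
cos_eps = D / d_prime                     # incidence angle on the screen plane
I = E * d_prime**2 / cos_eps              # plane-to-sphere (inverse-square) step
print(I.shape, float(I.max()))
```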

The next step consists of taking images from different angles of view of the DUT by moving the goniometer. Here we use the homographic relations between images with a common camera centre but different angles of view, i.e. rotations of the camera, in order to correct the perspective distortion of each image relative to the principal image plane [6] before merging the images to obtain the full LID.
3.1.1. Planar homography
In computer vision, a camera is usually described using the central perspective imaging model. Here the light rays coming from an object in the "world" coordinate system are projected onto the image plane located at the camera's focal distance. Using similar triangles, the relationship between world points and image plane points can be calculated. If the intrinsic parameters of the camera, such as the focal length, the pixel size and the principal point, are known from calibration, the camera projection model can be written in homogeneous coordinates in its general matrix form, Eq. (5), as the product of the intrinsic camera matrix and the matrix of extrinsic parameters [5, 6].
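The planar-homography correction discussed in this subsection can be estimated from four point correspondences, for example the four detected corners of the distorted screen image and the corners of the target rectangle described in the following paragraphs. The self-contained sketch below uses the standard direct linear transform rather than the Matlab toolbox employed in the paper; the corner coordinates are made-up example values.

```python
import numpy as np

# Minimal direct linear transform (DLT) sketch: estimate the 3x3 homography H
# from four point correspondences, then map points with it.
def homography_from_points(src, dst):
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    A = np.asarray(rows, dtype=float)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)            # null-space vector of A, up to scale

def apply_homography(H, pts):
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = (H @ pts_h.T).T
    return mapped[:, :2] / mapped[:, 2:3]  # back from homogeneous coordinates

src = np.array([[12, 30], [620, 18], [640, 470], [5, 455]], float)  # distorted corners
dst = np.array([[0, 0], [640, 0], [640, 480], [0, 480]], float)     # target rectangle
H = homography_from_points(src, dst)
print(np.round(apply_homography(H, src) - dst, 6))                  # ~zeros
```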

The last matrix in (5) contains the extrinsic parameters of the camera, which describe a change in the pose of the camera as a homogeneous transformation, i.e. a rotation and a translation in three-dimensional space. A change in the pose can occur about any of the three axes or a combination of them. A planar homography is a geometric matrix transformation of points on a plane. The homography transforms the pixel coordinates of the image but not their


values, by changing the pose of the modelled camera i.e. position and angle of view [5 and 6]. We use this mathematical tool to calculate the last matrix on (5) and correct the camera position relative to the DUT and therefore the perspective distortion on the images. To calculate �, a minimum of eight points is necessary: four points on the original image plane and four on the corrected image. For example, we can use the four corners of the distorted screen image, and the extrema values of these points to define the vertices of a rectangle in the corrected image. Here we use an existing Matlab Toolbox from [5]. We now change the position of the goniometer in two possible directions or angles. Since these angles are known, we can directly calculate the elements of the matrix �. We carry out the geometric transformation in a straight way, in order to obtain the corrected images in relation to the principal plane. An interpolation in two dimensions is also necessary after applying the homography, so that a constant grid of points i.e. coordinates can be achieved. These are needed to merge the set of corrected images, looking for coordinates in common between images. 4. Results We designed a special DUT to carry out a validation of the measurement method. The DUT consists of a metal circuit board containing high power LEDs in combination with an aperture system, which permits us to generate very accurate LIDs. This DUT is explained in [7] in more detail. We measured this DUT using our method (see Fig. 3) by taking a total of 12 images of the LID by moving the goniometer. We can present the complete LID information either on a reference plane or on the sphere surface. For the first option, we have to correct the perspective distortion of each image relative to the principal plane and then we have to carry out a two-dimensional interpolation, in order to merge the set of images. For the second option a more complex three-dimensional interpolation is necessary. Nevertheless, both representation forms are equivalent and can be transformed into each other, allowing us to display the complete LID. We performed a comparison measurement in the same setup with the conventional scanning method using the photometer above the screen. The results are displayed in figure 4 as a function of � in one defined plane, i.e. at a fixed angle �, but using the equivalent goniometer angular coordinates � and �. They show small differences between the two methods in areas of strong signal levels with a relative mean difference of 1.6%. This difference can be attributed to the stray light in the room, which shows approximately the same order of magnitude. This difference is a systematic error and can be corrected by


improving the aperture system. The bigger relative differences can be found in areas with small signal levels, where the effects of stray light become stronger.

Fig. 3: a) Perspective-corrected images of the LID and b) merged full LID.

Fig. 4: a) Comparison of camera measurements with classical goniophotometry in one fixed plane and b) relative percentage difference between the measured LIDs as a function of angle.

5. Conclusion and Discussion We presented a method to measure LIDs over the full solid angle combining images with different angles of view by rotating a goniometer. For validation, a DUT with defined optical parameters was designed. We compared our results with conventional scanning methods achieving small percentage differences explained mainly by stray light and inhomogeneity of the screen. The reference method using a photometer measures light intensities with a measurement uncertainty (MU) of 0.8%. A proper MU analysis of our method combining the camera images using the Monte-Carlo method is still object of future research. The proposed method presents some advantages in comparison with the photometer measurement. A significant reduction of the measurement durations is possible, while at the same time achieving much higher angular resolutions in


the range of 0.01° thanks to an ILMD. The photometer comparison measurement took 3 days with an angular resolution ten times lower than the camera, in comparison to 30 minutes using our method. We demonstrated that the suggested method can correct the perspective distortions of each image correctly. This becomes apparent in the preservation of the circular form of the used DUT aperture system after merging the set of images. However, in order to achieve ideal results with our method, extra efforts are necessary for the measurement room, such as a better aperture system to correct the remaining stray light effects. In this regard, the results can still be improved. Stray light spatial room dependencies can be detected and corrected too. The screen's irregular reflectance and spectral dependencies should be kept in mind as well and should be further investigated to better correct the camera measurements and reduce the relative difference to scanning methods. Colour homogeneity analysis, i.e. angular and spatially resolved colour measurements, is an additional application for the presented measurement method, using a different set of optical filters.
Acknowledgments
This work has been supported by the German Federal Ministry of Education and Research (program Photonics Research Germany, contract number 13N13396).
References
1. G. Leschhorn and R. Young: Handbook of LED and SSL Metrology, Instruments Systems GmbH, 1st Edition 2017, ISBN 978-3-86460-643-4.
2. CIE International Standard S 017/E:2011: International Lighting Vocabulary, International Commission on Illumination, 2011.
3. I. Moreno and C.-C. Sun: Three-dimensional measurement of light-emitting diode radiation pattern: a rapid estimation, Measurement Science and Technology 20, 2009.
4. M. Bürmen, F. Pernus and B. Likar: Automated optical quality inspection of light emitting diodes, Measurement Science and Technology 17, 2006.
5. P. Corke: Robotics, Vision and Control, Springer 2011, ISBN 978-3-642-20143-1.
6. R. Hartley and A. Zisserman: Multiple View Geometry in Computer Vision, 2nd Edition, Cambridge University Press 2003, ISBN 987-0-521-54051-3.
7. I. L. Sayanca, K. Trampert and C. Neumann: Colour Measurement with Colorimeter: Mismatch of colour matching function, to be published, Lux Junior, Dörnfeld, 2017.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 315–323)

Towards smart measurement plan using category ontology modelling Q. Qi∗ , P. J. Scott and X. Jiang EPSRC Future Advanced Metrology Hub, School of Computing and Engineering, University of Huddersfield, Huddersfield, HD1 3DH, UK ∗ E-mail: [email protected] In this paper a schema that generates smart measurement plans from the designed specifications is proposed. The knowledge modelling of the proposed schema is based on a new knowledge representation language, named category ontology language (COL) (authors’ ongoing work). In the proposed method, the semantics of a designed specification was modelled into a category S. Then a faithful functor was constructed from the category S to generate a measurement category M, using GPS concepts and philosophy as the framework. A test case of an areal surface texture specification was used to validate the proposed method. It can demonstrate that the schema has the required adaptivity and as such when the design specification is revised or updated, the corresponding measurement plan will be reasoned simultaneously. Keywords: Category theory; Measurement plan; Category Ontology Language (COL); Knowledge modelling; Geometrical Product Specifications (GPS).

1. Introduction
In the new era of smart manufacturing, knowledge modelling and reasoning are of fundamental importance for knowledge-intensive tasks 1 . A starting point to achieve this is to ensure that the semantics of the product specification can be understood and reasoned about in a knowledge network of the manufacturing system 2 . Current representations of product specifications focus mostly on syntax rather than semantics 3 . For instance, the ISO Geometrical Product Specifications (GPS) standard system defines the syntax (symbols) and semantics (meaning of the symbols) of product specifications 4 , yet its semantics are not computer-readable, nor can they be seamlessly transferred to the subsequent production and measurement stages without information loss. To represent the semantics of manufacturing knowledge, the current state-of-the-art knowledge representation methods are largely based on descriptive logics (DLs) used to construct different manufacturing ontologies. DLs


are based on set theory and are best suited to represent relationships between sets. They are therefore limited in extent (no sets of sets) and cannot directly merge two different ontologies, nor construct complex relationships among ontologies. In this paper, using a new knowledge representation language, named category ontology language (COL) (authors’ ongoing work), the knowledge modelling and reasoning of the geometrical specifications will generate computer-readable measurement plans by establishing rigorous semantics mapping from a design category to a measurement category. COL is based on category theory for representing categories and relationship networks between them. It is developed to provide an unified formal language along with a natural reasoning mechanism. The syntax and semantics of COL are based on categorical graphs and category-theoretical notions and operations. COL is substantially distinct from DL-based ontology. Apart from that the mathematical foundations are entirely different, one of the most significant distinguishing features of the categorical-based language over the others is that it represents multi-level knowledge structures (hierarchical) with greatly enhanced searching efficiency. A test case of an areal surface texture specification is used to validate the proposed method.

2. The categorical modelling
In this section, the syntax and semantics of the category ontology language are briefly introduced. A category C is denoted by a tuple (NO , NM , NS ), where C.NO is the set of objects in the category, C.NM is the set of morphisms and C.NS is the set of morphism structures. All objects and morphisms satisfy the set of category laws. Objects Let O be any object in the set C.NO ; it may also be one of five special types of objects with extra properties, written as O.p. The five objects are: terminal object (denoted as O.t), initial object (O.i), zero object (O.z), singleton object (O.s) and empty object (O.e). Morphisms A morphism represents a relationship from object A to object B in C, written as f ∶ A → B, where object A is the domain of f, denoted A = f(O1), and B is the codomain of f, written as B = f(O2). A morphism set in a category C represents the morphisms from objects A to B, written as MC (A, B). For any object A ∈ NO , a specified morphism, denoted id(A), is the identity morphism on object A. A morphism f may have none, one or two morphism properties, and those properties are epic (denoted ↠), monic (▸→), isomorphic (↔),


retraction (●→), section (◾→), and both epic and monic but not isomorphic (▸↠). Definitions of these morphisms can be found in 5,6 . A morphism f is often assigned a notion (n) to make it readable, written as f (n) ∶ A → B. Notions of a morphism can start with strings such as 'is', 'has', 'with', 'applied to', 'assigned to'; examples are given in the following sections. Morphism structures As a morphism has a predefined path, multiple commutative morphisms can form geometric shapes such as lines, triangles or rectangles, in which a morphism's property can generate categorical structures such as pullbacks or pushouts. We call those geometric shapes "morphism structures", to facilitate reasoning and to improve the efficiency of queries and other operations. Based on categorical concepts, we define product structures (×), coproduct structures (⊔), triangle structures (△), rectangle structures (◻) and pullback/pushout structures (∏ / ∐) as morphism structures, as shown in Figure 1.
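To make the bookkeeping of objects, identity morphisms and composition concrete, a toy Python illustration is given below. It is emphatically not the COL implementation; it merely mirrors the notions defined above (objects, morphisms with domain and codomain, and composition forming a triangle structure).

```python
from dataclasses import dataclass

# Toy bookkeeping for a category: objects, morphisms, identities, composition.
@dataclass(frozen=True)
class Morphism:
    name: str
    dom: str
    cod: str

class Category:
    def __init__(self):
        self.objects = set()
        self.morphisms = {}

    def add_object(self, obj):
        self.objects.add(obj)
        self.morphisms[f"id_{obj}"] = Morphism(f"id_{obj}", obj, obj)  # identity

    def add_morphism(self, name, dom, cod):
        assert dom in self.objects and cod in self.objects
        self.morphisms[name] = Morphism(name, dom, cod)

    def compose(self, g, f):
        """Return g . f, defined only when cod(f) = dom(g)."""
        f, g = self.morphisms[f], self.morphisms[g]
        assert f.cod == g.dom, "morphisms are not composable"
        return Morphism(f"{g.name}.{f.name}", f.dom, g.cod)

C = Category()
for o in ("A", "B", "E"):
    C.add_object(o)
C.add_morphism("m1", "A", "B")
C.add_morphism("m2", "B", "E")
print(C.compose("m2", "m1"))   # the composite closes a triangle structure
```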

Fig. 1. Morphism structures: (a) product structure ×; (b) coproduct structure ⊔; (c) triangle structure △; (d) rectangle structure ◻; (e) pullback structure ∏; (f) pushout structure ∐.


Definition 2.1. Product structure ×(O1 , O2 , p1 , p2 ) is constructed by a


product of two objects O1 and O2 , and two projection morphisms p1 and p2 , where p1 ∶ O1 × O2 → O1 , p2 ∶ O1 × O2 → O2 . And there exists a unique morphism u ∶ O3 → O1 × O2 , and p1 ○ u = f (here ○ is the composition between morphisms), p2 ○ u = g, where f ∶ O3 → O1 , g ∶ O3 → O2 . Definition 2.2. Coproduct structure ⊔(O1 , O2 , i1 , i2 ) is constructed by a coproduct of two objects O1 and O2 , and two inclusion morphisms i1 and i2 , where i1 ∶ O1 → O1 ⊔ O2 , i2 ∶ O2 → O1 ⊔ O2 , ∃!u ∶ O1 ⊔ O2 → O3 , u ○ i1 = f , u ○ i2 = g, where f ∶ O1 → O3 , g ∶ O2 → O3 . Definition 2.3. Triangle structure △({O1 , O2 , O3 }, {m1 , m2 , m3 }) is formed by two commutative morphism m1 and m2 in between three objects {O1 , O2 , O3 } and a composition morphism of the two m3 = m2 ○ m1 , as shown in Figure 1c. And we represent the first object O1 of a triangle structure as △(O1 ), the second object △(O2 ), and the third object △(O3 ), so does for the morphisms of a triangle structure, we write △(m1 ), △(m2 ) or △(m3 ). Definition 2.4. Rectangle structure ◻({O1 , ...O4 }, {m1 , ...m4 }, △1 , △2 ) is formed by four morphisms and four objects, which also form two triangle structures △1 ({O1 , O2 , O4 }, {m1 , m2 , m1 ○ m2 }) and △2 ({O1 , O3 , O4 }, {m3 , m4 , m3 ○ m4 }), such that △1 (m1 ) = ◻(m1 ), △1 (m2 ) = ◻(m2 ), △2 (m1 ) = ◻(m3 ), △2 (m2 ) = ◻(m4 ). We name ◻(O1 ) the staring object and ◻(O4 ) the ending object. Definition 2.5. Pullback structure ∏({O1 , ...O5 }, {m1 , ...m9 }, {◻1 , ...◻4 }) is constructed from a rectangle structure ◻1 in which ◻1 (△1 (m2 )) or ◻1 (△2 (m2 )) is either monic or isomorphic. It consists of a set of five objects and a set of nine morphisms whose objects and morphisms form four rectangle structures including ◻1 . As shown in Figure 1e, the four rectangle structures is listed as follows: ◻1 ({A, B, C, D}, {π1 , π2 , π3 , π4 }, △1 , △2 ), where △1 ({A, B, D}, {π1 , π2 , π2 ○ π1 }) and △2 ({A, C, D}, {π3 , π4 , π4 ○ π3 }); ◻2 ({E, B, C, D}, {q1 , π2 , q2 , π4 }, △3 , △4 ), where △3 ({E, B, D}, {q1 , π2 , π2 ○ q1 }) and △4 ({E, C, D}, {q2 , π4 , π4 ○ q2 }); ◻3 ({E, A, C, D}, {u, π4 ○ π3 , q2 , π4 }, △5 , △4 ), where △5 ({E, A, D}, {u, π4 ○ π3 , π4 ○ π3 ○ u}); ◻4 ({E, B, A, D}, {q1 , π2 , u, π4 ○ π3 }, △3 , △5 ). Apart from ◻1 , other rectangle structures (can be also written as ◻′ ) all start with object E and end with object D. For any ◻′ , morphisms u,


π3 and q2 always form a triangle structure △6 (u, π3 , q2 ), so do morphisms u, π1 and q1 form △7 (u, π1 , q1 ). Definition 2.6. Pushout structure ∐({O1 , ...O5 }, {m1 , ...m9 }, {◻1 , ...◻4 }) is constructed from a rectangle structure ◻1 in which ◻1 (△1 (m1 )) or ◻1 (△2 (m1 )) is either epic or isomorphic. It consists of a set of five objects and a set of nine morphisms, whose objects and morphisms form four rectangle structures including ◻1 . As shown in Figure 1f, the four rectangle structures is listed as follows: ◻1 ({A, B, C, D}, {π1 , π2 , π3 , π4 }, △1 , △2 ), where △1 ({A, B, D}, {π1 , π2 , π2 ○ π1 }) and △2 ({A, C, D}, {π3 , π4 , π4 ○ π3 }); ◻2 ({A, B, C, E}, {π1 , q1 , π3 , q2 }, △3 , △4 ), where △3 ({A, B, E}, {π1 , q1 , q1 ○ π1 }) and △4 ({A, C, E}, {π3 , q2 , q2 ○ π3 }); ◻3 ({A, B, D, E}, {π1 , q1 , π2 ○ π1 , u}, △3 , △5 ), where △5 ({A, D, E}, {π2 ○ π1 , u, u ○ π2 ○ π1 }); ◻4 ({A, D, C, E}, {π4 ○ π3 , u, π3 , q2 }, △5 , △4 ). All rectangle structures in the pushout structure start with object A apart from ◻1 , the other rectangle structures (◻′ ) all end with object E. For any ◻′ , morphisms π2 , u and q1 always form a triangle structure △6 (π2 , u, q1 ), so do △7 (π4 , u, q2 ). More details about pullback and pushout can refer to refs 6 p91. Functors in a higher level, a morphism can also be relationship between two categories C1 and C2 , and this morphism is called functor. Definition 2.7. Functor is a relationship from categories C1 to C2 , denoted as F ∶ C1 → C2 . It is constructed by a tuple (F O ,F M ) where F O is the mapping between the two objects sets C1 .NO and C2 .NO , and F M is the mapping between the two morphisms sets C1 .NM and C2 .NM , and for every pair of objects c, d ∈ C1 .NO , F ∶ MC1 (c, d) → MC2 (F (c), F (d)). A functor can be full when F is surjective, faithful when F is injective, or full faithful when F is bijective, 3. The specification model To generate a measurement plan for a specified specification symbol, the first step is to model the specification into a category S. The second step is to construct a faithful functor from the category S to generate a measurement category M, using GPS concepts and philosophy as the framework. As such elements of object set NO in M can be derived, and then the measurement plan is established.


A specification of a geometrical product, in the form of a zone, is an expression of the permissible variation of a non-ideal feature inside a space limited by an ideal feature(s) 7 . In this paper, we use an areal surface texture (AST) specification as an example to illustrate the modelling process. The areal surface texture specification symbol shown in Fig. 2 is illustrated in ISO 25178-1:2016 8 to explain the meaning of each control element of a specification symbol.

Fig. 2.

An areal surface texture specification symbol.

The first step is to model each specification element as an object of the category SAST , and then to identify the morphisms between objects. The AST specification model can then be constructed as shown in Fig. 3. There are ten objects in the category SAST , where: [Orientation] is the object representing the surface texture lay (item 10 in Fig. 2); [Filter (S,F)] indicates the type of scale-limited surface (item 3); [Nesting Index] represents the two nesting indices from items 4 and 5; [F Nesting Index] is the nesting index of the F operator (item 5); [S Nesting Index] is the nesting index of the S filter (item 4); [Limit Parameter] is the areal parameter (item 6); [Limit Value] represents the limit value of item 7;


[Comparison Rule] represents the comparison rule, which is not shown in the symbol as it is a default setting; [Specification Type] indicates the type of tolerance (upper or lower, item 2). Morphism m1 : [Filter (S,F)] → [Nesting Index] represents the relationship between the two objects, i.e. the S filter and F operator in [Filter (S,F)] have the attribute [Nesting Index (S, F)]; Morphism m3 : [F Nesting Index] ↠ [S Nesting Index] is an epic morphism, and it represents that the value of the nesting index for the F operator decides the value of the nesting index for the S filter; Morphism m2 : [Limit Value × Limit Parameter] ↠ [F Nesting Index] is also an epic morphism, and it states that the product of [Limit Value] and [Limit Parameter] decides the value of the nesting index for the F operator. In SAST there are two product structures, which are ×1 (F Nesting Index, S Nesting Index, p1 , p2 ) and ×2 (Limit Parameter, Limit Value, p3 , p4 ), including the product morphisms p1 , p2 , p3 and p4 . There is one triangle structure △1 ({×2 , F Nesting Index, S Nesting Index}, {m2 , m3 , m3 ○ m2 }).
4. Generating the measurement plan
To construct a faithful functor F ∶ SAST ▸⇒ MAST , all ten objects and seven morphisms in the category S have to be mapped to the measurement category M. Using GPS concepts and philosophy as the framework, more measurement objects and morphisms are then constructed as shown in Fig. 3. [Evaluation Area] represents the measurement evaluation area, which includes the [Orientation], [Shape] and [Size] of the area; [S Nesting Index] decides the [Max Sampling Distance] and [Max Lateral Period Limit], which together form the [Surface Type]; [Measurement Value] will be compared with the [Limit Value] based on the [Comparison Rule], therefore providing the [Conformance Result]. In the category MAST , there are seven product structures, which are:
● ×3 (Orientation, Size, p5 , p6 );
● ×4 (×3 , Shape, {p5 , p6 }, p7 );
● ×5 (F Nesting Index, S Nesting Index, p8 , p9 );
● ×6 (Limit Parameter, Limit Value, p10 , p11 );
● ×7 (×6 , Measurement Value, p12 , p13 );
● ×8 (Comparison Rule, Specification Type, p14 , p15 );
● ×9 (Max Lateral Period Limit, Max Sampling Distance, p16 , p17 );


Fig. 3.

Functor between specification and verification of AST.

and there are the following triangle structures:
● △2 ({×6 , F Nesting Index, Size}, {m10 , m5 , m5 ○ m10 });
● △3 ({F Nesting Index, S Nesting Index, Max Sampling Distance}, {m7 , m8 , m8 ○ m7 });
● △4 ({F Nesting Index, S Nesting Index, Max Lateral Period Limit}, {m7 , m9 , m9 ○ m7 });
● △5 ({Limit Parameter, Shape, Size}, {m11 , m4 , m4 ○ m11 });
● △6 ({×8 , Comparison, Conformance Result}, {m14 , m13 , m13 ○ m14 });
● △7 ({×7 , Comparison, Conformance Result}, {m12 , m13 , m13 ○ m12 });
● △8 ({×6 , F Nesting Index, S Nesting Index}, {m10 , m7 , m7 ○ m10 }).
As all elements of the object set NO of the category MAST can be generated in the modelling process, a full measurement plan can then be generated. The model also provides guidance for the conformance check between the measurement value and the specification.
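A highly simplified sketch of how the object part of the functor F ∶ SAST → MAST can be held as a mapping and used to list measurement-plan items is given below. The object names follow the text, but the mapping and the derived items are illustrative stand-ins rather than the full COL reasoning.

```python
# Illustrative only: a dictionary plays the role of the object part of the
# functor F : S_AST -> M_AST, and the measurement-plan items are derived from
# the specification objects plus the measurement-side objects named in the text.
functor_objects = {
    "Orientation": "Evaluation Area / Orientation",
    "Filter (S,F)": "Filter (S,F)",
    "F Nesting Index": "F Nesting Index",
    "S Nesting Index": "S Nesting Index",
    "Limit Parameter": "Limit Parameter",
    "Limit Value": "Limit Value",
    "Comparison Rule": "Comparison Rule",
    "Specification Type": "Specification Type",
}

derived_measurement_objects = [
    "Evaluation Area (Orientation, Shape, Size)",
    "Max Sampling Distance",        # decided by the S nesting index
    "Max Lateral Period Limit",     # decided by the S nesting index
    "Measurement Value",
    "Conformance Result",
]

def measurement_plan(spec_objects):
    plan = [functor_objects[o] for o in spec_objects if o in functor_objects]
    return plan + derived_measurement_objects

print(measurement_plan(["Orientation", "Filter (S,F)", "Limit Parameter", "Limit Value"]))
```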


5. Conclusion
In this paper, using the category ontology language, smart measurement plans can be generated from the designed specifications by establishing a rigorous semantic mapping from design to measurement. In the proposed method, the semantics of a designed specification was modelled into a category S. Then a faithful functor was constructed from the category S to generate a measurement category M, using GPS concepts and philosophy as the framework. A test case of an areal surface texture specification was used to validate the proposed method. It demonstrates that the method has the required adaptivity: when the design specification is revised or updated, the corresponding measurement plan will be reasoned simultaneously.
References
1. W. J. Verhagen, P. Bermell-Garcia, R. E. van Dijk and R. Curran, A critical review of knowledge-based engineering: An identification of research challenges, Advanced Engineering Informatics 26, 5 (2012).
2. Q. Qi, P. J. Scott, X. Jiang and W. Lu, Design and implementation of an integrated surface texture information system for design, manufacture and measurement, Computer-Aided Design 57, 41 (2014).
3. A. B. Feeney, S. P. Frechette and V. Srinivasan, A portrait of an ISO STEP tolerancing standard as an enabler of smart manufacturing systems, Journal of Computing and Information Science in Engineering 15, p. 021001 (2015).
4. ISO 14638:1995, Geometrical product specification (GPS) - Masterplan (International Organization for Standardization, Geneva, CH, 1995).
5. F. W. Lawvere and S. H. Schanuel, Conceptual mathematics: a first introduction to categories (Cambridge University Press, 2009).
6. S. Awodey, Category Theory, volume 52 of Oxford Logic Guides (Oxford University Press, Oxford, 2010).
7. ISO 17450-1:2011, Geometrical Product Specifications (GPS) - General concepts - Part 1: Model for geometrical specification and verification (International Organization for Standardization, Geneva, CH, 2011).
8. ISO 25178-1:2016, Geometrical Product Specifications (GPS) - Surface texture: Areal – Part 1: Indication of surface texture (International Organization for Standardization, Geneva, CH, 2016).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 324–331)

Analysis of a regional metrology organization key comparison: Preliminary consistency check of the linking-laboratory data with the CIPM key comparison reference value Katsuhiro Shirono National Metrology Institute of Japan, National Institute of Advanced Industrial Science and Technology, Tsukuba, 305-8565, Japan E-mail: k [email protected] Maurice G. Cox National Physical Laboratory, Teddington, TW11 0LW, UK E-mail: [email protected] The validation of the data reported by linking laboratories in a regional metrology organization (RMO) key comparison (KC) is discussed in this study. The linking laboratories are the laboratories that participate in both the RMO KC and the corresponding International Committee of Weights and Measures (CIPM) KC. Even though the main purpose of an RMO KC is the performance evaluation of the non-linking laboratories, it is required that the data from linking laboratories are reliable to analyze an RMO KC data for that purpose. The validation is implemented through statistical testing. Results of simulation show that the influence of correlation is not marginal. Since correlation is not generally considered in the analysis of a single KC, the proposed testing can be used in the RMO KC as a preliminary analysis before the performance evaluation.

1. Introduction
Linking of key comparisons (KCs) conducted by a regional metrology organization (RMO) and by the International Committee of Weights and Measures (CIPM) is discussed in terms of statistical testing. The mutual recognition arrangement (MRA) 1 has played a primary role in realizing global metrological traceability, because the calibration and measurement capabilities (CMCs) of a national metrology institute (NMI) are established through the framework of the MRA. Ongoing effort is needed to maintain CMCs in order to deliver scientific advances to industry. Key comparisons (KCs) support CMCs from a technical point of view.


It is common practice in a KC to evaluate the consistency of the reported data in aggregate. Following the consistency check the degrees of equivalence are computed. When consistency is not confirmed, some other data processing including the removal of some outliers from the computation of the reference value is often implemented. Note that any outliers should always be ratified as such by the appropriate working group of the CIPM Consultative Committee (CC) concerned. It is emphasized that all KC participants’ data are used in the reported degrees of equivalence. For the same purpose, a similar approach is desirable also in the analysis of an RMO KC. However, it may not be so simple to yield an appropriate statistic, because the results of an RMO comparison have to be linked to the CIPM key comparison reference value (KCRV) in accordance with the MRA, which states in its Technical supplement T.4: “the results of the RMO key comparisons are linked to key comparison reference values established by CIPM key comparisons by the common participation of some institutes in both CIPM and RMO comparisons”. In this linking procedure, it is assumed that the linking laboratories, that is “some institutes in both CIPM and RMO comparisons” 1 , report reliable values. When it happens that values from the linking laboratories are not reliable, the required linking cannot faithfully be implemented. In this study, some statistical models are given to develop a preliminary statistical test to check the consistency of the reported data from the linking laboratories in an RMO KC, considering the linking to the CIPM KCRV. This form of consistency checking is useful when the correlation between reported values from a linking laboratory were not quantified in a CIPM, because the CIPM and RMO data from a linking laboratory can be correlated. A common form of correlation occurs when a laboratory consistently provides values that are greater than a consensus value such as the KCRV, for instance. Some previous studies have been reported 2,3 for the analysis for an RMO KC; our proposal is different from them from both the procedural and mathematical perspectives. The complete performance evaluation is the topic of a further paper in preparation. This paper is organized as follows: Section 2 shows the assumptions made. Many symbols employed in this study are defined in this section. Section 3 gives the statistical testing method to check the validity of the data from linking laboratories. Section 4 shows applications of the proposed method to synthetic examples, deliberately chosen to illustrate some valuable points. We briefly summarize this study in section 5.


2. Assumptions
Consider L linking laboratories participating in both a CIPM and an RMO KC. Suppose a data set consisting of m reported values in the CIPM KC, including the L data from the linking laboratories, has been used to determine the CIPM KCRV. Let x1 , . . . xL denote the reported values from the linking laboratories, and xL+1 , . . . xm denote the reported values from the non-linking laboratories included in the computation of the CIPM KCRV. The consistency of the reported data has been confirmed by comparing the χ2 score given by the following expression with an appropriate value from the χ2 distribution with m − 1 degrees of freedom 4 :
\[ \chi^2_{\mathrm{KC}} = \sum_{i=1}^{m} \frac{(x_i - x_{\mathrm{ref}})^2}{u^2(x_i)} , \qquad (1) \]
where
\[ x_{\mathrm{ref}} = \frac{\sum_{i=1}^{m} x_i\, u^{-2}(x_i)}{\sum_{i=1}^{m} u^{-2}(x_i)} . \qquad (2) \]
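As a small numerical illustration of formulas (1) and (2), the following Python fragment computes the weighted-mean reference value and the observed chi-squared score for the case 1 CIPM data of Table 1 (section 4); it reproduces the values 2.0 and 9.5 quoted there. The use of scipy for the chi-squared percentile is an implementation choice, not part of the method.

```python
import numpy as np
from scipy.stats import chi2

# Weighted-mean reference value and observed chi-squared score, Eqs. (1)-(2),
# using the case 1 CIPM data of Table 1.
x = np.array([-0.5, 0.5, 0.0, 0.0, 0.0])
u = np.array([0.5, 0.5, 0.9, 0.9, 0.9])

w = 1.0 / u**2
x_ref = np.sum(w * x) / np.sum(w)              # Eq. (2)
chi2_kc = np.sum((x - x_ref) ** 2 / u**2)      # Eq. (1)
threshold = chi2.ppf(0.95, df=len(x) - 1)      # 95th percentile, m-1 dof

print(f"x_ref = {x_ref:.3f}, chi2_KC = {chi2_kc:.2f}, limit = {threshold:.1f}")
```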

The test assumes that, for each i, xi is a random draw from a normal distribution with a common mean µx and standard deviation u(xi ). We consider only cases where the CIPM KCRV is given as the weighted mean of x1 to xm as shown in formula (2). In the MRA document, linking to the CIPM KCRV is required. (Any outliers that may have been removed from the computation of the CIPM KCRV and the χ2 score by previous agreement in the working group are no concern of this paper, since the CIPM KC has been completed and is not to be influenced by the linking process.) In the RMO KC, it is assumed that laboratories 1 to L in the CIPM KC report y1 to yL with associated standard uncertainties u(y1 ) to u(yL ). Furthermore, it is assumed that, for each i, correlation exists between xi and yi . The corresponding correlation coefficient ρi can be deduced from the uncertainty budgets for the CIPM and RMO KCs 5 .
3. Preliminary consistency check for the data from linking laboratories
To develop the testing method, it is considered that xi for i = 1 to m and yi for i = 1 to n are derived from distributions whose means are µx and µy , respectively. Let µ = (µx , µy )⊤ and define, for i = 1, 2, ..., L,
\[ z_i = \begin{pmatrix} x_i \\ y_i \end{pmatrix}, \qquad \Sigma_i = \begin{pmatrix} u^2(x_i) & \rho_i u(x_i) u(y_i) \\ \rho_i u(x_i) u(y_i) & u^2(y_i) \end{pmatrix} . \]


We assume the following statistical model for data in the CIPM KC and data from the linking laboratories in the RMO:
\[ z_i \sim \mathrm{N}(\mu, \Sigma_i), \qquad i = 1, \ldots, L, \qquad (3) \]
\[ x_i \sim \mathrm{N}(\mu_x, u^2(x_i)), \qquad i = L+1, \ldots, m. \qquad (4) \]

The value
\[ \tilde{x} = u^2(\tilde{x}) \sum_{i=L+1}^{m} \frac{x_i}{u^2(x_i)} , \qquad (5) \]
where
\[ u^2(\tilde{x}) = \left[ \sum_{i=L+1}^{m} \frac{1}{u^2(x_i)} \right]^{-1} , \]
is considered as a realization of the following distribution† :
\[ \tilde{x} \sim \mathrm{N}(\mu_x, u^2(\tilde{x})) . \qquad (6) \]

The estimates, x∗ and y ∗ , of µx and µy using the data in models (3) and (6) are
\[ z^* = \begin{pmatrix} x^* \\ y^* \end{pmatrix} = \left\{ \sum_{i=1}^{L} \Sigma_i^{-1} + \begin{pmatrix} u^{-2}(\tilde{x}) & 0 \\ 0 & 0 \end{pmatrix} \right\}^{-1} \left\{ \sum_{i=1}^{L} \Sigma_i^{-1} \begin{pmatrix} x_i \\ y_i \end{pmatrix} + \begin{pmatrix} \tilde{x}\, u^{-2}(\tilde{x}) \\ 0 \end{pmatrix} \right\} . \]

We propose the preliminary consistency check should be carried out using the statistic
\[ \chi^2_{\mathrm{pre}} = \sum_{i=1}^{L} (z_i - z^*)^{\top} \Sigma_i^{-1} (z_i - z^*) + \frac{(\tilde{x} - x^*)^2}{u^2(\tilde{x})} , \qquad (7) \]

the natural generalization of the statistic used in the conventional χ2 test. This value is to be tested to check whether it can be regarded as a plausible draw from the χ2 distribution with 2L − 1 degrees of freedom when models (3) and (6) are valid. Therefore, defining χ2ν,0.95 as the 95th percentile of the χ2 distribution with ν degrees of freedom, when χ2pre < χ2L−1,0.95 , the data is judged to be consistent. Furthermore, the right side in formula (7) includes the data to check the validity of the information from the linking laboratories and reproduce the CIPM KCRV. In other words, the † In

common with the GUM 6 we use the same symbol for a variable and an estimate of the variable.

328

χ2 score in formula (7) could be an index to check the consistency of the linking-laboratory data with the CIPM KCRV, linking to which is required in the MRA. Based on models (3) and (4), we can consider the alternative statistic χ2alt

=

L X

∗ ⊤

(zi − z )

Σ −1 i (zi

i=1

m X (xi − x∗ )2 , −z )+ u2 (xi ) ∗

(8)

i=L+1

which relates to the χ2 distribution with m + L − 2 degrees of freedom. Since model (6) has been confirmed in the CIPM KC, χ2pre in expression (7) is to be preferred. Some discussion is given in section 4 about the use of χ2alt in expression (8). When consistency is not confirmed, there may be several options. Although the removal of some data from linking laboratories from the analysis is a possible solution from the mathematical point of view, it may not be realistic. Since the number of linking laboratories is usually five or smaller, it may seriously impair the quality of the RMO KC by removing any linking-laboratory data. It is strongly recommended that the reason for any inconsistency is thoroughly investigated from a technical point of view. 4. Examples Table 1 shows three dummy cases of a CIPM and an RMO KC. Laboratories 1 and 2 are linking laboratories in all cases, and correlation coefficients are set only for them. The data from the non-linking laboratories in the RMO KC are not shown because they are not employed in the analysis proposed in this study. Cases 1 and 2 are compared to show the influence of the correlation. Cases 1 and 3 are compared to show the effect of the choice of χ2 score. The data for cases 1 to 3 are shown in Figure 1. 4.1. Influence of the correlation The data in the CIPM KC are identical for cases 1 and 2. The consistency of the data in a CIPM KC can be checked by the χ2 score shown in formula (1). In these cases, the χ2 score is computed as 2.0 and since it is less than χ24,0.95 = 9.5, these data can be regarded as consistent. The correlation coefficient between the two reported values from the linking laboratories in cases 1 and 2 are taken as 0.0 and 0.3, respectively. χ2pre is compared to the 95th percentile of the χ2 distribution with 2L−1 = 3

329

degrees of freedom. In case 1 with ρ1 = ρ2 = 0.0, χ2pre is computed as 7.1, which is less than χ23,0.95 = 7.8, and thus the consistency of the data from the linking laboratories and the data in the CIPM KC is confirmed. In case 2 with ρ1 = ρ2 = 0.3, χ2pre is computed as 9.9, which is larger than χ23,0.95 = 7.8, meaning that consistency is not confirmed. Thus, the choice of correlation coefficient, a value for which is usually not available from the CIPM KC alone, can influence the result of the consistency checking. The data in case 2 were judged to be inconsistent because the data in the RMO KC are unexpected when we consider the positive correlation between the data in the CIPM and the RMO KCs. Positive correlation between two data implies that the deviations of the two data from their mean have the same sign. However, for laboratories 1 and 2, the data in a CIPM and an RMO KC do not have such a property. It can be said that the proposed preliminary check can be helpful to confirm validity of correlations. 4.2. Choice of the test statistic In case 1, the data in the CIPM KC are different from those in case 3. As mentioned, the χ2 score for the CIPM-KC data in case 1 is 2.0 and less than the criterion of χ24,0.95 = 9.5. In case 3, the χ2 score is 9.4, meaning Table 1. Dummy data for the discussion in section 4. The explanations of the symbols are given in the manuscript. Laboratories 1 and 2 are the linking laboratories in all cases, and ρi is set only for them.

Case 1

Case 2

Case 3

CIPM KC Laboratory xi 1 −0.5 2 0.5 3 0.0 4 0.0 5 0.0 1 −0.5 2 0.5 3 0.0 4 0.0 5 0.0 1 −0.5 2 0.5 3 1.0 4 1.0 5 −2.0

u(xi ) 0.5 0.5 0.9 0.9 0.9 0.5 0.5 0.9 0.9 0.9 0.5 0.5 0.9 0.9 0.9

RMO KC Laboratory yi 1 0.8 2 −0.8

u(yi ) 0.5 0.5

Correlation ρi 0.0 0.0

1 2

0.8 −0.8

0.5 0.5

0.3 0.3

1 2

0.8 −0.8

0.5 0.5

0.0 0.0

330

Fig. 1. Dummy KC data shown in Table 1; the CIPM-KC data (a) for case 1 and 2, and (b) for case 3, and (c) the RMO-KC data for all cases. The red horizontal lines in (a) and (b) show the weighted mean of the reported data. The vertical bars show ± twice the standard uncertainty about the measured values.

that an inconsistency in the reported data cannot be detected. The value x ˜ is the same for cases 1 and 3, and its standard uncertainty u(˜ x) are likewise identical. Moreover, the RMO-KC data are identical in both cases. The values of χ2pre obtained by (7) are, thus, identical so that the consistency can be found using χ2pre in both cases. However, when we apply formula (8) for the consistency check, the results could be different. χ2alt is 7.1 and 14.5, for cases 1 and 3, respectively. These values are compared to the 95th percentile of the χ2 distribution with m + L − 2 = 5 degrees of freedom, namely, χ25,0.95 = 11.1. It is concluded that the data in cases 1 and 3 are consistent and inconsistent, respectively. In case 3, there is a difference in the evaluations of the consistency between the tests using χ2pre and χ2alt . However, since the same data are given for the RMO KC in case 3 as those in case 1, a main cause of this difference is the variation in the data from the non-linking laboratories in the CIPM KC. It is noted that the consistency of the data in the CIPM KC has been confirmed through the technical review and the χ2 test. Thus, the statistic of χ2alt may sometimes be too contaminated by the data from the non-linking laboratories in the CIPM KC to check the consistency of the data from the linking laboratories. Extremely, when m ≫ L, χ2alt is essentially determined by the variation only from the non-linkinglaboratory data, and χ2alt cannot be a good index for the variation of the linking-laboratory data. Based on these results we recommend the use of χ2pre for this test.

331

5. Summary A statistical testing method is developed to show the reliability of the data from the linking laboratories in a key comparison conducted by a regional metrology organization. Through simulations, it is demonstrated that the correlation between the two values from a linking laboratory can play an important role. Furthermore, the appropriateness of the employed statistic is discussed in terms of its comparison with an alternative statistic. The method seems particularly helpful when the correlation between measured values from a linking laboratory is not quantifiable from the CIPM KC. Acknowledgments The work of the first author was partly supported by JSPS KAKENHI Grant Number 17K18411. The National Measurement Office of the UK’s Department for Department for Business, Energy and Industrial Strategy supported the second author and in part the first author as part of NPL’s Data Science programme. References 1. Bureau International des Poids et Mesures, Mutual recognition of national measurement standards and of calibration and measurement certificates issued by national metrology institutes (1999). 2. C. M. Sutton, Analysis and linking of international measurement comparisons, Metrologia 41, p. 272 (2004). 3. J. E. Decker, A. G. Steele and R. J. Douglas, Measurement science and the linking of CIPM and regional key comparisons, Metrologia 45, p. 223 (2008). 4. M. G. Cox, The evaluation of key comparison data, Metrologia 39, 589 (2002). 5. M. A. Collett, M. G. Cox, T. J. Esward, P. M. Harris and J. A. Sousa, Aggregating measurement data influenced by common effects, Metrologia 44, 308 (2007). 6. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, Evaluation of measurement data — Guide to the expression of uncertainty in measurement Joint Committee for Guides in Metrology, JCGM 100:2008.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 332–340)

Stationary increment random functions as a basic model for the Allan variance

T. N. Siraya

Concern CSRI Elektropribor, JSC, 30, Malaya Posadskaya St., 197046, St. Petersburg, Russia
E-mail: [email protected]

The Allan variance is widely used nowadays as an empirical characteristic of data scattering. The paper presents a metrological scheme for a formal definition of the Allan variance, which relies on the determination of a data model and a scale characteristic. The space of random processes with stationary increments is proposed as the basic model for data, so that the main scale characteristic is the structure function. Within this model the Allan variance turns out to be an estimate of a corresponding parameter.

1. Introduction

Nowadays the Allan variance is widely used as an empirical characteristic of data scattering [1]. It is defined via differences of data means on successive intervals:

σ²_a(τ) = (1 / (2(n − 1))) Σ_{k=1}^{n−1} (x̄_{k+1}(τ) − x̄_k(τ))²,   (1)

where x̄_k(τ), k = 1, …, n, are the mean values of the signal x(t) on the intervals (kτ, (k+1)τ).
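As a concrete illustration of definition (1), the short sketch below (not part of the original paper) computes the Allan variance of interval means of a sampled signal; the simulated white-noise input is only a placeholder.

```python
import numpy as np

def allan_variance(signal, m):
    """Allan variance per Eq. (1): m samples are averaged per interval tau."""
    n = len(signal) // m
    xbar = signal[:n * m].reshape(n, m).mean(axis=1)   # interval means x_k(tau)
    diffs = np.diff(xbar)                              # x_{k+1}(tau) - x_k(tau)
    return np.sum(diffs ** 2) / (2.0 * (n - 1))

rng = np.random.default_rng(0)
y = rng.normal(size=100_000)            # placeholder signal: white noise
for m in (1, 10, 100, 1000):
    print(m, allan_variance(y, m))      # for white noise this falls roughly as 1/m
```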

where xk (), k = 1...n, are mean values of signal x(t) on intervals (k, (k+1)). The Allan variance (AVAR) was originally introduced as a stability characteristic for time and frequency standards [1], and now there are many fields of AVAR applications [2-4]. AVAR has proved to be especially useful in the non-stationary cases, including the Wiener process, white and flicker (1/f type) noises [3, 4]. However, there are still some cases, when the physical interpretation of AVAR seems to be vague. It may be caused by the empirical origin of this characteristic. The traditional metrological way for definition of an error characteristic [5] assumes determination of a random model and a scale parameter to be estimated. As regards AVAR, it was introduced empirically [1], omitting the definition of the model and scale parameter. If a proper model and scale parameter are established, 332

333

it would allow for a better study of AVAR properties, and its relations with other estimates, such as the sample variance. In this paper the space of random processes with stationary increments (SIRP space) [6] is proposed as a basic model for the formal definition of AVAR, and structure function is considered as the main scale parameter. Within this model, AVAR would be an estimate of a similar parameter. 2. General Methodology for Error Characteristic Definition While studying error characteristics, one should rely on a model of measurement. Thus, the measurand is usually defined as a location parameter within the model, and error characteristics are the corresponding scale parameters [5]. The definition and estimation of the error characteristic may be briefly presented as follows [5]: – a type or class of errors is considered, for instance, random errors; – a model for the error is stated, for instance, random variable; – a characteristic is defined (theoretically) as a scale parameter of the model, for instance, variance of the random variable; – an estimate of this characteristic is defined, for instance, the sample variance. The general scheme of the measurement error estimation is presented in Fig. 1.

Figure 1. General scheme of measurement error estimation.

This simple scheme allows one to define and estimate the error characteristics, and also to compare the properties of various error characteristics. In view of the increasing popularity of AVAR, it would be useful to obtain its formal definition, and also to clarify its relations with the classical sample variance and other statistics.


Thus, the initial comparison of the variance and AVAR may be presented as Table 1.

Table 1. Initial comparison of the variance and the Allan variance.

  Stages of error evaluation                              Variance          Allan variance
1 Error model                                             Random variable   ???
2 Model scale parameter (theoretical characteristic)      Variance          ???
3 Estimate of characteristic (empirical characteristic)   Sample variance   AVAR

Similar tables may be formed to compare other pairs of error characteristics, but the cells for AVAR would again be vacant. The aim of this paper is to fill the gaps in this table in order to clarify the metrological sense of the Allan variance.

3. Basic Information on the Allan Variance

AVAR has in fact long been used in mathematical statistics, simply as an alternative estimate of the variance. In particular, the ratio of AVAR to the sample variance was used as the test statistic of the mean-square successive difference test (Abbe test) [7] for a classical sample:

r = σ²_a / S².   (2)

This test is used in metrology to detect systematic errors or a shift in a data sequence x1, ..., xn.

Nowadays AVAR is widely used in metrology and in precision devices. It was originally introduced in time and frequency measurements [1], but it is now used in many fields of measurement [2, 3]. AVAR is also standardized for some groups of navigation devices and is regulated in IEEE standards [4]. A common way of applying AVAR to the study of device errors may be presented as follows:
– decompose the errors into typical components: x(t) = Σ xi(t);
– evaluate the corresponding Allan variances σ²_{a,i}(τ) of the typical components;
– estimate the empirical Allan variance (for the given data set);
– compare the empirical Allan variance σ²_a(τ) with the typical components (select the main components and estimate their parameters).
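Returning to the Abbe-type ratio (2) above, a minimal sketch of its computation (with invented placeholder data, purely for illustration):

```python
import numpy as np

def abbe_ratio(x):
    """r = sigma_a^2 / S^2: mean-square successive difference over sample variance."""
    x = np.asarray(x, dtype=float)
    avar = np.sum(np.diff(x) ** 2) / (2.0 * (len(x) - 1))
    s2 = np.var(x, ddof=1)
    return avar / s2

rng = np.random.default_rng(1)
iid = rng.normal(size=500)
drift = iid + 0.01 * np.arange(500)       # same noise plus a linear drift
print(abbe_ratio(iid))    # close to 1 for independent data
print(abbe_ratio(drift))  # noticeably below 1 when a drift is present
```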


In particular, the error components of gyroscopes and accelerometers are usually taken as follows [4]:

x(t) = x1(t) + x2(t) + x3(t) + x4(t) + x5(t),   (3)

where x1(t) is the angle/velocity quantization noise; x2(t) is rate white noise (angle/velocity random walk); x3(t) is bias instability (rate/acceleration flicker noise); x4(t) is rate or acceleration random walk; and x5(t) is a linear drift (trend). The models of the typical components and the mathematical expectations of the corresponding Allan variances σ²_i(τ) = E[σ²_{a,i}(τ)] are presented in Table 2.

Table 2. Typical error components.

Component      Type of component    Allan variance expectation
x1(t)          quantization noise   σ²_1(τ) = c1 / τ²
x2(t)          white noise          σ²_2(τ) = c2 / τ
x3(t)          flicker noise        σ²_3(τ) = c3
x4(t)          Wiener process       σ²_4(τ) = c4 τ
x5(t) = R t    linear function      σ²_5(τ) = c5 τ²

The resulting decomposition of AVAR into typical components is then

σ²_a(τ) = Σ σ²_i(τ) = c1 / τ² + c2 / τ + c3 + c4 τ + c5 τ².   (4)

Typical AVAR curves (or Allan deviations) are usually plotted versus averaging time on a log-log scale [4], as in Fig. 2. Stationary processes fall outside scheme (3). In this case the mathematical expectation of AVAR, σ²_{a0}(τ), may be expressed via the spectral density f(ω) [4]:

σ²_{a0}(τ) = 2 ∫ f(ω) sin⁴(ωτ/2) / (ωτ/2)² dω.   (5)

This is not a monotone function of τ, so it cannot be easily identified. The typical plots are useful for the analysis of the empirical Allan variance σ²_a(τ) and for the detection of the main typical components [4].


Figure 2. Log-log plots of typical Allan deviations versus sample time: 1 – quantization noise; 2 – white noise; 3 – flicker noise; 4 – Wiener process; 5 – linear trend.
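One convenient numerical way to carry out this identification is a non-negative least-squares fit of decomposition (4) to an empirical AVAR curve over a grid of τ values. The sketch below is only an illustration with invented numbers, using scipy's nnls solver; it is not a procedure prescribed in the paper.

```python
import numpy as np
from scipy.optimize import nnls

# tau grid and a placeholder empirical Allan variance curve
tau = np.logspace(-1, 2, 20)
avar_emp = 0.5 / tau**2 + 0.2 / tau + 0.05 + 0.01 * tau   # example values only

# Design matrix for Eq. (4): columns 1/tau^2, 1/tau, 1, tau, tau^2
A = np.column_stack([tau**-2, tau**-1, np.ones_like(tau), tau, tau**2])
coeffs, _ = nnls(A, avar_emp)            # non-negative coefficients c1..c5
print(dict(zip(["c1", "c2", "c3", "c4", "c5"], coeffs)))
```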

This empirical approach often reveals the main error components, so that more accurate methods can then be applied. Sometimes, however, the typical components are not clearly identified, partly because the physical meaning of AVAR is still not clear.

4. Basic Model for the Allan Variance Definition

In order to define a basic model for AVAR, the following requirements should be taken into account:
– the basic model should include the space of stationary processes (since this case is the most common in practice);
– the model should also include several groups of non-stationary processes, such as the Wiener process and white and flicker noise;
– the model should allow for simple kinds of trends, such as a linear trend;
– the model should include a scale parameter with a clear interpretation;
– AVAR should be a consistent estimate of this scale parameter.
The best model for the AVAR definition could be the set of Wiener processes; this is supported on both statistical and functional grounds. Indeed, in this case the sequence of differences of the data means on successive intervals forms a classical sample. Nevertheless, the Wiener model is too restrictive for practice, as it does not include stationary processes. A natural extension of the stationary model is the space of random processes with stationary increments (SIRP space) [6].


This is quite a relevant extension, since the SIRP space includes both stationary processes and some non-stationary processes, such as Wiener processes and white and flicker noise. It also allows for linear trends. These processes were introduced by A. N. Kolmogorov and A. M. Yaglom [6]. They have been successfully employed in research on liquid and airstream turbulence [8] and other physical phenomena. The basic definitions concerning the SIRP space are listed below.

A random process X(t), −∞ < t < ∞, is called [6] a random process with stationary increments (SIRP) if the mathematical expectation of the increment over a time interval is proportional to the length of the interval, and the covariance of the increments depends only on the differences of the time instants:

E(X(s) − X(t)) = a (s − t),
D1(t; u, s) = E[(X(u) − X(t))(X(s) − X(t))] = D(u − t, s − t).

Thus, the process is entirely determined by the structural function of two variables:

D(τ1, τ2) = E[(X(t + τ1) − X(t))(X(t + τ2) − X(t))].   (6)

Moreover, a real-valued process is completely determined by the reduced structural function of one variable:

D0(τ) = E[(X(t + τ) − X(t))²].   (7)

The SIRP space is an essential extension of the space of stationary processes, but it still admits spectral representations of the process and of the structural function. The SIRP has a spectral representation of the form

X(t) = ∫ [(e^{iωt} − 1) / (iω)] dZ(ω) + X0,   (8)

where Z(ω) is a random process with non-correlated increments. Similarly, the structural function has a spectral representation of the form

D(τ1, τ2) = ∫ [(e^{iωτ1} − 1)(e^{−iωτ2} − 1) / ω²] dF(ω),   (9)

where F(ω) is an increasing function corresponding to the process Z(ω):

E |Z(ω) − Z(ω′)|² = F(ω) − F(ω′),   ω > ω′.   (10)

In particular, the spectral representation of a real-valued process simplifies to

D0(τ) = 2 ∫ [(1 − cos ωτ) / ω²] dG(ω).   (11)

However, there are some peculiarities of the SIRP spectral representations.


For instance, increasing function F() is bounded only in the case of a differentiable process. In a general case [6], function F() only meets a weaker condition: for any a > 0  a 2 2 a dF ( ) /    ,  dF ( ) /    .

(12)

In particular, the Wiener process is not differentiable in the usual sense, and the corresponding function F(ω) = c·ω is unbounded; nevertheless it satisfies conditions (12).

The main properties of SIRP suggest that it is appropriate to consider the set of SIRP as a basic model for the data under processing. Within the SIRP model, the structure function D0(τ), defined by (7), is the main scale characteristic. It also possesses a spectral representation of the form (11).

Taking into account the empirical formula (1) defining AVAR, it is natural to consider AVAR σ²_a(τ) as an empirical estimate of the main scale parameter D0(τ)/2. This is indeed true in the case of discrete time (a random sequence). In the case of continuous time, one should take the signal smoothing into account; this effect is clearly seen from Equation (5) in the stationary case. In practice, one uses the "smoothed" signal x̄_τ(t), the mean value of the signal x(t) on the interval (t, t + τ). The "smoothed" process x̄_τ(t) is also a SIRP, and it is defined by a structural function D_τ(t). AVAR σ²_a(τ), as defined by (1), is then an empirical estimate of this function at t = τ, that is, of D_τ(τ)/2. The structural function D_τ(t) also allows a spectral representation of the form (11), which implies a similar representation for the mathematical expectation of AVAR, σ²_{a0}(τ). This spectral representation is a natural generalization of Equation (5), which was given for the stationary process.

Consequently, the initial Table 1 comparing the classical variance and AVAR can now be completed, as shown in Table 3.

Table 3. Final comparison of the variance and the Allan variance.

  Stages of error evaluation                              Variance              Allan variance
1 Error model                                             Random variable ξ     Stationary increment random process x(t)
2 Scale parameter of model (theoretical characteristic)   Variance D(ξ)         Structure function D_τ(τ)
3 Estimate of characteristic (empirical characteristic)   Sample variance S²    Allan variance σ²_a(τ)
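To make the correspondence in Table 3 concrete, the small numerical check below (an illustration, not taken from the paper) estimates the reduced structure function of a discrete random-walk sequence; with one sample per interval, D̂0(1)/2 and the Allan variance coincide, so AVAR indeed estimates half the structure function for a random sequence.

```python
import numpy as np

def structure_function(x, lag):
    """Empirical D0(lag) = mean of (x(t + lag) - x(t))^2, cf. Eq. (7)."""
    d = x[lag:] - x[:-lag]
    return np.mean(d ** 2)

def allan_variance(x, m):
    n = len(x) // m
    xbar = x[:n * m].reshape(n, m).mean(axis=1)
    return np.sum(np.diff(xbar) ** 2) / (2.0 * (n - 1))

rng = np.random.default_rng(2)
w = np.cumsum(rng.normal(size=50_000))   # discrete random walk (a Wiener-like SIRP)

print(structure_function(w, 1) / 2)      # D0(1)/2
print(allan_variance(w, 1))              # essentially the same number
```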


Thus, it is clear that AVAR σ2a() is an empirical estimate of the scale parameter, which differs from the variance of the random variable. So, the question of the advantage of one characteristic over the other one is not relevant. In practice, AVAR works quite well in some cases, where the classical variance is not effective, such as Wiener processes, white and flicker processes. On the other hand, the classical variance and spectral representation are quite effective tools in the cases of stationary or similar processes. Therefore, the classical variance cannot be replaced by the Allan variance. These are two empirical estimates for different scale parameters, which are effective in diverse fields of application. In a general case, the Allan variance and the classical variance could successfully work together and complement each other in different ways. Conclusion 1 The random processes with stationary increments are useful as the basic model for defining the Allan variance. This set of processes includes stationary and some non-stationary processes (Wiener, white and flicker processes), and also linear trends. 2 The main scale parameter within the SIRP model is the structure function D(), and the Allan variance σ2a() is an empirical estimate of the “smoothed” scale parameter D()/2. 3 The Allan variance σ2a(), and the classical variance S2 are the empirical estimates for the different scale parameters, which are effective in diverse fields of application. Acknowledgments This paper is supported by Russian Foundation for Fundamental Research; project No. 16-08-00801. References 1. 2.

Allan D. W., Ashby N., Hodge C. C. The Science of Timekeeping. – Application Note 1289 (Hewlett-Packard Company, 1997). Allan D. W. Historicity, Strengths, and Weaknesses of Allan Variances and Their General Applications. Proc. 22nd St. Petersburg International Conference on Integrated Navigation Systems (CSRI Elektropribor, St. Petersburg, 2015).

340

3.

4. 5.

6. 7. 8.

Witt T. J. Testing for Correlations in Measurements with the Allan Variance. Advanced Mathematical and Computational Tools in Metrology IV (World Scient. Publ., New York, 2000). IEEE Std 952-1997. IEEE Standard Specification Format Guide and Test Procedure for Single-Axis Interferometric Fiber Optic Gyros. Granovsky V. A., Siraya T. N. Measurement Quality Characteristics In Metrology: Systemic Approach. Proc. XVII IMEKO World Congress (Dubrovnik, 2003). Yaglom A. M. Correlation Theory of Stationary and Related Random Functions. – Vol. 1, Vol. 2. (Springer-Verlag, New York, 1987). Brownlee K. A., Statistical Theory and Methodology in Science and Engineering. (J. Wiley & Sons, New York, 1965). Monin A. S., Yaglom A. M. Statistical Fluid Mechanics. – Vol. 1. (Cambridge, Mass., USA: MIT Press, 1971).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 341–348)

Modelling a quality assurance standard for emission monitoring in order to assess overall uncertainty*

T. O. M. Smith

Emissions and Atmospheric Sciences Group, National Physical Laboratory, London, TW11 0LW, UK
E-mail: [email protected]
www.npl.co.uk

Emissions of many pollutants to the atmosphere are controlled to protect human health and the environment. Operators of large combustion plants are required under the Industrial Emissions Directive to continuously measure various pollutants and report their measurements. The directive and associated standards set uncertainty requirements for this process. The main governing standard is EN 14181:2014, covering the required quality assurance regime for automated measuring systems (AMS). Uncertainty assessment of single measurements is fairly straightforward, but understanding the overall uncertainty of annual mass emissions from an instrument calibrated only every 3-5 years requires significant effort. NPL has modelled a gas analyser running according to the quality assurance scheme described in EN 14181:2014, with each measurement including a suite of uncertainties. These measurements include recorded values and those from the quality assurance tests, QAL2 (calibration), QAL3 (regular drift correction test) and AST (annual surveillance test to check the calibration function), both from the AMS and from a standard reference method (SRM) where appropriate. The model uses a Monte-Carlo approach to provide an overall uncertainty for the annual mass emission. The model has been tested with data from hypothetical good, ok and barely-meeting-requirements AMS, illustrating the effects of improvements in uncertainty from monitoring equipment on overall reported emission totals. While meeting measurement uncertainty limits has to be demonstrated, operators are not required to report emission totals with uncertainties. The result is little understanding of the impact this uncertainty has. The NPL model demonstrates this impact, encouraging industry to improve their measurement quality.

Keywords: Monte-Carlo simulation; Emission monitoring; Process uncertainty assessment; EN 14181:2014.

* Developed as part of the IMPRESS (Innovative Metrology for Pollution Regulation of Emissions and area Sources) project. Funded under the European Metrology Research Programme and the UK's Department for Business, Energy and Industrial Strategy National Measurement System under the Optical, Gas and Particle Metrology Programme.


1. Introduction

Industrial emissions to air pose a hazard to human health [1; 2] and so are controlled by regulation. The current European legislation governing this is the Industrial Emissions Directive (IED) [3]. The IED sets emission limits for different types of pollution, which vary depending on the emission source. The IED also sets maximum uncertainty levels for measurements, expressed as a percentage of the emission limit value (ELV), as shown in Table 1.

Table 1. Maximum allowable measurement uncertainty for pollutants as defined in the IED.

Pollutant                 Maximum uncertainty as % of ELV
Carbon monoxide (CO)      10
Sulphur dioxide (SO2)     20
Nitrogen oxides (NOx)     20
Dust                      30

For large combustion plants (LCP) many gaseous pollutants must be continuously monitored. The IED requires that all sampling and analysis is carried out in accordance with CEN standards, including the quality assurance of the automated measuring systems (AMS) used for continuous monitoring. EN 14181 [4] fulfils this role, setting out the measurement process required to measure within the uncertainty limits set. However, the ability of EN 14181 to control the measurement process and keep uncertainty within the legislated limits has not been conclusively tested.

Assessing the uncertainty of a single measurement is well understood and is best explained in the Guide to the expression of uncertainty in measurement (GUM) [5; 6]. However, assessing the uncertainty of complex processes, with regular measurements occurring over long periods and with years between calibrations, is difficult and therefore not fully undertaken. EN 14181 assumes that, over the many individual measurements that make up the reported emissions, any random uncertainty sources (e.g. repeatability) will average out, leaving a process that meets the IED uncertainty requirements. However, there are systematic uncertainty sources that will not cancel in this way and that might be missed because of these assumptions. Compliance with EN 14181 is considered to demonstrate that the uncertainty requirements are met, so emission measurements are reported without uncertainty. Given the significant uncertainty allowed by the IED, there is potential for national and international inventories to contain major uncertainty.

NPL has modelled the full process to assess the actual uncertainty that can be achieved when continuously monitoring an emission source. Using Monte-Carlo simulation, the model results indicate the overall process uncertainty. By altering variables within the model it is possible to investigate a wide range of scenarios to fully test the ability of EN 14181 to maintain uncertainty within acceptable limits.

2. EN 14181

EN 14181 sets out a number of quality assurance levels (QAL), processes that have to be undertaken to make sure measurements are of sufficient quality (Figure 1). Initial checks of the suitability of the equipment, including type approval, come under QAL1, covered by the EN 15267 [7] family of standards. Once the AMS is installed it has to be tested and calibrated; this is done by direct comparison with a standard reference method (SRM) to set a calibration function (QAL2). Once operating, periodic tests are required to check for drift and loss of precision by measuring a reference sample (QAL3). Annual surveillance tests (AST) alongside the SRM are carried out to check that the calibration function is still valid.

Figure 1. Quality assurance levels defined by EN 14181 to ensure that automated measurement systems for monitoring industrial emissions maintain acceptable levels of performance.

The QAL tests mentioned above form the basis of the procedures in EN 14181:2014. Calibration is done in situ, with the AMS measuring as usual, in parallel with a standard reference method (SRM). The SRM varies depending on the pollutant species being measured, but the methods are described in international standards (e.g. chemiluminescence for NOx, as described in EN 14792:2005 [8]). These SRMs have demonstrated their ability to measure the specified species, so they are treated as an error-free standard against which the AMS is calibrated. The SRM in the standard is not the only acceptable method: an alternative method that has demonstrated equivalence according to EN 14793 [9] can be used instead, provided it has been approved by the relevant competent authority (e.g. the Environment Agency in England, SEPA in Scotland, etc.).

The QAL2 consists of at least fifteen parallel measurements with the AMS and SRM during normal operation of the plant. The differences between the AMS and SRM results are used to calculate a calibration function, which is applied to readings from the AMS to correct them during normal operation. The calculation of the calibration function varies depending on the data, to allow for circumstances where the results are not well distributed over the full measurement range. The calibration function is tested annually, with a shorter run of parallel measurements against the SRM, to check that it is still valid. AST and QAL2 testing is carried out by specialist stack monitoring teams with accreditation approved by the competent authority. Between the QAL2 and AST events the plant operator carries out routine QAL3 testing to detect problems with loss of precision or instrumental drift. This involves repeated measurement of a reference sample to highlight changes in measurement response over time. Control charts are used to track the variation. The NPL model implements cumulative sum (CUSUM) control charts as they provide the ability to correct for drift, reducing the need for additional recalibrations.

3. NPL Model

Each measurement made within the model is tracked as the actual value that was measured (the "true" value), which is error adjusted in each Monte-Carlo repeat by sampling a probability distribution function (pdf) for each source of error (Figure 2). The type and spread of the pdf are determined by the type of error and its expected magnitude. For example, repeatability would be represented by a normal distribution with a mean of zero and a standard deviation determined by the repeatability uncertainty listed for that measurement. The type approval process under QAL1 will have characterised the expected uncertainties of the sources of measurement error for a particular AMS. For each Monte-Carlo repeat the pdf for each error source is sampled once, and the errors from all the different sources are combined for each measurement in the repeat. By running with sufficient repeats the simulation provides an indication of the potential range of possible measurement results.

The model has been set up to represent a population of instruments measuring the same emission source. This form of implementation allows the investigation of systematic error sources. An example of this is the relatively low number of test laboratories undertaking QAL2 testing. If there is any laboratory-specific bias on their calibrations it will produce a systematic error across the whole sector. Most uncertainty assessments assume that any calibration error will be solely random, cancelling out over the wider population; the model can test this assumption.
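The NPL model itself is written in R; the Python sketch below (with invented error sources, magnitudes and data) only illustrates the general idea described above: each Monte-Carlo repeat perturbs every "true" measurement with draws from a pdf per error source, and sources drawn once per repeat behave systematically rather than averaging out.

```python
import numpy as np

rng = np.random.default_rng(3)
true_values = rng.uniform(20.0, 80.0, size=8760)   # placeholder hourly "true" concentrations
n_repeats = 1000
annual_totals = np.empty(n_repeats)

for r in range(n_repeats):
    # random source: re-drawn for every individual measurement (e.g. repeatability)
    repeatability = rng.normal(0.0, 1.5, size=true_values.shape)
    # systematic source: drawn once per repeat (e.g. a hypothetical QAL2 calibration bias)
    calibration_bias = rng.normal(0.0, 1.0)
    measured = true_values + repeatability + calibration_bias
    annual_totals[r] = measured.sum()               # crude proxy for an annual mass emission

print(f"relative spread of the annual total: {annual_totals.std() / annual_totals.mean():.2%}")
```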

Figure 2. Within the model each measurement is adjusted by sampling from a pdf for each potential source of measurement uncertainty.

Depending on the set-up of certain key variables, the model can also be used to examine the uncertainty of a single instrument. The MCS would then replicate some uncertainties across all repeats to represent multiple instances of the same instrument, testing the repeatability of the AMS and how it evolves over the five-year period between calibrations.

A major element of the model is the implementation of the QAL2, as this sets the calibration function for the measurements that will contribute to the annual mass emission. Since the SRM measurements are considered free of bias and assumed to be correct, they were included in the model with a single uncertainty value. In reality the SRM will have bias and its own measurement errors, and the model has the ability to consider this, but for much of the model testing and validation this was set to zero, following the assumption set out in EN 14181.

Depending on the measurements, the model has to use different processes, in particular for the handling of clustering in the QAL2 testing. As mentioned previously, there are different scenarios to cope with well-distributed data, low-concentration clusters and high-concentration clusters. For low-concentration clusters the zero reference measurement is used to provide the offset in the calibration function, while for high-level clusters additional measurements at zero and close to the ELV are made using reference materials to ensure the validity of the calibration function over the full range. The model has to be able to detect which case to use, which can vary between MCS repeats because of the errors, and execute correctly to create valid calibration functions for each repeat (Figure 3).
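As an illustration only (EN 14181 prescribes the exact procedures and validity checks, which are omitted here), the basic well-distributed-data case amounts to fitting a straight-line calibration function to at least fifteen parallel AMS/SRM measurements, as in the hypothetical sketch below.

```python
import numpy as np

rng = np.random.default_rng(4)

# >= 15 parallel measurements during normal operation (placeholder values)
srm = rng.uniform(10.0, 90.0, size=15)                  # reference method readings
ams = 0.95 * srm + 2.0 + rng.normal(0.0, 1.5, size=15)  # raw AMS readings

# Straight-line calibration function: corrected = a * AMS + b
a, b = np.polyfit(ams, srm, deg=1)
print(f"calibration: corrected = {a:.3f} * AMS + {b:.3f}")

# Applied to subsequent AMS readings during normal operation
new_ams = np.array([25.0, 50.0, 75.0])
print("corrected readings:", a * new_ams + b)
```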

Figure 3. Flow chart indicating the branching nature of QAL2 that is implemented in the model.

The model also needs to be able to cope with test failures, running recalibrations of the AMS where necessary within a model run (Figure 4). The model records the failure rates for each model repeat, which can be compared with recorded failure rates in the field to validate the model. This is also a useful metric to examine when experimenting with variables within the model, to see the effects of changes to testing frequency and of instrumental changes.

Figure 4. Processes undertaken during a model run, how they are linked and where the model has to account for potential test failures.

4. Testing and Validation

The model was coded in R, a programming language for statistical computing [10]. It was created in a modular, object-oriented format, based on the different processes and sub-processes involved in the measurement regime. During development each of the modules was tested individually, then retested after integration with other modules to ensure that everything was running correctly. These tests were performed with a variety of simplified test data, initially based on constant inputs across the expected range. Once these tests had been satisfactorily completed, varying data based on a sine wave were used to retest over the full expected range. As a final test a white-noise signal was added to the sine wave, pushing the test beyond the expected boundaries.

Once parts of the model were complete they could be validated with real data from actual quality control tests. The model was initially run with all error sources set to zero to ensure that correct results were consistently being achieved in all repeats. Subsequent runs used default error levels, and the failure rates for the QAL testing were assessed to make sure they were in line with occurrence rates in industrial settings.

5. Conclusions

A very complex uncertainty analysis, covering thousands of measurements over a period of up to five years, has been modelled by NPL to verify that the procedures in EN 14181 are capable of meeting the uncertainty requirements in EU legislation. Using Monte-Carlo simulation techniques the model is able to propagate uncertainty into the overall uncertainty of annual mass emission values. The model allows a full sensitivity analysis to be performed, highlighting the areas contributing most to the overall uncertainty. This information can be used by operators and instrument manufacturers to focus improvement of their measurement processes on the areas where it will have the most significant effect. Pollution is reported as mass emissions, so flow is an important element, which is currently being implemented fully within the model. Confidence in emission monitoring is a vital first step in a process of emission reduction.

Acknowledgments

This work was completed as part of the IMPRESS project, Innovative Metrology for Pollution Regulation of Emissions and area SourceS. Funded under the European Metrology Research Programme and the UK's Department for Business, Energy and Industrial Strategy National Measurement System under the Optical, Gas and Particle Metrology Programme.

References
1. Brunekreef, B., Holgate, S. T., 2002, Air pollution and health. The Lancet, 360, pp. 1233-1242.
2. Kampa, M., Castanas, E., 2008, Human health effects of air pollution. Environmental Pollution, 151, pp. 362-367.
3. European Parliament, 2010, Directive 2010/75/EU on industrial emissions (integrated pollution prevention and control). Official Journal of the European Union, L(334), pp. 17-119.
4. BSI, 2014. BS EN 14181:2014 - Stationary source emissions - Quality assurance of automated measuring systems. London: BSI Standards Limited.
5. JCGM, 2008, Evaluation of measurement data - Guide to the expression of uncertainty in measurement.
6. JCGM, 2008, Evaluation of measurement data - Supplement 1 to the "Guide to the expression of uncertainty in measurement" - Propagation of distributions using a Monte Carlo method.
7. BSI, 2007. BS EN 15267-3:2007 - Air quality - Certification of automated measuring systems. London: BSI Standards Limited.
8. BSI, 2005. BS EN 14792:2005 - Stationary source emissions - Determination of mass concentration of nitrogen oxides (NOx) - Reference method: Chemiluminescence. London: BSI Standards Limited.
9. BSI, 2005. CEN/TS 14793:2005 - Stationary source emission - Intralaboratory validation procedure for an alternative method compared to a reference method. London: BSI Standards Limited.
10. R Core Team, 2017. R: A language and environment for statistical computing, Vienna, Austria: R Foundation for Statistical Computing.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 349–356)

Integrating hyper-parameter uncertainties in a multi-fidelity Bayesian model for the estimation of a probability of failure

R. Stroh†,⋄,∗, J. Bect⋄, S. Demeyer†, N. Fischer† and E. Vazquez⋄

† Mathematics and Statistics Department, Laboratoire National de métrologie et d'Essais (LNE), Trappes, France
⋄ Laboratoire des Signaux et Systèmes (L2S), CentraleSupélec, Univ. Paris-Sud, CNRS, Université Paris-Saclay, Gif-sur-Yvette, France
∗ E-mail: [email protected]

A multi-fidelity simulator is a numerical model, in which one of the inputs controls a trade-off between the realism and the computational cost of the simulation. Our goal is to estimate the probability of exceeding a given threshold on a multi-fidelity stochastic simulator. We propose a fully Bayesian approach based on Gaussian processes to compute the posterior probability distribution of this probability. We pay special attention to the hyper-parameters of the model. Our methodology is illustrated on an academic example.

1. Introduction In this article, we aim to estimate the Probability of Failure (PoF) of a system described by a multi-fidelity numerical model. Multi-fidelity simulators are characterized by the fact that the user has to make a trade-off between the realism of the simulation and its computational cost, for instance by tuning the mesh size when the simulator is a finite difference simulator. An expensive simulation gives a high-fidelity result, while a cheap simulation returns a low-fidelity approximation. A multi-fidelity approach combines different levels of fidelity to estimate a quantity of interest. A method for estimating probabilities of exceeding a threshold of a stochastic multifidelity numerical model is proposed in [1]. In this paper, we extend the methodology to a fully Bayesian approach. A stochastic multi-fidelity simulator can be seen as a black-box, which returns an output modeled by a random variable Z from a vector of inputs (x, t) ∈ X × R+ , X ⊂ Rd . The vector x is a set of input parameters of the simulation, and the scalar t controls the fidelity of the simulation. The fidelity increases when t decreases. We denote by Px,t the probability distribution of the output Z at (x, t). We assume that an input distribution 349

350

fX on the input space X and a critical threshold z crit are also provided. The PoF is the probability that the output exceeds the critical threshold Z Px,tref (Z > z crit)fX (x)dx, (1) P = X

where t_ref is a reference level at which we would like to compute the probability. We use a Bayesian approach based on a multi-fidelity Gaussian process model of Z in order to compute a posterior distribution of the PoF. Prior distributions are added on the hyper-parameters of the Gaussian process, so we expect that the posterior distribution of the PoF has better predictive properties. This approach is compared to a classical plug-in approach.

The paper is organized as follows. Section 2 explains the Bayesian multi-fidelity model. Section 3 describes how to take into account the hyper-parameter uncertainties to compute the posterior density of the PoF. Section 4 illustrates the methodology on an academic example.

2. Multi-fidelity Gaussian process

In this section, we present the model proposed in [1]. The output Z at (x, t) is assumed conditionally Gaussian:

Z | ξ, λ ∼ N(ξ(x, t), λ(t)),   (2)

with ξ(x, t) and λ(t) the mean and variance functions, the latter being assumed independent of x for simplicity. Knowing ξ and λ, two different runs of the simulator produce independent outputs. Bayesian prior models are independently added on ξ and λ.

For the mean function ξ, we use the multi-fidelity model proposed by [2, 3]. This model decomposes the Gaussian process ξ(x, t) into two independent Gaussian processes:

ξ(x, t) = ξ0(x) + ǫ(x, t),   (3)

where the process ξ0 describes an ideal simulator, which would be the result at t = 0, and ǫ represents the numerical error of the simulator. The model imposes E[ǫ(x, 0)²] = 0. Moreover, as the fidelity increases when t decreases, the variance of ǫ decreases with t. The ideal process ξ0 is a stationary Gaussian process with constant mean m and stationary covariance c0. The error process ǫ is a centered Gaussian process, independent of ξ0, with a covariance that is separable between x and t. Thus, the distribution of ξ is

ξ ∼ GP(m, c0(x − x′) + r(t, t′) · cǫ(x − x′)).   (4)


The prior distribution of m is a uniform improper distribution on R, which is a classical assumption in ordinary kriging (see [4]). Following the recommendations of [3], a Matérn 5/2 covariance function is selected for c0 and cǫ:

c0(h) = σ0² M_{5/2}( √( Σ_{k=1}^{d} (h_k / ρ_k^0)² ) ),   cǫ(h) = σ0² G M_{5/2}( √( Σ_{k=1}^{d} (h_k / ρ_k^ǫ)² ) ),   (5)

and a distorted Brownian covariance function for the fidelity covariance:

r(t, t′) = ( min{t, t′} / t_LF )^L,   (6)

with σ0², G, L, (ρ_k^0, ρ_k^ǫ)_{1≤k≤d} the 2d + 3 positive hyper-parameters, t_LF the lowest level of fidelity (to ensure r(t, t′) ≤ 1), and M_{5/2} the covariance function M_{5/2}(h) = (1 + √5 h + (5/3) h²) e^{−√5 h}.

In this article, even though the simulator could be observed at any level t, we assume that only S levels t1 > t2 > · · · > tS > 0 are actually observed. Thus, instead of inferring the whole function λ(t), we consider only the parameters (λ(t_s))_{1≤s≤S}. The vector of hyper-parameters θ = { σ0², (ρ_k^0)_{1≤k≤d}, G, L, (ρ_k^ǫ)_{1≤k≤d}, (λ(t_s))_{1≤s≤S} } therefore has length 2d + 3 + S.
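A small sketch of the covariance structure in Equations (4)-(6) is given below; it is only an illustration, and all parameter values are arbitrary placeholders.

```python
import numpy as np

def matern52(h):
    """Matern 5/2 correlation M_{5/2}(h)."""
    return (1.0 + np.sqrt(5.0) * h + 5.0 / 3.0 * h**2) * np.exp(-np.sqrt(5.0) * h)

def k_multifidelity(x1, t1, x2, t2, sigma2, G, L, rho0, rho_eps, t_lf):
    """Covariance of xi at (x1, t1) and (x2, t2): c0(x1 - x2) + r(t1, t2) * c_eps(x1 - x2)."""
    h0 = np.sqrt(np.sum(((x1 - x2) / rho0) ** 2))
    he = np.sqrt(np.sum(((x1 - x2) / rho_eps) ** 2))
    c0 = sigma2 * matern52(h0)
    c_eps = sigma2 * G * matern52(he)
    r = (min(t1, t2) / t_lf) ** L                     # distorted Brownian fidelity term
    return c0 + r * c_eps

# arbitrary example with d = 2 inputs
x1, x2 = np.array([0.2, 0.4]), np.array([0.3, 0.1])
print(k_multifidelity(x1, 1.0, x2, 0.01,
                      sigma2=1.0, G=1.0, L=4.0,
                      rho0=np.array([0.5, 0.5]), rho_eps=np.array([0.5, 0.5]),
                      t_lf=1.0))
```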

3. Dealing with hyper-parameters

In order to carry out a fully Bayesian approach, prior distributions are added on these hyper-parameters. To simplify the estimation and the inference, the hyper-parameters are expressed in log-scale, lθ = log(θ), and the joint prior distribution of lθ is chosen to be a multivariate normal distribution. The hyper-parameters of the mean function ξ are assumed mutually independent, and independent of the noise variance λ. An approximate value r_out of the range of the output is assumed known, and the input domain X is assumed to be a hyper-rectangle X = Π_{k=1}^{d} [a_k; b_k]. We propose, for the model described in Section 2, the following prior distributions:

lσ0² ∼ N( log(r_out² / 100²), log(100)² ),   (7a)
lG ∼ N( log(1), log(100)² ),   (7b)
lρ_k^0, lρ_k^ǫ ∼ N( log((b_k − a_k)/2), log(10)² ), 1 ≤ k ≤ d,   (7c)
lL ∼ N( log(4), log(3)² ),   (7d)
(lλ(t_s))_{1≤s≤S} ∼ N( log(r_out² / 100²) · 1_S, log(100)² · ((1 − c) I_S + c U_S) ),   (7e)

with c the correlation between two noise variances, 1_S the vector of ones of length S, I_S the identity matrix of size S, and U_S the square matrix of ones of size S. To select the prior distributions, we propose a reference value for each hyper-parameter and add a large prior uncertainty to obtain weakly-informative prior distributions. The parameters σ0² and Gσ0² are assumed to be approximately equal to (r_out/100)². The range parameters (ρ_k^0, ρ_k^ǫ)_{1≤k≤d} are assumed to be about half of the domain, ρ_k ≈ (b_k − a_k)/2. For the degree parameter L, the mean is a value recommended by [3]. The noise variances are assumed to be about (r_out/100)², with a large standard deviation. However, we also assume that the prior uncertainty on the difference between two log-noise variances is very small with respect to the uncertainty of the noise variance, Var[log(λ(t1)) − log(λ(t2))] ≪ Var[log(λ(t1))]. Consequently, we assume [1] a strong correlation between log-noise variances, which is set to c = 99%. This assumption helps to estimate the noise variance on the levels with few observations.

Once the prior distributions are defined, we can compute the posterior distribution conditionally on the observations using Bayes' theorem. Let χn = (x_i, t_i; z_i)_{1≤i≤n} denote n observations of the simulator. Because of the assumption of a normal output distribution and a Gaussian process with unknown mean (Equations (2) and (4)), the prior and posterior processes conditioned on θ are Gaussian. Thus, for any vector of outputs Z at given input vectors, π(Z | χn, θ) and π(χn | θ) are Gaussian multivariate distributions, whose mean and covariance are given by the kriging equations [4]. The posterior distribution of θ can be expressed with Bayes' formula up to a normalizing constant: π(θ | χn) ∝ π(χn | θ) · π(θ). As there is no closed-form expression of this posterior distribution, we sample from it using a Monte-Carlo method. More precisely, we use the adaptive Metropolis-Hastings algorithm proposed by [5] to obtain samples (θ_j)_{1≤j≤p} distributed according to π(θ | χn).

The sampled hyper-parameters are used to compute the probability distribution of the PoF P in (1). Since the density of P is intractable, we use a Monte-Carlo method to draw samples from the posterior distribution π(P | χn). For each fixed θ_j, first, m inputs are drawn according to the input distribution: X^{(j)} = (x_i^{(j)})_{1≤i≤m}, x_i^{(j)} ∼ f_X. Then, we draw q Gaussian sample paths ξ^{(l)}_{χn,θ_j}(x_i^{(j)}, t_ref), 1 ≤ i ≤ m, 1 ≤ l ≤ q, at the inputs X^{(j)} at the reference level, and compute the probability function

p_j^{(l)}(x_i^{(j)}, t_ref) = Φ( ( ξ^{(l)}_{χn,θ_j}(x_i^{(j)}, t_ref) − z_crit ) / √( λ_{θ_j}(t_ref) ) ).

Finally, the samples (P_{j,l})_{1≤j≤p, 1≤l≤q} are computed by averaging over the input space, P_{j,l} = (1/m) Σ_{i=1}^{m} p_j^{(l)}(x_i^{(j)}, t_ref). With this sample, we can estimate the PoF with a measure of uncertainty, for instance by computing the empirical median and a 95% confidence interval.
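The nested sampling loop just described can be sketched as follows. The functions draw_theta_posterior and gp_sample_at stand in for the adaptive Metropolis sampler and the conditional Gaussian-process simulation, which are not reproduced here; the trivial stand-ins below only make the sketch runnable and carry no statistical meaning.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
z_crit, t_ref = 1.0, 0.01
p, m, q = 100, 500, 20           # hyper-parameter draws, inputs, GP sample paths

def draw_theta_posterior(size):  # placeholder posterior over (mean, sd, noise sd)
    return [(rng.normal(0.0, 0.1), abs(rng.normal(0.5, 0.05)), abs(rng.normal(0.3, 0.03)))
            for _ in range(size)]

def gp_sample_at(x, theta, n_paths):  # placeholder conditional GP paths at inputs x
    mean, sd, _ = theta
    return mean + sd * rng.standard_normal((n_paths, len(x)))

pof_samples = []
for theta in draw_theta_posterior(p):
    x = rng.uniform(0.0, 1.0, size=(m, 2))              # x_i ~ f_X (placeholder uniform)
    paths = gp_sample_at(x, theta, q)                   # xi(x_i, t_ref) sample paths
    noise_sd = theta[2]
    prob_exceed = norm.sf((z_crit - paths) / noise_sd)  # Phi((xi - z_crit) / sqrt(lambda))
    pof_samples.append(prob_exceed.mean(axis=1))        # average over the input space

pof_samples = np.concatenate(pof_samples)
print("median PoF:", np.median(pof_samples))
print("95% interval:", np.percentile(pof_samples, [2.5, 97.5]))
```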

4. Application

The algorithm is illustrated on a random damped harmonic oscillator from [6]. Consider X(t) the solution of a second-order stochastic differential equation, driven by a Brownian motion with spectral density equal to one, and with X(t = 0) = 0 and Ẋ(t = 0) = 0 as initial conditions. The parameters of the differential equation, the natural pulsation ω0 and the damping ratio ζ, are the d = 2 inputs of the simulator. The stochastic equation is solved on a period t ∈ [0; t_end], with t_end = 30 s, by an explicit exponential Euler scheme, which approximates X by a sequence X̃_n ≈ X(n · δt). The time step δt is the fidelity parameter. The multi-fidelity simulator is

f : (ω0, ζ, δt) ↦ max_{0 ≤ n ≤ t_end/δt} log |X̃_n|,   (8)

with ω0 ∈ [0; 30] rad s−1, ζ ∈ [0; 1] and δt ∈ [0; 1] s. The cost of this simulator is linear in 1/δt: observing the level δt (in s) costs C(δt) = 2.61/δt + 5.45 (in ms). The approximate output range is r_out = 40.

For this article, we consider S = 5 levels of fidelity: δt = 1, 0.5, 0.1, 0.05, and 0.01 s. The multi-fidelity design is a Nested Latin Hypercube Sampling (NLHS) with respectively 168, 56, 28, 14 and 7 points at each level of fidelity, generated with the algorithm of [7] and a maximin optimization. The adaptive Metropolis algorithm is applied to draw p = 10³ vectors θ. Figure 1 shows the normalized marginal prior and posterior distributions of θ, the latter being estimated with a kernel density method. The marginal posterior distributions are more concentrated than their prior counterparts, indicating that the observations χn bring information about the hyper-parameters. In particular, for the noise variances (λ(δt_s)), the strong correlation between levels allows the uncertainties of all noise variances to be reduced, including those from levels with few observations. The value of L is rather well estimated, an observation opposite to that of [3], which recommends fixing this value.


Figure 1. Normalized densities of the hyper-parameters of the multi-fidelity Gaussian process. The solid blue lines are the posterior densities, and the green dashed lines the prior densities. The abscissa axes are in logarithmic scale.

With the sampled hyper-parameters, we can estimate the posterior distribution of the PoF. The input distribution f_X is a uniform distribution on the input space [0; 30] rad s−1 × [0; 1], and the critical threshold is z_crit = 1. In order to make a comparison, we estimate the PoF at an observable fidelity level, fixed to δt_ref = 0.01 s. We compute a reference value P⋆ = 5.73%. We compare two different methods of estimation: a Fully Bayesian (FB) approach, and a plug-in approach, where the hyper-parameters are replaced by their Maximum A Posteriori (MAP) estimate. Our methodology is applied on 240 independent experiments. Across these experiments the input and output observations change, but the models and their priors are fixed. For each experiment, the posterior density of the PoF is sampled for both approaches, which gives 240 × 2 posterior densities. From these posterior densities of the PoF, we compute the median and the 95% confidence intervals.

Figure 2(a) displays the empirical histograms of the 240 medians of the posterior distributions of the PoF. We can see that the medians returned by the FB approach vary less from one experiment to another than those returned by MAP.


Figure 2. (a) Histograms of the medians; the vertical dotted line is the reference. (b) Estimated densities of the lengths of the 95% confidence intervals; the dashed lines with squares correspond to intervals which contain the reference value, the dashed lines with stars to those which miss it, and the solid lines to all intervals. (c) Coverage of the confidence intervals versus their level p; the coverage is the proportion of cases where the reference value is inside the confidence interval. The solid and dashed-crossed lines correspond respectively to the FB and MAP approaches.

Figure 2(b) plots the empirical densities of the 240 lengths of the 95% confidence intervals, estimated by kernel density regression. We can see that, for both approaches, the failing intervals are shorter than the successful intervals. We can also see that the FB approach always provides non-zero confidence intervals, unlike the MAP approach. Figure 2(c) presents the capacity of the models to catch the reference value. Each curve corresponds to one approach, and each point of a curve at abscissa p is the coverage, i.e. the proportion of the confidence intervals of level p which contain the reference, for the associated approach. We can see that the FB approach provides much more conservative intervals than the MAP approach. Together, Figures 2(a), 2(b) and 2(c) suggest that, on this example, the FB approach returns a better posterior distribution than the MAP approach.


5. Conclusion

In this article, we propose a Bayesian model for a stochastic multi-fidelity numerical model. The model is based on a Gaussian process, completed with prior distributions on the hyper-parameters of the covariance function and on the noise variances. By comparing prior and posterior hyper-parameter distributions, we see that the observations bring information about the hyper-parameters. Using sampling algorithms, we can sample the posterior distribution of the quantity of interest, here a Probability of Failure (PoF). By comparing the Fully Bayesian approach with the Maximum A Posteriori plug-in approach, we see that, on an academic example, the Fully Bayesian approach provides more robust confidence intervals for the PoF. However, the priors require care when using the models. Future work will focus on assessing the impact of the different prior modeling choices on the posterior distributions of the hyper-parameters and of the quantities of interest.

References
[1] R. Stroh, J. Bect, S. Demeyer, N. Fischer, M. Damien and E. Vazquez, Assessing fire safety using complex numerical models with a Bayesian multi-fidelity approach, Fire Safety Journal 91, 1016 (2017).
[2] V. Picheny and D. Ginsbourger, A nonstationary space-time Gaussian process model for partially converged simulations, SIAM/ASA Journal on Uncertainty Quantification 1, 57 (2013).
[3] R. Tuo, C. F. J. Wu and D. Yu, Surrogate modeling of computer experiments with different mesh densities, Technometrics 56, 372 (2014).
[4] T. J. Santner, B. J. Williams and W. I. Notz, The Design and Analysis of Computer Experiments, Springer Series in Statistics (Springer, New York, 2003).
[5] H. Haario, E. Saksman and J. Tamminen, An adaptive Metropolis algorithm, Bernoulli 7, 223 (2001).
[6] S.-K. Au and J. L. Beck, Estimation of small failure probabilities in high dimensions by subset simulation, Probabilistic Engineering Mechanics 16, 263 (2001).
[7] P. Z. G. Qian and C. F. J. Wu, Bayesian hierarchical modeling for integrating low-accuracy and high-accuracy experiments, Technometrics 50, 192 (2008).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 357–364)

Application of ISO 5725 to evaluate measurement precision of distribution within the lung after intratracheal administration

J. Takeshita∗1, J. Ono2, T. Suzuki2, H. Kano3, Y. Oshima4, Y. Morimoto5, H. Takehara6, T. Numano7, K. Fujita1, N. Shinohara1, K. Yamamoto1, K. Honda1, S. Fukushima3 and M. Gamo1

1 National Institute of Advanced Industrial Science and Technology, Tsukuba, Ibaraki, Japan
2 Tokyo University of Science, Noda, Chiba, Japan
3 Japan Bioassay Research Center, Japan Organization of Occupational Health and Safety, Hadano, Kanagawa, Japan
4 Chemical Evaluation and Research Institute, Japan, Hita, Oita, Japan
5 University of Occupational and Environmental Health, Kitakyushu, Fukuoka, Japan
6 Public Interest Incorporated Foundation BioSafety Research Center, Iwata, Shizuoka, Japan
7 DIMS Institute of Medical Science, Inc., Ichinomiya, Aichi, Japan
∗ E-mail: [email protected]

Intratracheal administration testing is an in vivo screening method for evaluating the pulmonary toxicity of nanomaterials. However, no public test guidelines currently exist for this method. Thus, the present study conducts an inter-laboratory comparison study and quantitatively analyses the results. More precisely, (1) it tests whether or not the true between-laboratory variances are greater than zero by applying one-way analysis of variance (ANOVA), (2) it compares the sizes of the true between-laboratory variances and the repeatability variances through the F-test and (3) it calculates the statistical powers of the statistical tests in (1) and (2). The following results were obtained: (1) the true between-laboratory variances were greater than zero, (2) the true between-laboratory variances were not larger than the repeatability variances and (3) the sample sizes of the experiments provided sufficient statistical power for detecting the expected variances, if present. We propose that, to elucidate the sizes of the true between-laboratory variances, we should not only quantify their sizes but also compare them to those of the repeatability variances.

Keywords: Intratracheal administration testing; Inter-laboratory comparison study; Repeatability; Reproducibility; One-way analysis of variance (ANOVA); F-test.



1. Introduction Inhalation toxicity testing is the gold standard in vivo testing method for evaluating the pulmonary toxicity of nanomaterials. Indeed, the Organisation for Economic Co-operation and Development has developed test guidelines for inhalation toxicity, such as TG 412 1 and TG 413 2 , which means that this method has been already harmonised internationally. However, the test procedure is very costly and time-consuming. Although animalbased evaluations can be replaced with cheaper, more easily implemented in vitro toxicity studies, these alternatives cannot be easily bridged to in vivo studies. To avoid these problems, researchers have proposed and applied in vivo screening testing methods such as intratracheal administration testing 3 . Screening tests can be conducted at a reasonable cost within a much shorter time frame than inhalation toxicity testing. However, intratracheal administration testing lacks a set of public test guidelines. Therefore, to standardise the screening method, we must conduct an inter-laboratory comparison study on the method and quantitatively analyses the results. To evaluate the measurement precision of intratracheal administration testing, the present study compares and analyses the results of inter-laboratory studies on nanoparticle distribution in the lung after intratracheal administration, which was originally reported in Gamo et al. 4 . Their result is summarised in subsection 2.1 of the present study. More precisely, the present study determines the amounts of nanomaterials in the left and right lungs as percentages of total dose administered (where the amount of nanomaterials in the right lung is the summed amounts of nanomaterials in the middle lobe, cranial lobe, caudal lobe and accessory lobe). Hereafter, we refer to the dose proportions in the left and right lungs as percentage dose in left lung and percentage dose in right lung, respectively. Because the original data were expressed as percentages, they were logit-transformed prior to analysis. The logit transformation of a given percentage p is defined as logit(p) = loge (p/(1 − p)). The logit-transformed data were analysed from two viewpoints. First, we tested whether or not the true between-laboratory variances were greater than zero by one-way analysis of variance (ANOVA). We then compared the sizes of the true between-laboratory variances (which we hope to assess) and the repeatability variances through the F -test. Here, we compared the sizes through the F -test because in vivo studies generally observe the biological responses of laboratory animals, which introduce larger variation than physicochemical studies. Therefore, in vivo studies have inherent dispersions arising


from the variabilities among the individual animals and the environments of individual laboratories. Therefore, testing for zero laboratory effects is ineffective in practice. As directly comparing the estimated values of the true between-laboratory variances and the repeatability variances was deemed insufficient, we instead compared them by the F -test. Finally, we calculated the statistical powers of the above analyses.

2. Inter-laboratory Comparison Study 2.1. Design of the inter-laboratory comparison study This subsection summarizes the inter-laboratory comparison study, originally reported in Gamo et al. 4 . As no public test guidelines currently exist for intratracheal administration testing, all members of the inter-laboratory comparison study initially convened to define an exact test procedure. The three nanomaterials used in this study, namely, nickel oxide nanoparticles (NiO), single-wall carbon nanotubes (SWCNT) and multi-wall carbon nanotubes (MWCNT), were characterised in the ‘Methods’ section of Gamo et al. 4 . For each nanomaterial, the inter-laboratory comparison study was designed as follows: (1) Data were collected at five test laboratories; the Japan Bioassay Research Center, the Japan Organization of Occupational Health and Safety; the Chemical Evaluation and Research Institute, Japan; the University of Occupational and Environmental Health (Japan); the Public Interest Incorporated Foundation BioSafety Research Center (Japan); and the DIMS Institute of Medical Science, Inc. (Japan). (2) Each test laboratory was provided with nanomaterials from the National Institute of Advanced Industrial Science and Technology at the same time. (3) Each test laboratory administered one dose of the nanomaterial to five rats in accordance with the test procedure. (4) Three days after the administration, the amounts of nanomaterials in the left lung, middle lobe, cranial lobe, caudal lobe, accessory lobe and trachea were measured. The final data size was 480 (six organs × five rats × five laboratories × three nanomaterials).


2.2. Data from the inter-laboratory comparison study

Tables 1–3 show the means and standard deviations of the logit-transformed percentage doses in the left and right lungs of the five rats administered with NiO, SWCNT and MWCNT, respectively.

Table 1. Means and standard deviations (SDs) of the logit-transformed percentage doses in the left and right lungs of five rats after NiO administration.

           Left lung              Right lung
           Mean       SD          Mean       SD
  Lab A   −0.5258    0.4185       0.1111    0.4962
  Lab B   −0.2967    0.1639      −0.7768    0.7406
  Lab C   −1.3960    0.0767      −0.5396    0.1079
  Lab D   −1.3079    0.2837      −0.5668    0.3320
  Lab E   −1.1434    0.2577      −0.4893    0.2608

Table 2. Means and standard deviations (SDs) of the logit-transformed percentage doses in the left and right lungs of five rats after SWCNT administration.

           Left lung              Right lung
           Mean       SD          Mean       SD
  Lab A   −0.7516    0.3589       0.1135    0.2416
  Lab B   −0.8628    0.5094       0.0826    0.4244
  Lab C   −0.1482    0.2742      −0.9223    0.4043
  Lab D   −1.2201    0.6670      −0.7115    0.7810
  Lab E   −1.1848    0.7226       0.1732    0.6046

3. Analysis of Repeatability and Reproducibility

3.1. Laboratory effects

Using ANOVA, this subsection tests whether or not the true between-laboratory variances exceeded zero. A standard measurement method is expected to yield non-significant test results for all six items (two lungs × three nanomaterials). As the ANOVA was performed six times, the level of significance of each individual test was set to P* = 1 − (1 − 0.05)^(1/6) ≈ 0.0085, so that the overall significance level was 0.05.

Table 3. Means and standard deviations (SDs) of the logit-transformed percentage doses in the left and right lungs of five rats after MWCNT administration.

           Left lung              Right lung
           Mean       SD          Mean       SD
  Lab A   −0.9384    0.5060       0.3712    0.5690
  Lab B   −1.7250    0.9912      −0.1135    0.5232
  Lab C   −1.2364    0.2283      −0.3827    0.3075
  Lab D   −1.3558    0.7799       0.2181    0.2795
  Lab E   −1.4009    0.4049       0.4894    0.1357

The statistical model is y_jk = μ + α_j + ε_jk, where y denotes the measured data, μ is the general mean and α_j is the laboratory effect (bias), with α_j ~ N(0, σ_L²). The error term obeys ε_jk ~ N(0, σ_E²). Here j = 1, ..., b, where b is the number of laboratories, and k = 1, ..., n, where n is the number of repetitions. The repeatability variance, true between-laboratory variance and reproducibility variance are given by σ_r² = σ_E², σ_L², and σ_R² = σ_L² + σ_r², respectively. In the ANOVA-based hypothesis test, the null hypothesis was H0: σ_L² = 0 and the alternative hypothesis was H1: σ_L² > 0.

The ANOVA results are summarised in Table 4. One test (the percentage dose in left lung after administration of NiO) was statistically significant. From these results, we conclude that the laboratory effect exists. In other words, the true between-laboratory variances are greater than zero.

3.2. Sizes of the true between-laboratory variances

In subsection 3.1, we revealed that the true between-laboratory variances exceeded zero. In this subsection, we evaluate the sizes of the true between-laboratory variances. More precisely, we compare the sizes of the repeatability variances and the true between-laboratory variances by applying the F-test. In this test, the null hypothesis is H0: σ_L² = σ_r² and the alternative hypothesis is H1: σ_L² > σ_r². The F-test is derived in Appendix A.1. The results are summarised in Table 5 and confirm that the laboratory variances were not greater than the repeatability variances.

3.3. Statistical powers of the two analyses

In this subsection, we calculate the statistical powers of the significance tests applied in subsections 3.1 and 3.2. The calculation procedure is given in Appendix A.2.
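The two tests of subsections 3.1 and 3.2 can be carried out with a few lines of Python. The sketch below is only an illustration of the computations described in the text; the function name and the randomly generated data are hypothetical.

```python
import numpy as np
from scipy import stats

def anova_and_f_tests(y, alpha_overall=0.05, n_tests=6):
    """One-way random-effects ANOVA for a b x n array y (laboratories x repetitions),
    following the two tests of subsections 3.1 and 3.2."""
    b, n = y.shape
    grand = y.mean()
    lab_means = y.mean(axis=1)
    msb = n * np.sum((lab_means - grand) ** 2) / (b - 1)          # mean square between laboratories
    mse = np.sum((y - lab_means[:, None]) ** 2) / (b * (n - 1))   # mean square within laboratories
    alpha_star = 1.0 - (1.0 - alpha_overall) ** (1.0 / n_tests)   # per-test level, about 0.0085

    # Test 1 (H0: sigma_L^2 = 0): ordinary ANOVA F statistic.
    p1 = stats.f.sf(msb / mse, b - 1, b * (n - 1))
    # Test 2 (H0: sigma_L^2 = sigma_r^2): F0 = MSB / ((n + 1) * MSE), as in Appendix A.1.
    p2 = stats.f.sf(msb / ((n + 1) * mse), b - 1, b * (n - 1))
    return {"MSB": msb, "MSE": mse, "alpha_star": alpha_star,
            "p_lab_effect": p1, "p_var_ratio": p2}

# Hypothetical logit-transformed data: 5 laboratories x 5 rats.
rng = np.random.default_rng(0)
print(anova_and_f_tests(rng.normal(-1.0, 0.3, size=(5, 5))))
```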


Table 4. ANOVA results of laboratory effects.

          Left lung               Right lung
  NM      Res.    P-val.          Res.    P-val.
  NiO     S       0.0000          N       0.0577
  SW      N       0.0321          N       0.0094
  MW      N       0.5027          N       0.0138

Table 5. ANOVA results of the true between-laboratory variances.

          Left lung               Right lung
  NM      Res.    P-val.          Res.    P-val.
  NiO     N       0.0510          N       0.7567
  SW      N       0.7038          N       0.5705
  MW      N       0.9544          N       0.6126

In the two tables, NM, SW, MW, Res., P-val., N and S are abbreviations of Nanomaterial, SWCNT, MWCNT, Result, P-value, not significant and significant, respectively.

The overall power π of the combined six significance tests is given by π = 1 − β^6, where 1 − β is the power of a single test. Tables 6 and 7 respectively show the results under the null hypothesis H0: σ_L² = 0 and the alternative hypothesis H1: σ_L² > 0 (subsection 3.1), and under the null hypothesis H0: σ_L² = σ_r² and the alternative hypothesis H1: σ_L² > σ_r² (subsection 3.2). The sample sizes of the experiments provided sufficient statistical power for detecting the expected variances, if present.

Table 6. Power of the tests under H0: σ_L² = 0 and H1: σ_L² > 0 when Δ² = 1.0.

  n\b     2      3      4      5      6      7      8      9      10
  2     0.140  0.213  0.296  0.383  0.468  0.549  0.623  0.688  0.746
  3     0.367  0.566  0.721  0.829  0.900  0.944  0.969  0.984  0.992
  4     0.585  0.803  0.914  0.965  0.987  0.995  0.998  0.999  1.000
  5     0.728  0.910  0.973  0.993  0.998  1.000  1.000  1.000  1.000
  6     0.815  0.957  0.991  0.998  1.000  1.000  1.000  1.000  1.000
  7     0.870  0.978  0.997  1.000  1.000  1.000  1.000  1.000  1.000
  8     0.906  0.988  0.999  1.000  1.000  1.000  1.000  1.000  1.000
  9     0.929  0.993  0.999  1.000  1.000  1.000  1.000  1.000  1.000
  10    0.946  0.996  1.000  1.000  1.000  1.000  1.000  1.000  1.000

4. Conclusion

This study analysed the data from an inter-laboratory comparison study on intratracheal administration testing. The results revealed that (1) a laboratory effect existed in the ANOVA results, and (2) the null hypotheses, namely, 'the true inter-laboratory variances and the repeatability variances are equal in size', could not be rejected at the P ≤ 0.05 significance level.

Table 7. Power of the tests under H0: σ_L² = σ_r² and H1: σ_L² > σ_r² when Δ² = 4.0.

  n\b     2      3      4      5      6      7      8      9      10
  2     0.140  0.213  0.296  0.383  0.468  0.549  0.623  0.688  0.746
  3     0.290  0.447  0.585  0.698  0.786  0.852  0.899  0.933  0.956
  4     0.402  0.588  0.727  0.825  0.891  0.934  0.961  0.977  0.987
  5     0.472  0.667  0.798  0.881  0.932  0.962  0.980  0.989  0.994
  6     0.519  0.715  0.837  0.910  0.952  0.975  0.987  0.994  0.997
  7     0.551  0.747  0.862  0.927  0.963  0.981  0.991  0.996  0.998
  8     0.574  0.769  0.879  0.938  0.969  0.985  0.993  0.997  0.999
  9     0.591  0.785  0.890  0.946  0.974  0.988  0.994  0.998  0.999
  10    0.605  0.798  0.899  0.951  0.977  0.990  0.995  0.998  0.999

Roughly speaking, although the true inter-laboratory variances exceeded zero, their sizes did not exceed the sizes of the repeatability variances. Moreover, the sample sizes of the experiments provided sufficient statistical power for detecting the expected variances, if present. When testing methods (such as intratracheal administration testing) are widely used, the sizes of the true inter-laboratory variances should be evaluated, especially when the true inter-laboratory variances are greater than zero. Moreover, for understanding the sizes of the true inter-laboratory variances, we propose not only quantifying these sizes with reference to ISO 5725 Part 2 [5], but also comparing them to the sizes of the repeatability variances. In the present study, our proposal was evaluated on data from an inter-laboratory comparison study on intratracheal administration testing.

Acknowledgements

This study was supported by the project 'Survey on standardization of intratracheal administration study for nanomaterials and related issues' funded by the Ministry of Economy, Trade and Industry (METI) of Japan, and by JSPS KAKENHI Grant Numbers JP16K21674 and JP15K01207.

Appendix A.1. Derivation of the F-test

Under the model given in subsection 3.1, F = [MSB/(σ_r² + nσ_L²)] / [MSE/σ_r²] follows the F-distribution F(b − 1, b(n − 1)), where MSB and MSE are the mean squares of laboratory and repetition, respectively.

Under the condition σ_r² + nσ_L² = cσ_r², where c is an arbitrary positive constant, and setting c = n + 1, we obtain the following F-test:

  Null hypothesis:         H0: σ_L² = σ_r²
  Alternative hypothesis:  H1: σ_L² > σ_r²
  Test statistic:          F0 = MSB / [(n + 1) × MSE].

A.2. Power of the significance tests

The power 1 − β of the significance test for test statistic F0′ under the null hypothesis σ_r² + nσ_L² = cσ_r² is calculated as follows:

  1 − β = Pr{ F0′ ≥ F(b − 1, b(n − 1); α) }
        = Pr{ [MSB/(σ_r² + nσ_L²)] / [MSE/σ_r²] ≥ F(b − 1, b(n − 1); α) × cσ_r²/(σ_r² + nσ_L²) }
        = Pr{ F ≥ F(b − 1, b(n − 1); α) / [(1 + n(σ_L²/σ_r²))/c] }
        = Pr{ F ≥ F(b − 1, b(n − 1); α) / [(1 + nΔ²)/c] },

where Δ = σ_L/σ_r.
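The power calculation of Appendix A.2 can be evaluated numerically as sketched below; this is only an illustration assuming the per-test significance level α* ≈ 0.0085 and independence of the six tests (c = 1 corresponds to the test of subsection 3.1, c = n + 1 to the test of subsection 3.2). It should reproduce, approximately, the corner entries of Tables 6 and 7.

```python
from scipy import stats

def single_test_power(b, n, delta2, c, alpha=0.0085):
    """Power 1 - beta of one F-test (Appendix A.2); delta2 = (sigma_L / sigma_r)^2."""
    f_crit = stats.f.ppf(1.0 - alpha, b - 1, b * (n - 1))
    return stats.f.sf(f_crit / ((1.0 + n * delta2) / c), b - 1, b * (n - 1))

def overall_power(b, n, delta2, c, alpha=0.0085, n_tests=6):
    """Probability that at least one of the six (assumed independent) tests rejects."""
    beta = 1.0 - single_test_power(b, n, delta2, c, alpha)
    return 1.0 - beta ** n_tests

print(round(overall_power(b=2, n=2, delta2=1.0, c=1), 3))   # about 0.140 (Table 6, n=2, b=2)
print(round(overall_power(b=2, n=2, delta2=4.0, c=3), 3))   # about 0.140 (Table 7, n=2, b=2)
```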

References

1. OECD, Subacute Inhalation Toxicity: 28-Day Study, Test Guideline No. 412, OECD Guidelines for the Testing of Chemicals (OECD, Paris, France, 2009).
2. OECD, Subchronic Inhalation Toxicity: 90-Day Study, Test Guideline No. 413, OECD Guidelines for the Testing of Chemicals (OECD, Paris, France, 2009).
3. K. E. Driscoll, D. L. Costa, G. Hatch, R. Henderson, G. Oberdorster, H. Salem and R. B. Schlesinger, Intratracheal instillation as an exposure technique for the evaluation of respiratory tract toxicity: uses and limitations, Toxicol. Sci. 55, 24–35 (2000).
4. M. Gamo, J. Takeshita, J. Ono, H. Kano, Y. Oshima, Y. Morimoto, H. Takehara, T. Numano, K. Fujita, N. Shinohara, K. Yamamoto, T. Suzuki, K. Honda and S. Fukushima, Evaluation of variability in lung burden after intratracheal administration of manufactured nanomaterials into rat based on inter-laboratory study (in preparation).
5. ISO, ISO 5725-2:1994, Accuracy (trueness and precision) of measurement methods and results — Part 2: Basic methods for the determination of repeatability and reproducibility of a standard measurement method (International Organization for Standardization, Geneva, Switzerland, 1994).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 365–374)

Benchmarking rater agreement: Probabilistic versus deterministic approach Amalia Vanacore* and Maria Sole Pellegrino Department of Industrial Engineering, University of Naples “Federico II”, Naples, 80125, Italy *E-mail: [email protected] In several industries strategic and operational decisions rely on subjective evaluations provided by raters who are asked to score and/or classify group of items in terms of some technical properties (e.g. classification of faulty material by defect type) and/or perception aspects (e.g. comfort, quality, pain, pleasure, aesthetics). Because of the lack of a gold standard for classifying subjective evaluations as “true” or “false”, rater reliability is generally measured by assessing her/his precision via inter/intra-rater agreement coefficients. Agreement coefficients are useful only if their magnitude can be easily interpreted. A common practice is to apply a straightforward procedure to translate the magnitude of the adopted agreement coefficient into an extent of agreement via a benchmark scale. Many criticisms have been attached to this practice and in order to solve some of them, the adoption of a probabilistic approach to characterize the extent of agreement is recommended. In this study some probabilistic benchmarking procedures are discussed and compared via a wide Monte Carlo simulation study. Keywords: rater agreement, kappa-type coefficient, probabilistic benchmarking procedures, Monte Carlo simulation.

1. Introduction

Agreement coefficients are widely adopted for assessing the precision of subjective evaluations provided by human raters to support strategic and operational decisions in several contexts (e.g. manufacturing and service industries, food, healthcare, safety, among many others). Subjective evaluations are typically provided on a categorical rating scale, for which the common statistical tools that work readily for continuous data are not applicable. For this reason, rater precision is generally assessed in terms of the extent of agreement between two or more series of evaluations on the same sample of items (subjects or objects) provided by two or more raters (inter-rater agreement) or by the same rater on two or more occasions (intra-rater agreement).


Specifically, inter-rater agreement is concerned with the reproducibility of measurements by different raters, whereas intra-rater agreement is concerned with self-reproducibility (also known as repeatability).

The degree of inter/intra-rater agreement for categorical rating scales is commonly assessed using kappa-type agreement coefficients which, originally introduced by Cohen [1], are rescaled measures of agreement corrected for the probability of agreement expected by chance alone. It is common practice to qualify the magnitude of a kappa-type agreement coefficient by comparing it against an arbitrary benchmark scale; by applying this straightforward benchmarking, practitioners relate the magnitude of the coefficient to an extent of agreement and then decide whether it is good or poor. Although widely adopted, the straightforward benchmarking has some drawbacks. As demonstrated, for example, by Thompson and Walter [2] and Gwet [3], the magnitude of an agreement coefficient may strongly depend on some experimental factors such as the number of rated items, the rating scale dimension, trait prevalence and marginal probabilities. Thus, interpretation based on the straightforward benchmarking should be treated with caution, especially for comparisons across studies when the experimental conditions are not the same.

A proper characterization of the extent of rater agreement should rely upon a probabilistic benchmarking procedure that identifies a suitable neighborhood of the truth (i.e. the true value of rater agreement) by taking sampling uncertainty into account. The simplest and most intuitive way to accomplish this task is to build a confidence interval of the agreement coefficient and compare its lower bound against an adopted benchmark scale. A different approach to probabilistic benchmarking is the one recently proposed by Gwet [3] which, under the assumption of an asymptotically normal distribution, evaluates the likelihood that the estimated agreement coefficient belongs to any given benchmark level. These benchmarking approaches are fully discussed in the following, and their performances are evaluated and compared via a Monte Carlo simulation study with respect to their ability to correctly interpret the magnitude of the agreement coefficient, measured in terms of the weighted misclassification rate. The paper focuses on agreement on an ordinal rating scale; thus, in the following we deal with weighted kappa-type coefficients, which account for the fact that disagreement on two distant categories is more serious than disagreement on neighboring categories.

The remainder of the paper is organized as follows: in Section 2 two well-known paradox-resistant kappa-type agreement coefficients are discussed; the commonly adopted benchmark scales are presented in Section 3;


four characterization procedures based on a probabilistic approach to benchmarking are discussed in Section 4; in Section 5 the simulation design is described and the main results are fully discussed; finally, conclusions are summarized in Section 6.

2. Weighted Kappa-type agreement coefficients

Let n be the number of items rated by two raters on an ordinal k-point rating scale (with k > 2), n_ij the number of items classified into the ith category by the first rater but into the jth category by the second rater, w_ij the corresponding symmetrical weight, n_i· the total number of items classified into the ith category by the first rater, and n_·i the total number of items classified into the ith category by the second rater. The weighted Cohen's Kappa coefficient [4] can be computed as:

  K̂_W = (p_aw − p_a|c) / (1 − p_a|c)                                                   (1)

where

  p_aw = Σ_{i=1}^{k} Σ_{j=1}^{k} w_ij n_ij / n ;   p_a|c = Σ_{i=1}^{k} Σ_{j=1}^{k} w_ij (n_i·/n)(n_·j/n)     (2)

Despite its popularity, researchers have pointed out two main criticisms of Cohen's Kappa: it is affected by the degree to which raters disagree (bias problem); moreover, for a fixed value of observed agreement, tables with marginal asymmetry produce lower values of Kappa than tables with homogeneous marginals (prevalence problem). These criticisms were first observed by Brennan and Prediger [5], although they are widely known as "Kappa paradoxes" as referred to by Feinstein and Cicchetti [6]. A solution to face the above paradoxes is to adopt the uniform distribution for chance measurements, which — given a certain rating scale — can be defended as representing the maximally non-informative measurement system. The obtained weighted uniform kappa, referred to as the Brennan-Prediger coefficient (although proposed also by several other authors [5, 7, 8, 9, 10]), is formulated as:

  BP_w = (p_aw − p^U_a|c) / (1 − p^U_a|c)                                               (3)

where p_aw is defined as in equation (2) and p^U_a|c = T_w / k², being T_w the sum over all weight values w_ij. Another well-known paradox-resistant agreement coefficient alternative to Cohen's Kappa is the AC coefficient (proposed by Gwet [11]), whose weighted version (AC2) is formulated as:

  AC2 = (p_aw − p^G_a|c) / (1 − p^G_a|c)                                                (4)

where p_aw is defined as in equation (2) and the probability of chance agreement is defined as the probability of the simultaneous occurrence of random rating (R) by one of the raters and rater agreement (G): P(G ∩ R) = P(G | R) · P(R). Specifically, P(R) is approximated with a normalized measure of randomness, defined by the ratio of the observed variance Σ_{i=1}^{k} p_i(1 − p_i) to the variance expected under the assumption of totally random rating, (k − 1)/k; whereas the conditional probability of agreement P(G|R) is given by P(G | R) = T_w / k²:

  p^G_a|c = T_w Σ_{i=1}^{k} p_i(1 − p_i) / [k(k − 1)]                                   (5)

being p_i = (n_i· + n_·i)/2n the estimate of the propensity of a rater to classify an item into the ith category.

3. Aid to the characterization of the extent of agreement: Benchmark scales

After computing an agreement coefficient, a common question is 'how good is the agreement?'. In order to provide an aid to qualify the magnitude of kappa-type coefficients, a number of benchmark scales have been proposed over the years, mainly in the social and medical sciences. The best known benchmark scales are reviewed below and reported in Table 1. According to Hartmann [12], acceptable values for kappa should exceed 0.6. The most widely adopted benchmark scale is the one with six ranges of values proposed by Landis and Koch [13], which was simplified by Fleiss [14] and Altman [15], with three and five ranges, respectively, and by Shrout [16], who collapsed the first three ranges of values into two agreement categories. Munoz and Bangdiwala [17], instead, proposed guidelines for interpreting the values of a kappa-type agreement coefficient with respect to the raw proportion of agreement. Whatever the adopted scale, the benchmarking procedure is generally straightforward, since the coefficient magnitude is qualified as the extent of agreement (e.g. good) associated with the range of values within which the estimated agreement coefficient falls.
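For illustration, the following Python sketch computes the three weighted coefficients of Section 2 (weighted Cohen's kappa, BP_w and AC2) from a k x k table of counts, using the linear weights of Eq. (9). The example table is hypothetical.

```python
import numpy as np

def linear_weights(k):
    i = np.arange(k)
    return 1.0 - np.abs(i[:, None] - i[None, :]) / (k - 1)

def weighted_coefficients(counts):
    """Weighted Cohen's kappa (Eq. 1), Brennan-Prediger BPw (Eq. 3) and Gwet's AC2 (Eq. 4)
    for a k x k table of counts n_ij (rows: rater 1, columns: rater 2)."""
    counts = np.asarray(counts, dtype=float)
    k, n = counts.shape[0], counts.sum()
    w = linear_weights(k)
    Tw = w.sum()
    p_row, p_col = counts.sum(axis=1) / n, counts.sum(axis=0) / n
    p_aw = (w * counts).sum() / n                           # weighted observed agreement, Eq. (2)
    pe_cohen = (w * np.outer(p_row, p_col)).sum()           # chance agreement, Eq. (2)
    pe_bp = Tw / k ** 2                                     # uniform chance agreement
    pi = (p_row + p_col) / 2.0
    pe_ac2 = Tw * (pi * (1 - pi)).sum() / (k * (k - 1))     # Eq. (5)
    return ((p_aw - pe_cohen) / (1 - pe_cohen),
            (p_aw - pe_bp) / (1 - pe_bp),
            (p_aw - pe_ac2) / (1 - pe_ac2))

# Hypothetical 3-category table for two raters.
print(weighted_coefficients([[20, 5, 1], [4, 18, 6], [0, 5, 21]]))
```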

Table 1. Benchmark scales for kappa-type coefficients.

  Hartmann (1977)               Landis and Koch (1977)          Fleiss (1981)
  Kappa          Strength of    Kappa          Strength of      Kappa          Strength of
  coefficient    agreement      coefficient    agreement        coefficient    agreement
  > 0.6          Good           < 0.0          Poor             < 0.4          Poor
                                0.0 to 0.20    Slight           0.40 to 0.75   Intermediate to Good
                                0.21 to 0.40   Fair             > 0.75         Excellent
                                0.41 to 0.60   Moderate
                                0.61 to 0.80   Substantial
                                0.81 to 1.00   Almost perfect

  Altman (1991)                 Shrout (1998)                   Munoz and Bangdiwala (1997)
  Kappa          Strength of    Kappa          Strength of      Kappa          Strength of
  coefficient    agreement      coefficient    agreement        coefficient    agreement
  < 0.2          Poor           0.00 to 0.10   Virtually none   < 0.00         Poor
  0.21 to 0.40   Fair           0.11 to 0.40   Slight           0.00 to 0.20   Fair
  0.41 to 0.60   Moderate       0.41 to 0.60   Fair             0.21 to 0.45   Moderate
  0.61 to 0.80   Good           0.61 to 0.80   Moderate         0.46 to 0.75   Substantial
  0.81 to 1.00   Very good      0.81 to 1.00   Substantial      0.76 to 0.99   Almost perfect
                                                                1.00           Perfect

4. Probabilistic benchmarking procedures

Despite its popularity, the straightforward benchmarking procedure can be misleading for two main reasons:
• it does not associate the interpretation of the coefficient magnitude with a degree of certainty, failing to consider that an agreement coefficient, as any other sampling estimate, is exposed to statistical uncertainty;
• it does not allow comparison of the extent of agreement across different studies, unless they are carried out under the same experimental conditions (i.e. the number of observed items, the number of categories or the distribution of items among the categories).

In order to have a fair characterization of the extent of rater agreement, the benchmarking procedure should be probabilistic, so as to associate a degree of certainty with the interpretation of the kappa coefficient [18].


Under asymptotic conditions, the magnitude of the kappa-type coefficient can be related to the notion of extent of agreement by benchmarking the lower bound of its asymptotic normal (1 − 2α)% CI:

  LB_N = K̂ − z_α · se(K̂)                                                               (6)

where K̂ is the estimated kappa coefficient, se(K̂) its standard error and z_α the α percentile of the standard normal distribution. Recently, Gwet [3] proposed a probabilistic benchmarking procedure based on the Interval Membership Probability (IMP). Gwet's procedure characterizes the magnitude of agreement by benchmarking the lowest value K_L such that the probability that K exceeds K_L is equal to 1 − 2α.

The above two benchmarking procedures rely on the assumption of an asymptotically normal distribution and thus they can work well only for reasonably large sample sizes. Vice versa, under non-asymptotic conditions, bootstrap resampling can be adopted for building approximate as well as exact non-parametric CIs [19, 20]. Among the available bootstrap methods, in this study we focus on the percentile CI and the bias-corrected and accelerated (BCa) CI [21]. The former is by far the easiest and most widespread method; the latter is recommended for severely skewed distributions. Being G the cumulative distribution function of the bootstrap replications of the kappa-type coefficient, the lower bound of the (1 − 2α)% percentile CI is:

  LB_p = G⁻¹(α)                                                                         (7)

whereas, being Φ the standard normal CDF, b the bias-correction parameter and a the acceleration parameter, the lower bound of the (1 − 2α)% BCa bootstrap CI is:

  LB_BCa = G⁻¹( Φ( b + (z_α + b) / (1 − a(z_α + b)) ) )                                 (8)
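A minimal sketch of the two non-parametric lower bounds is given below. It is an illustration only: `stat` stands for any agreement-coefficient function of two rating vectors (for instance one of the coefficients of Section 2), the bias correction b is estimated from the bootstrap distribution and the acceleration a from a jackknife, and no safeguards for degenerate resamples are included.

```python
import numpy as np
from scipy.stats import norm

def lower_bounds(r1, r2, stat, alpha=0.05, B=2000, rng=None):
    """Percentile (Eq. 7) and BCa (Eq. 8) lower bounds of a (1 - 2*alpha) CI
    for an agreement coefficient computed by stat(r1, r2)."""
    rng = rng or np.random.default_rng()
    r1, r2 = np.asarray(r1), np.asarray(r2)
    n = len(r1)
    theta_hat = stat(r1, r2)
    boot = np.array([stat(r1[idx], r2[idx])
                     for idx in rng.integers(0, n, size=(B, n))])
    lb_p = np.quantile(boot, alpha)                       # percentile lower bound, Eq. (7)
    b = norm.ppf((boot < theta_hat).mean())               # bias-correction parameter
    jack = np.array([stat(np.delete(r1, i), np.delete(r2, i)) for i in range(n)])
    d = jack.mean() - jack
    a = (d ** 3).sum() / (6.0 * (d ** 2).sum() ** 1.5)    # acceleration parameter
    z_a = norm.ppf(alpha)
    lb_bca = np.quantile(boot, norm.cdf(b + (b + z_a) / (1.0 - a * (b + z_a))))  # Eq. (8)
    return lb_p, lb_bca
```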

5. Simulation study

The statistical properties of the above-discussed probabilistic benchmarking procedures have been investigated via a Monte Carlo simulation study across 72 different settings defined by varying three parameters:


the number of rated items (n = 10, 30, 50, 100), the number of categories (k = 2, 3, 4) and the strength of agreement (low, moderate and high), represented by six levels of agreement ranging from 0.4 to 0.9, computed assuming a linear weighting scheme [22]:

  w_ij = 1 − |i − j| / (k − 1)                                                          (9)

Specifically, the simulation study has been developed considering two raters classifying n items into one of k possible ordinal rating categories. The data have been simulated by sampling r = 2000 Monte Carlo data sets from a multinomial distribution with parameters n and p = (π11,…, πij,… πik), being πij the probability that an item is classified into category ith by the first rater and into jth category by the second rater. For each rating scale dimension and assuming a linear weighting scheme, the values of the joint probabilities πij have been chosen so as to obtain the six true population values of agreement (viz. 0.4, 0.5, 0.6, 0.7, 0.8, 0.9) for a total of 18 different vectors p for each sample size; for example p = (0.22, 0.06, 0.05, 0.06, 0.22, 0.06, 0.05, 0.06, 0.22) for obtaining an agreement level equal to 0.5 with k = 3 rating categories. The four probabilistic benchmark procedures have been applied to characterize the simulated AC2 and BPw agreement coefficients across all different settings and their performances have been evaluated in terms of the weighted proportion of misclassified benchmarks (weighted misclassification rate, Mw). The simulation results obtained for each coefficient and each combination of n and k values are represented in the bubble chart in Figure 1, where the size of each bubble expresses the Mw value. Since the parametric benchmarking procedures apply only under asymptotic conditions, the Figure 1 is divided into 2 sections by a dashed line: on the left side only the non-parametric procedure are compared each other (i.e. two overlapping bubbles for LBBCa and LBp, respectively), whereas the right side refers to all the benchmarking procedures under comparison (i.e. four overlapping bubbles for LBBCa, LBp, KL and LBN, respectively). The bubble chart displays all the 24 analyzed comparisons: for each of them, the foreground bubble represents the benchmarking procedure with the best performance (i.e. the one with the smallest Mw), whereas the background bubble represents the procedure with the worst performance (i.e. the one with the highest Mw, whose value is reported in the label). For small sample sizes Mw slightly differs across non-parametric benchmarking procedures — with a difference no more than 5% — and agreement coefficients. The results seem to suggest that the best choice is benchmarking the lower bound of the percentile CI for AC2 and benchmarking the lower bound of the BCa CI for BPw. For large sample sizes, instead, Mw is comparable across benchmarking procedures and agreement coefficients.
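The data-generation step described above can be sketched as follows; the joint probabilities are the example vector quoted in the text (k = 3, true agreement 0.5) and the seed is arbitrary.

```python
import numpy as np

# Joint cell probabilities pi_ij for k = 3 and a true agreement level of 0.5.
p = np.array([0.22, 0.06, 0.05,
              0.06, 0.22, 0.06,
              0.05, 0.06, 0.22])
k, n, R = 3, 30, 2000
rng = np.random.default_rng(2017)

tables = rng.multinomial(n, p, size=R).reshape(R, k, k)   # R simulated k x k rating tables
# Each simulated table can then be fed to the coefficient and benchmarking functions
# sketched earlier, and the weighted misclassification rate Mw estimated over the R replications.
```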


Specifically, it is worthwhile to pinpoint that the differences in Mw across non-parametric benchmarking procedures and agreement coefficients get smaller as n increases because of the decreasing skewness in the distributions of the agreement coefficients: if the distribution is symmetric, the BCa and percentile CIs agree.

Fig. 1. Mw for BPw and AC2, for different benchmarking procedures, n and k values.

6. Conclusions

One of the main issues related to the widely adopted agreement coefficients regards the characterization of the extent of agreement. Most research studies characterize the extent of agreement by comparing the obtained agreement coefficient against well-known threshold values, like 0.5 or 0.75, whereas only a few research studies adopt probabilistic approaches, overcoming the straightforward comparison. However, the probabilistic approaches commonly adopted in the literature are generally based on parametric asymptotic CIs that, by definition, are applicable only for large sample sizes, so that small sample sizes become the most critical for statistical inference, although they are the most affordable in many experimental contexts.


The conducted Monte Carlo simulation study suggests that the non-parametric probabilistic benchmarking procedures based on bootstrap resampling have satisfactory and comparable (with a difference up to 5%) properties for moderate or small numbers of rated items. Specifically, with n = 30 the performances of the procedures based on bootstrap CIs differ from each other by at most 2%; therefore, benchmarking the lower bound of the percentile bootstrap CI can be suggested because of its lower computational burden. Otherwise, with large sample sizes, the performances being indistinguishable across all benchmarking procedures, parametric procedures should be preferred because of their lower computational complexity.

References

1. J. Cohen, A coefficient of agreement for nominal scales, EPM 20(1), 37–46 (1960).
2. W. D. Thompson and S. D. Walter, A reappraisal of the kappa coefficient, J Clin Epidemiol 41(10), 949–58 (1988).
3. K. L. Gwet, Handbook of inter-rater reliability: The definitive guide to measuring the extent of agreement among raters (Advanced Analytics, LLC, 2014).
4. J. Cohen, Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit, Psychological Bulletin 70(4), 213–219 (1968).
5. R. L. Brennan and D. J. Prediger, Coefficient Kappa: Some Uses, Misuses, and Alternatives, EPM 41, 687–699 (1981).
6. A. Feinstein and D. Cicchetti, High agreement but low kappa: I. The problems of two paradoxes, J Clin Epidemiol 43(6), 543–549 (1990).
7. L. Guttman, The test-retest reliability of qualitative data, Psychometrika 11, 81–95 (1945).
8. E. M. Bennett, R. Alpert and A. C. Goldstein, Communications through limited response questioning, Public Opin Q 18(3), 303–308 (1954).
9. J. W. Holley and J. P. Guilford, A note on the G index of agreement, EPM 24, 749–753 (1964).
10. S. Janson and J. Vegelius, On generalizations of the G index and the Phi coefficient to nominal scales, Multivariate Behav Res 14(2), 255–269 (1979).
11. K. L. Gwet, Computing inter-rater reliability and its variance in the presence of high agreement, Br J Math Stat Psychol 61, 29–48 (2008).
12. D. Hartmann, Considerations in the choice of interobserver reliability estimates, J Appl Behav Anal 10(1), 103–116 (1977).
13. J. R. Landis and G. G. Koch, The measurement of observer agreement for categorical data, Biometrics 33(1), 159–174 (1977).
14. J. L. Fleiss, Statistical Methods for Rates and Proportions (John Wiley & Sons, 1981).
15. D. G. Altman, Practical Statistics for Medical Research (Chapman and Hall, 1991).
16. P. E. Shrout, Measurement reliability and agreement in psychiatry, Stat Methods Med Res 7(3), 301–317 (1998).
17. S. R. Munoz and S. I. Bangdiwala, Interpretation of Kappa and B statistics measures of agreement, J Appl Stat 24(1), 105–112 (1997).
18. J. Kottner, L. Audige, S. Brorson, A. Donner, B. J. Gajewski, A. Hrobjartsson, C. Roberts, M. Shoukri and D. L. Streiner, Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were proposed, Int J Nurs Stud 48, 661–671 (2011).
19. N. Klar, S. Lipsitz, M. Parzen and T. Leong, An exact bootstrap confidence interval for κ in small samples, J R Stat Soc Ser D (The Statistician) 51(4), 467–478 (2002).
20. J. Lee and K. P. Fung, Confidence interval of the kappa coefficient by bootstrap resampling [letter], Psychiatry Research 49, 97–98 (1993).
21. J. Carpenter and J. Bithell, Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians, Statist Med 19, 1141–64 (2000).
22. D. V. Cicchetti and T. Allison, A new procedure for assessing reliability of scoring EEG sleep recordings, Amer J EEG Technol 11(3), 101–10 (1971).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 375–382)

Regularisation of central-difference method when applied for differentiation of measurement data in fall detection systems* Jakub Wagner and Roman Z. Morawski Institute of Radioelectronics and Multimedia Technology, Faculty of Electronics and Information Technology, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland E-mail: [email protected], [email protected] The research reported in this paper is related to the depth sensor technology when applied in care services for the elderly and disabled persons. It is focused on a system for nonintrusive fall detection, in which the trajectory of the mass centre of a monitored person is estimated on the basis of depth-sensor data and differentiated numerically in order to estimate the velocity of that person. Several differentiation methods, based on different strategies for regularisation of the central-difference method, are compared in terms of their applicability in that system using synthetic data and real-world data. The method based on weighted summing of derivative estimates — the estimates obtained using various values of the differentiation step — with weights selected using the discrepancy principle, provides the most promising results. Keywords: Numerical Differentiation; Regularisation; Depth Sensor; Fall Detection.

1. Introduction

The research reported in this paper is focused on a non-intrusive system for monitoring of human movements, based on depth sensors. Such a system provides data representative of the trajectory of the monitored person's mass centre [1]. Numerical differentiation of that trajectory allows for the estimation of the monitored person's velocity, which is useful in some healthcare applications, such as the detection of that person's falls.

* This work has been initiated within the project PL12-0001 financially supported by EEA Grants — Norway Grants (http://eeagrants.org/project-portal/project/PL12-0001), and continued within the statutory project supported by the Institute of Radioelectronics and Multimedia Technology, Faculty of Electronics and Information Technology, Warsaw University of Technology.

The following rules for generation of mathematical symbols are applied throughout this paper:




x , x , X — the values of x, x and X, subject to source errors; xˆ, xˆ , Xˆ — the values of x, x and X, subject to estimation errors; x , x , X — a scalar random variable, a vector of scalar random variables and a matrix of scalar random variables;   ,  — an absolute increment and a relative increment, in particular — an absolute error and a relative error. The component of the monitored person’s movement trajectory, corresponding to a selected spatial dimension, can be modelled by a real-valued function x  f  t  , where t is a variable modelling time. The estimation of the velocity of the monitored person’s movement requires the differentiation of f  t  on the basis of its error-corrupted values, viz.:

x n  f  tn   n

where tn  t0  nt ; t   t N  t0  N

for n  0,  , N

(1)

is the differentiation step; n are

realisations of identical, independent, zero-mean random variables n whose standard deviation is   . An estimate of   is assumed to be available a priori.

The error of differentiation by means of the central-difference method (called CD method hereinafter): 1 xˆn    xn 1  xn 1  2t for n  1, , N  1

(2)

depends on t in a way resulting from the proportion of its two components: a component due to the approximation of the derivative (which is growing with t ), and a component resulting from the propagation of the errors in the data to the result of differentiation (which is diminishing with t ); therefore, the differentiation step t plays the role of a regularisation parameter. It can be selected via minimisation of the expanded uncertainty of estimation errors 1 1 1 xˆ n   xˆn   f    tn  , defined as: 1 1 1 U  xˆ n    E  xˆ n    k Var  xˆ n        

which yields: where f 

3

 tn 

tnopt  1.28 3 k 

f

3

with k  3,  , 6

 tn 

(3)

(4)

is the value of the third derivative of f  t  , which — in

practical situations — is unknown. Various ways for overcoming this obstacle can be found in the literature; the aim of this study is to compare their performance when applied in a system for monitoring movements of persons, in particular — for detection of their falls.
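The two basic relations above can be illustrated with a short Python sketch. It is only an illustration under the reconstructed form of Eq. (4); the example signal and noise level are invented.

```python
import numpy as np

def cd_derivative(x, dt):
    """Central-difference estimate (Eq. 2) of the first derivative of noisy samples x."""
    d = np.full_like(x, np.nan, dtype=float)
    d[1:-1] = (x[2:] - x[:-2]) / (2.0 * dt)
    return d

def optimal_step(sigma, f3, k=3):
    """Near-optimal differentiation step of Eq. (4); f3 stands for |f'''(t_n)|,
    which is unknown in practice and has to be estimated."""
    return 1.28 * (k * sigma / np.abs(f3)) ** (1.0 / 3.0)

# Hypothetical example: f(t) = t*sin(t) sampled with additive Gaussian noise.
t = np.linspace(0.0, 5.0, 101)
rng = np.random.default_rng(1)
x = t * np.sin(t) + rng.normal(0.0, 0.05, t.size)
print(cd_derivative(x, t[1] - t[0])[1:4])
```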


2. Compared methods of numerical differentiation

2.1. Methods based on selection of best estimate

The methods of numerical differentiation described in this subsection consist in the selection of a near-optimal value of the differentiation step Δt̂_n^opt for each time instant t_n, n = 0, ..., N, and estimation of the derivative using the central-difference formula modified in the following way:

  x̂_n^(1) = [ f̃(t_n + Δt̂_n^opt; x̃) − f̃(t_n − Δt̂_n^opt; x̃) ] / (2Δt̂_n^opt)             (5)

where f̃(t; x̃) is the piecewise-linear interpolation of x̃ = [x̃_1, ..., x̃_N]^T.

The differentiation step may be selected by means of the algorithm described in [2], which consists of the following operations:
(i) Select a finite set Δt of possible values of the differentiation step, Δt = {Δt_i | i = 1, ..., I}.
(ii) For n = 1, ..., N − 1:
  (a) Compute x̂_n^(1)(Δt_i) = [ f̃(t_n + Δt_i; x̃) − f̃(t_n − Δt_i; x̃) ] / (2Δt_i) for each Δt_i ∈ Δt.
  (b) Select the largest Δt_i ∈ Δt which satisfies the condition:

  | x̂_n^(1)(Δt_i) − x̂_n^(1)(Δt_j) | ≤ 2√3 σ̃_Δx / Δt_j    for each pair Δt_i and Δt_j ≤ Δt_i     (6)

where 3σ̃_Δx is an estimate of the upper limit of the errors corrupting the data. This method of differentiation will be called S1 method hereinafter.

An alternative approach consists in estimation of f^(3)(t_n) by means of the S1 method, and using the result for the selection of Δt̂_n^opt according to Eq. (4) with k = 3. The method of numerical differentiation following this approach will be called S2 method hereinafter.

2.2. Methods based on fusion of estimates

The methods of numerical differentiation described in this subsection consist in computing a weighted sum of estimates of the derivative, obtained by means of the CD method on the basis of various values of the differentiation step:

  x̂_n^(1) = Σ_{m=1}^{M} w_m (x̃_{n+m} − x̃_{n−m}) / (2mΔt)    for n = M, ..., N − M      (7)

where w_m, m = 1, ..., M, are the weights, and M is a parameter determining the largest value of the differentiation step taken into account.


In the simplest case, equal weights w_m = 1/M, m = 1, ..., M, may be assigned to each value of the differentiation step; such a method will be called F1 method hereinafter.

Since numerical differentiation by means of the central-difference method is, for a given time instant, equivalent to the analytical differentiation of a linear function approximating the data in the neighbourhood of that time instant, the weights can be optimised on the basis of the residual errors of such a function [3] using the discrepancy principle [4]:

  w_m = ρ_m / Σ_{μ=1}^{M} ρ_μ                                                           (8)

where:

  ρ_m = (2m + 1) σ̃_Δ² / { 2 Σ_{ν=n−m}^{n+m} [ x̃_ν − x̃_n − (ν − n)(x̃_{n+m} − x̃_{n−m})/(2m) ]² }     (9)

The method in which the above weights are used for the fusion of derivative estimates will be called F2 method hereinafter.

It can be seen from Eq. (7) that the fusion of the derivative estimates, obtained using different step values, is equivalent to filtering the data by means of an antisymmetric finite-impulse-response filter. Therefore, the weights w_m can be selected via optimisation of the frequency characteristics of such a filter. A set of weights, equivalent to a low-pass filter yielding accurate estimates of the derivatives of polynomial functions, has been designed in [5]. The method of numerical differentiation, based on that set of weights, will be called F3 method hereinafter.

The Savitzky-Golay algorithm can also be used for the selection of weights in Eq. (7); this approach is equivalent to the least-squares local polynomial approximation of the data, x̂ = Ψ(x̃; d, l), and differentiation of the resulting polynomial [6]. The degree of that polynomial, d, and the size of the time interval in which the data are approximated, l, can be selected via minimisation of the so-called Stein's unbiased risk estimator (SURE) [7]:

  SURE(d, l) = ||x̂ − x̃||² + 2σ_Δ² tr[D(d, l)] − Nσ_Δ²                                   (10)

where D(d, l) = ∂Ψ(x̃; d, l)/∂x̃ is the matrix of partial derivatives of the function Ψ(x̃; d, l). The method of numerical differentiation, based on a set of weights obtained using the Savitzky-Golay algorithm with parameter values resulting from minimisation of the SURE criterion, will be called F4 method hereafter. In the reported study, the minimum of the SURE criterion has been found by means of systematic search in a predefined space of admissible d and l values.
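The fusion of Eq. (7) is straightforward to implement; the sketch below uses equal weights by default (F1 method) and accepts externally computed weights, for instance weights of the form of Eq. (8), to obtain the F2 method. It is an illustration, not the authors' implementation.

```python
import numpy as np

def fused_derivative(x, dt, M=3, weights=None):
    """Weighted sum of central-difference estimates with steps dt, 2*dt, ..., M*dt (Eq. 7).
    Equal weights give the F1 method; data-driven weights give the F2 method."""
    x = np.asarray(x, dtype=float)
    N = x.size
    if weights is None:
        weights = np.full(M, 1.0 / M)          # F1: equal weights
    d = np.full(N, np.nan)
    for n in range(M, N - M):
        terms = [(x[n + m] - x[n - m]) / (2.0 * m * dt) for m in range(1, M + 1)]
        d[n] = np.dot(weights, terms)
    return d
```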


2.3. Reference method

The performance of the methods of numerical differentiation described in Subsections 2.1 and 2.2 has been compared with the performance of a method based on the smoothing approximation of the data using differentiable compact-support functions, described in [8], called SA method hereinafter.

3. Methodology of experimentation

3.1. Experiments based on synthetic data

The synthetic data, used for the numerical experiments reported in this paper, have been generated using two functions, defined by the following formulae:

  f_1(t) = t sin(t),    f_2(t) = f_1(4t)                                                (11)

The additively disturbed data have been generated according to the formula:

  x̃_n = f_i(t_n) + Δx̃_n    for n = 0, ..., N and i = 1, 2                               (12)

where t_n = 5n/N, N = 100, and Δx̃_n are pseudorandom numbers following a zero-mean normal distribution with the standard deviation σ_Δ.

The experiments have been performed for several values of σ_Δ. The level of disturbances in the data has been characterised by the signal-to-noise ratio, defined in the following way:

  SNR_0 = 10 log[ Σ_{n=1}^{N} f²(t_n) / Σ_{n=1}^{N} (x̃_n − f(t_n))² ]                   (13)

For each value of SNR_0, R = 30 data sets have been generated using different pseudorandom sequences {Δx̃_{r,n} | n = 0, ..., N}. The differentiation methods have been compared in terms of the signal-to-noise ratio in the estimates x̂_n^(1), defined in the following way:

  SNR_1 = 10 log[ Σ_{n=1}^{N} (f^(1)(t_n))² / Σ_{n=1}^{N} (x̂_n^(1) − f^(1)(t_n))² ]     (14)
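The two ratios of Eqs. (13)-(14) share the same structure and can be computed with one helper function, sketched below under the assumption that the logarithm is base-10 (decibels).

```python
import numpy as np

def snr_db(reference, estimate):
    """Signal-to-noise ratio in dB of an estimate with respect to a reference signal,
    as in Eqs. (13)-(14)."""
    reference, estimate = np.asarray(reference), np.asarray(estimate)
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum((estimate - reference) ** 2))
```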









3.2. Experiments based on real-world data

The experiments reported in this paper have been conducted using data from Kinect devices, included in two publicly available datasets:  the UR Fall Detection Dataset – available at http://fenix.univ.rzeszow.pl/ mkepski/ds/uf.html [9] – including data representative of 60 recordings of falls of persons and 40 recordings of non-fall movements (called non-falls hereinafter);



• the IRMT Fall Detection Dataset – available at http://home.elka.pw.edu.pl/~pmazurek/ [1] – including data representative of 80 recordings of falls of persons and 80 recordings of non-falls.

The experiments have been conducted for each dataset separately and for a dataset being the combination of both of them. The lengths of the recordings, captured with a rate of 30 frames per second, varied between ca. 2 s and ca. 13 s; in order to obtain a uniform dataset, the recordings have been divided into sequences of N = 15 images — each sequence corresponding to ca. 0.5 s — and labelled manually. Sequences of the three-dimensional coordinates of the monitored person's mass centre, x̃_n, ỹ_n and z̃_n, have been computed on the basis of the depth images using a procedure described in [1]. The standard deviation of the errors corrupting the coordinate values, σ_Δ, has been estimated according to a procedure described in [8]. The sequences of the space coordinates have been differentiated by means of all the compared methods in order to obtain the estimates of the corresponding components of the velocity, x̂_n^(1), ŷ_n^(1) and ẑ_n^(1). A set of features, based on those estimates, described in [8], has been determined for each sequence. The results have been compared using the leave-one-out cross-validation procedure. For the i-th sequence, i = 1, ..., I, the following operations have been completed:
• training of a Naïve Bayes classifier using the feature vectors corresponding to all the sequences except the i-th sequence;
• estimation of the probability p̂_i that the i-th sequence represents a fall by applying the trained Naïve Bayes classifier to the feature vector corresponding to that sequence;
• classification of the i-th sequence as a fall if p̂_i exceeds a threshold value p_THR, or as a non-fall otherwise.

The results of classification have been evaluated by generating the so-called receiver-operator-characteristic curves [10]. The area under those curves (AUC) — which approaches the value of 1 for a perfect classifier, and the value of 0.5 for a classifier providing right and wrong decisions with equal probability — has been used as the indicator of the classification performance.

4. Results of experiments

The dependence of the ratio SNR1 SNR0 on SNR0 , obtained in experiments based on synthetic data, is presented in Fig. 1. The values of AUC, corresponding to the results of fall detection based on different sets of feature values — the values computed using velocity estimates obtained by means of each of the described differentiation methods — are presented in Table 1.

[Fig. 1 consists of two panels, x = f_1(t) (left) and x = f_2(t) (right), showing SNR_1/SNR_0 versus SNR_0 [dB] over the range 5–30 dB.]

Fig. 1. Dependence of SNR1 SNR0 on SNR0 for the estimates of the derivative, obtained by means of the compared differentiation methods for synthetic data.

Table 1. Values of AUC corresponding to the results of fall detection, based on the velocity estimates obtained by means of the compared methods of numerical differentiation.

                            AUC
  Method    UR dataset    IRMT dataset    Combined dataset
  CD        0.934         0.942           0.922
  S1        0.987         0.922           0.848
  S2        0.964         0.892           0.852
  F1        0.981         0.980           0.981
  F2        0.982         0.981           0.984
  F3        0.973         0.967           0.969
  F4        0.981         0.976           0.943
  SA        0.980         0.930           0.888

5. Conclusion

The F4 and SA methods have outperformed other methods of numerical differentiation in the experiments based on synthetic data. This has been made possible by the exhaustive optimisation of their parameters, i.e., at the cost of significantly larger computation time. On the other hand, in the experiments based on real-world data, the F2 method has provided significantly better results than the F4 and SA methods. This apparent contradiction is a consequence of the fact that the results of experiments based on real-world data have been compared in terms of the accuracy of fall detection, which has involved processing of the derivative


estimates; in the case of synthetic data, the reference data have been available a priori, and the results have been compared in terms of the accuracy of unprocessed derivative estimates. The presented results indicate that the differentiation method should be optimised specifically for a target monitoring system, and that the approach underlying the F2 method is the most promising one.

References

1. P. Mazurek, J. Wagner and R. Z. Morawski, Use of kinematic and mel-cepstrum-related features for fall detection based on data from infrared depth sensors, Biomedical Signal Processing and Control, 40 (2018).
2. S. Lu and S. Pereverzev, Numerical differentiation from a viewpoint of regularization theory, Mathematics of Computation, 75, 256 (2006).
3. M. Niedźwiecki, Easy recipes for cooperative smoothing, Automatica, 46, 4 (2010).
4. O. Scherzer, The use of Morozov's discrepancy principle for Tikhonov regularization for solving nonlinear ill-posed problems, Computing, 51, 1 (1993).
5. P. Holoborodko, Smooth Noise Robust Differentiators (2008), http://www.holoborodko.com/pavel/numerical-methods/numerical-derivative/smooth-low-noise-differentiators/.
6. R. W. Schafer, What is a Savitzky-Golay filter?, IEEE Signal Processing Magazine, 28, 4 (2011).
7. S. R. Krishnan and C. S. Seelamantula, On the selection of optimum Savitzky-Golay filters, IEEE Transactions on Signal Processing, 61, 2 (2013).
8. J. Wagner, P. Mazurek and R. Z. Morawski, Regularized numerical differentiation of depth-sensor data in a fall detection system, in Proc. 2017 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA).
9. B. Kwolek and M. Kepski, Human fall detection on embedded platform using depth maps and wireless accelerometer, Computer Methods and Programs in Biomedicine, 117, 3 (2014).
10. P. Cichosz, Data Mining Algorithms: Explained Using R (John Wiley & Sons, 2014).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 383–400)

Polynomial estimation of the measurand parameters for samples from non-Gaussian distributions based on higher order statistics Zygmunt Lech Warsza Industrial Research Institute for Automation and Measurements PIAP, Al. Jerozolimskie 202 02 486 Warszawa, Poland E-mail: [email protected] Sergiej V. Zabolotnii Cherkasy State Technological University, Cherkasy, Ukraine E-mail: [email protected] This paper proposes an unconventional method (PMM) for evaluating the uncertainty of the estimator of measurand value obtained from the non-Gaussian distributed samples of measurement data with a priori partial description (unknown PDF). This method of statistical estimation is based on the apparatus of stochastic polynomial maximization and uses the higher-order statistics (moment and cumulant description) of random variables. The analytical expressions for estimates of uncertainty, obtained with use the polynomial of the degree r = 2 for samples from population of asymmetrical pdf and degree r = 3 — for symmetrical pdf, are given. It is shown that these uncertainties are generally smaller than the uncertainty based only on the arithmetic average, as it is in GUM. Reducing the value of estimated uncertainty of measurement depends on the skewness and kurtosis of samples from asymmetrical pdf or on kurtosis and six order moment of samples from symmetrical pdf. The results of statistical modeling carried out on the basis of the Monte Carlo method confirm the effectiveness of the proposed approach. Keywords: estimator, non-Gaussian model, stochastic polynomial, means value, variance, skewness and kurtosis.

1. Introduction

In statistical processing of measurements with multiple observations, different mathematical models of random errors are used. Typically, random errors are characterized by a unimodal symmetric law of probability distribution: Gaussian [1], uniform, triangular, trapezoidal, Laplace, and others [2]. However, studies of the real-world distributions of measured data show that in some cases the probability distribution is not symmetrical because of the


non-linear processing of measurement signals with the presence of asymmetric unknown systematic errors and/or random errors [3-5]. This leads to a certain difficulty of using the measurement uncertainty framework GUM [1]. One of the main ways of overcoming these problems is to use the Bayesian approach and obtain the estimate of uncertainty based on the maximum likelihood method. This requires building mathematical models based on the description in the form of the probability density function of the input data. Thus, in order to properly select the appropriate methods of calculating the measurement uncertainty it is necessary to carry out the preliminary identification and choose the approximating distribution law adequate to the specific measuring task [6]. Such approximation problems are solved both by using analytical methods and by statistical modeling based on the Monte Carlo method (Suppl. 1 of [1]). The Bayesian approach is connected with the requirement of a priori information about the form of the distribution, as well as with the potentially high complexity of its implementation and of the analysis of its properties. Therefore, we propose to use an alternative approach for the estimator of the measurand value and its uncertainty evaluation based on higher order statistics [7]. Also, a description by cumulants is used. The cumulant function is the Laplace transform of the probability distribution. The cumulants of a random variable ξ are the coefficients of the expansion of the logarithm of the characteristic function of its probability density in the Taylor-MacLaurin series [8]:

  κ_i = (1/j^i) [ d^i ln f(u) / du^i ]_{u=0},    f(u) = ∫ p(x) e^{jux} dx,

where p(x) is the probability density function (PDF).

There are known relationships between cumulants and moments, for example:

  κ_1 = μ_1′,  κ_2 = μ_2,  κ_3 = μ_3,  κ_4 = μ_4 − 3μ_2²,
  κ_5 = μ_5 − 10μ_2μ_3,  κ_6 = μ_6 − 15μ_2μ_4 − 10μ_3² + 30μ_2³, ...

where μ_1′ is the first initial moment, and κ_i, μ_i are the cumulants and central moments of order i of the distribution. To analyze the probability properties of random variables, the dimensionless (normalized) parameters — cumulant coefficients γ_i = κ_i / κ_2^{i/2} — are very useful. The most important are the coefficients of asymmetry γ_3 and of kurtosis γ_4. The cumulant coefficient of 4-th order (kurtosis) is studied in detail in the literature, e.g. [9, 10].
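The cumulant coefficients γ_3 and γ_4 can be estimated directly from the central moments, as in the short illustrative Python sketch below (the exponential sample is used only because its theoretical values, γ_3 = 2 and γ_4 = 6, are easy to check).

```python
import numpy as np

def cumulant_coefficients(x):
    """Sample skewness gamma_3 and (excess) kurtosis gamma_4, gamma_i = kappa_i / kappa_2**(i/2),
    computed from central moments using the moment-cumulant relations above."""
    x = np.asarray(x, dtype=float)
    m = x - x.mean()
    mu2, mu3, mu4 = np.mean(m ** 2), np.mean(m ** 3), np.mean(m ** 4)
    g3 = mu3 / mu2 ** 1.5                     # kappa_3 / kappa_2^(3/2)
    g4 = (mu4 - 3.0 * mu2 ** 2) / mu2 ** 2    # kappa_4 / kappa_2^2
    return g3, g4

rng = np.random.default_rng(0)
print(cumulant_coefficients(rng.exponential(1.0, 10000)))   # roughly (2, 6)
```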


2. The purpose of research

This paper discusses the application of a new unconventional approach in solving problems of finding estimates of values in multiple measurements under the influence of random errors. As the mathematical tool, this approach applies the stochastic polynomial maximization method (PMM) proposed by Kunchenko [11]. This mathematical apparatus is used in a variety of areas related to statistical data processing, such as: pattern recognition based on template matching technology [12], probabilistic diagnosis of disorder (change-point problem) [13], and detection of signals against non-Gaussian noise and parameter estimation [14].

The objectives of this work include:
• application of the polynomial maximization method (PMM) based on cumulants to the synthesis of algorithms of estimation of measurand parameters for the model of asymmetrically or symmetrically distributed random errors,
• theoretical analysis of the accuracy of the polynomial estimates,
• investigation of the effectiveness of those algorithms using statistical modeling.

3. The polynomial maximization method (PMM)

If θ is a measured value determined as a result of repeated observations, which are characterized by the presence of measurement errors, then the set of measurement results can be interpreted as a sample x = {x_1, x_2, ..., x_n} consisting of n independent and identically distributed random components that are described by the model x_v = θ + ξ_v. In this model, θ is a permanent component (the value of the measurand), and ξ_v is a randomly-distributed variable (measurement error) with given probabilistic properties. According to the algorithm of PMM [11], the estimate of parameter θ can be found as the solution of the stochastic power equation of the estimated parameter:

  Σ_{i=1}^{r} h_i(θ) [ m_i − α_i(θ) ] |_{θ=θ̂} = 0,                                      (1)

where r is the order of the polynomial used for parameter estimation, α_i(θ) are the theoretical (population) initial moments of the i-th order, and m_i = (1/n) Σ_{v=1}^{n} x_v^i are the experimental (sample) initial moments of the i-th order.


Coefficients hi   (for i  1, r ) can be found by solving the system of linear algebraic equations given by conditions of minimization of variance (with the appropriate order r) of the estimate of parameter  , namely:

d  h   F    d    , r

i 1

i

i, j

(2)

j

where: j  1, r , Fi , j     i j     i   j   .

Equations (2) are linear and can be solved analytically using the Cramer method, i.e. i r hi    , (2a) r where: i  1, r ,  r  det Fi , j ; ( i, j  1, r ) is the volume of the stochastic

polynomial of dimension r,  i r — the determinant obtained from  r by replacing the i-th column by the column of free terms of eq. (2).

In [11] it was shown that polynomial evaluations ˆ , which are the solutions of stochastic equations of the form (1), are consistent and asymptotically unbiased. To calculate the evaluation of uncertainty, it is necessary to find the volume of extracted information on the estimated parameters  , which is generally described as: r d J r n    n hi   i  . (3) d i 1



The statistical sense of function J r n   is similar to the classical Fisher concept

of information quantity, since for n   its inverse approaches to the variance of estimates:  (2 ) r  lim J rn1  . (4) n 

4. Polynomial estimation of the distribution parameters

Using the general formula (1), it can be shown that for the degree of the polynomial-stochastic r  1 , the estimation ˆ of parameter  based on PMM 1

as measurand value (result of measurements) can be found as the solution of equation:



  n h1  m1  1    ˆ  h1   xv      v 1

 ˆ

0.

(5)


It is obvious that an arbitrary factor h1    0 in expression for finding the estimated parameter can be transformed into a linear statistics, e.g. the arithmetic average:

ˆ 1  m1   xv . 1 n n v1

(6)

Parameter ˆ of the form (6) is the estimate of mathematical expectation of the random variable found by the method of moments (MM). The estimates of the form (6) are optimal with respect to minimization of their dispersion only when the random variable has Gaussian distribution and its values taken randomly  x   x1 , x2 ,... xn  are uncorrelated. If the error distribution of measurand differs from the Gaussian law, there are alternative methods of finding estimates, based on nonlinear transformations, which can provide a reduction of the resulting uncertainty. Below we consider the way of constructing nonlinear estimates of parameters, which are based on the use of exponential polynomials. Mathematical model of errors can also be described by a given sequence of cumulants. The first cumulant 1  0 determines scattering bias (systematic error), the second order cumulant  2 determines a variance of the random error component, and cumulant coefficients of higher orders  3 ,  4 , etc. numerically describe the degree of deviation of random errors from the Gauss distribution. In the PMM using polynomials of degree r  2 , estimation of parameter  is the solution of an equation (taking into account the normalization): h1  m1  1    h2  m2   2   ˆ 









 n   n 2   0, h1   xv     h2   xv   2   2   v1   v1   ˆ

(7)

where h_1(θ) and h_2(θ) are the optimal coefficients which, for r = 2, minimize the variance of the searched estimate of parameter θ. The coefficients h_1(θ) and h_2(θ) are found by solving a system of two linear equations of the form (2) and are given by

$$h_1(\theta) = \frac{2\gamma_3\,\theta + (2+\gamma_4)\sqrt{\chi_2}}{\chi_2^{3/2}\,\big(2+\gamma_4-\gamma_3^2\big)}, \qquad h_2(\theta) = \frac{-\gamma_3}{\chi_2^{3/2}\,\big(2+\gamma_4-\gamma_3^2\big)}, \qquad (8)$$

where γ_3 = χ_3/χ_2^{3/2} and γ_4 = χ_4/χ_2^{2}.

After inserting the coefficients (8) into (7), the equation for parameter θ becomes

$$\gamma_3\,\theta^2 - \Big[2\gamma_3 m_1 - (2+\gamma_4)\sqrt{\chi_2}\Big]\theta - (2+\gamma_4)\sqrt{\chi_2}\,m_1 + \gamma_3\big(m_2-\chi_2\big)\Big|_{\theta=\hat{\theta}} = 0. \qquad (9)$$

The analysis shows that in the case of a symmetrical error distribution (γ_3 = 0) the quadratic equation (9) reduces to a linear one with a single root, identical to (6). If γ_3 ≠ 0, equation (9) has two roots:

$$\hat{\theta}_{2(1,2)} = m_1 - \frac{(2+\gamma_4)\sqrt{\chi_2}}{2\gamma_3} \pm \sqrt{\left[\frac{(2+\gamma_4)\sqrt{\chi_2}}{2\gamma_3}\right]^2 - \big(m_2 - m_1^2 - \chi_2\big)}. \qquad (10)$$

In the PMM, when there are multiple solutions, one should take as the required result the real root of equation (9) that maximizes the amount of extracted information (3). In the experimental studies presented below, the selection of the optimal root for the estimate θ̂ is determined by the sign of the asymmetry (skewness) cumulant coefficient γ_3. Thus, the final equation for the estimate of θ using polynomials of degree r = 2 can be written as

$$\hat{\theta}_2 = \hat{\theta}_1 + \Delta(\theta)_2, \qquad (11)$$

where the correction factor in expanded form is

$$\Delta(\theta)_2 = \operatorname{sign}(\gamma_3)\sqrt{\left[\frac{(2+\gamma_4)\sqrt{\chi_2}}{2\gamma_3}\right]^2 - \left[\frac{1}{n}\sum_{v=1}^{n}x_v^2 - \Big(\frac{1}{n}\sum_{v=1}^{n}x_v\Big)^{2} - \chi_2\right]} - \frac{(2+\gamma_4)\sqrt{\chi_2}}{2\gamma_3}. \qquad (11a)$$
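As a minimal illustration only (not part of the original algorithm), the estimate (11)–(11a) can be evaluated as sketched below, assuming Python/NumPy; the function name and the guard against a slightly negative discriminant are our own conventions.

```python
import numpy as np

def pmm2_estimate(x, chi2, gamma3, gamma4):
    """Second-order PMM estimate (11)-(11a) of a location parameter theta.

    x      : 1-D array of observations x_v = theta + eps_v
    chi2   : assumed variance (2nd cumulant) of the errors
    gamma3 : assumed skewness coefficient of the errors
    gamma4 : assumed excess-kurtosis coefficient of the errors
    """
    x = np.asarray(x, dtype=float)
    m1 = x.mean()                      # linear (method-of-moments) estimate, eq. (6)
    if gamma3 == 0.0:
        return m1                      # symmetric errors: (9) degenerates to (6)
    c = (2.0 + gamma4) * np.sqrt(chi2) / (2.0 * gamma3)
    s = np.mean(x**2) - m1**2 - chi2   # sample variance minus assumed variance
    disc = max(c**2 - s, 0.0)          # guard against a slightly negative discriminant
    delta = np.sign(gamma3) * np.sqrt(disc) - c   # correction factor, eq. (11a)
    return m1 + delta                  # eq. (11)
```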

5. Analysis of the 2nd-degree polynomial estimates

To compare the uncertainty of the estimates obtained by the PMM with that of the classical method of moments MM (as in the GUM), the dimensionless coefficient of variance reduction is proposed:

$$g(\theta)_r = \frac{\sigma^2_{(\theta)r}}{\sigma^2_{(\theta)1}}. \qquad (12)$$

This coefficient is the ratio of the variance σ²_{(θ)r} of the estimate θ̂_{(r)} of measurand θ, obtained with a polynomial of r-th order, to the variance σ²_{(θ)1} of the linear estimate θ̂_1 of (6) found by the method of moments (equivalent to using a polynomial of degree r = 1 in the PMM). The linear estimate θ̂_1 of the form (6) is unbiased and consistent. Its variance σ²_{(θ)1} does not depend on the value of the estimated parameter, but is determined only by the variance (second-order cumulant χ_2) of the random component of the measurement errors and by the sample size n:

$$\sigma^2_{(\theta)1} = \frac{\chi_2}{n}. \qquad (13)$$

Using expressions (8) for the optimal PMM coefficients and the general formula (3), we obtain the amount of information J_{2n}(θ) about the estimated parameter θ extracted from a sample of size n using stochastic polynomials of degree r = 2:

$$J_{2n}(\theta) = n\left[h_1(\theta)\frac{d}{d\theta}\alpha_1(\theta) + h_2(\theta)\frac{d}{d\theta}\alpha_2(\theta)\right] = n\,\frac{2+\gamma_4}{\chi_2\,\big(2+\gamma_4-\gamma_3^2\big)}. \qquad (13a)$$

The asymptotic variance σ²_{(θ)2} is the inverse of J_{2n}(θ) and is expressed as

$$\sigma^2_{(\theta)2} = \frac{\chi_2}{n}\left(1 - \frac{\gamma_3^2}{2+\gamma_4}\right). \qquad (14)$$

Thus, the coefficient of variance reduction

$$g(\theta)_2 = 1 - \frac{\gamma_3^2}{2+\gamma_4} \qquad (15)$$

is a function of the cumulant coefficients of skewness and kurtosis and does not depend on the value of χ_2 or the sample size n. Formula (15) can also be written in terms of central moments as

$$g(\theta)_2 = 1 - \frac{\mu_3^2}{\big(\mu_4 - \mu_2^2\big)\,\mu_2}. \qquad (15a)$$

It should be noted that the higher-order cumulant coefficients of random variables are not arbitrary values, because their combination has a restricted domain of admissible values [2]. For example, for random variables whose probabilistic properties are given by cumulant coefficients of the 3rd and 4th order, the domain of admissible values of these parameters is limited by the inequality γ_4 ≥ γ_3² − 2. Taking this inequality into account, the analysis of (15) shows that the coefficient of variance reduction g(θ)_2 belongs to the interval [0, 1].
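As a small numerical illustration of (15) (an addition of ours, not part of the original text), the value g(θ)_2 = 0.5 obtained below for exponentially distributed errors agrees with the theoretical entry for the exponential case in Table 1.

```python
import numpy as np

def g2(gamma3, gamma4):
    """Variance-reduction coefficient g(theta)_2 of eq. (15)."""
    return 1.0 - gamma3**2 / (2.0 + gamma4)

# Exponential errors: gamma3 = 2, gamma4 = 6  ->  g2 = 0.5,
# i.e. the PMM estimate has half the variance of the arithmetic mean.
print(g2(2.0, 6.0))
```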

Figure 1 shows graphs built with this inequality; they visualize the dependence of the coefficient g(θ)_2 on the cumulant coefficients.

Fig. 1. Coefficient of variance reduction g(θ)_2 as a function of the cumulant coefficients γ_3 and γ_4.

These graphs are constructed as sections of the two-variable function (15), i.e. as functions of one parameter (the skewness γ_3) for fixed values of the other (the kurtosis γ_4). One can see that the variance of the polynomial estimates decreases significantly with increasing asymmetry of the distribution (absolute value of the skewness coefficient γ_3) and tends asymptotically to zero at the boundary of the domain of admissible values.

6. The polynomial estimation of samples from asymmetrical pdf-s

Based on the above considerations, a software package has been developed in the MATLAB/OCTAVE environment. It makes it possible to perform statistical modeling of the polynomial estimation of parameters for asymmetrically distributed measurement observations based on higher-order statistics. The package performs repeated trials (in the sense of the Monte Carlo method), gives a comparative analysis of the accuracy of various statistical estimation algorithms, and allows the probabilistic properties of the polynomial estimates to be explored. The experimental value of the coefficient of variance reduction g(θ)_2 of (12) is used as the criterion of comparative effectiveness. It is calculated from repeated experiments with the same baseline model parameters of the observations. The estimate ĝ(θ)_2 is formed as the ratio of the empirical variance s²_{(θ)2} of the parameter estimates (found by the PMM with the polynomial of 2nd order) to the variance s²_{(θ)1} of the linear estimates (6) of this parameter. It should be noted that the credibility of the simulation results for the statistical estimation algorithms is influenced by two factors: the size n of the input vector x (which contains the values of the estimated parameter) and the number of

experiments m that are carried out with the same initial conditions (the values of skewness and kurtosis describing the probabilistic character of the model). The set of results obtained with the Monte Carlo method is given in Table 1.

Table 1. Results from the Monte Carlo simulation of parameter estimation.

Distribution (parameters)     | γ3   | γ4   | g(θ)2 theor. | ĝ(θ)2 = s²(θ)2/s²(θ)1: n = 20 | n = 50 | n = 200
Gamma (shape 0.5, scale 1)    | 2.83 | 12   | 0.43         | 0.47 | 0.46 | 0.43
Gamma, shape 1 (Exponential)  | 2    | 6    | 0.5          | 0.58 | 0.52 | 0.5
Gamma (shape 2)               | 1.41 | 3    | 0.6          | 0.63 | 0.61 | 0.6
Gamma (shape 4)               | 1    | 1.5  | 0.71         | 0.74 | 0.72 | 0.71
Lognormal (σ² = 0.1, μ = 1)   | 1    | 1.86 | 0.74         | 0.76 | 0.75 | 0.74
Weibull (a = 1, b = 2)        | 0.63 | 0.25 | 0.82         | 0.84 | 0.83 | 0.82
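The kind of comparison reported in Table 1 can be reproduced, for the exponential case, with the short Monte Carlo sketch below; it is an illustration of ours (Python/NumPy), not the MATLAB/OCTAVE package itself, and the variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n, M = 5.0, 50, 10_000            # true value, sample size, number of MC runs
chi2, g3, g4 = 1.0, 2.0, 6.0             # cumulants of centred exponential errors
c = (2.0 + g4) * np.sqrt(chi2) / (2.0 * g3)

est_mm = np.empty(M)
est_pmm2 = np.empty(M)
for j in range(M):
    x = theta + rng.exponential(1.0, size=n) - 1.0      # centred exponential errors
    m1 = x.mean()
    s = np.mean(x**2) - m1**2 - chi2
    est_mm[j] = m1                                       # linear estimate (6)
    est_pmm2[j] = m1 + np.sign(g3) * np.sqrt(max(c**2 - s, 0.0)) - c   # estimate (11)

print(est_pmm2.var() / est_mm.var())     # empirical g, close to the theoretical 0.5
```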

The experimental values of the coefficient of variance reduction were obtained from M = 10^4 runs for various types of asymmetrical distributions of the measurement observations. The calculation of the polynomial estimates (11) of parameter θ does not use a priori information about the type of distribution, but only the values of the cumulant coefficients as model parameters. These values can be obtained from the analytical relations between the parameters of the distribution densities and their moments, from which the corresponding cumulants can be calculated. For practical situations where information on the distribution of the population is not available a priori, the required cumulant coefficients are estimated from the sample central moments (m_i → μ_i):

$$\hat{\gamma}_3 = \frac{m_3}{m_2^{3/2}}, \qquad \hat{\gamma}_4 = \frac{m_4}{m_2^{2}} - 3, \qquad (16)$$

where the i-th order central moment m_i of the sample is

$$m_i = \frac{1}{k}\sum_{v=1}^{k}\big(x_v - \bar{x}\big)^{i}. \qquad (17)$$
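For completeness, a minimal sketch of the empirical estimates (16)–(17) is given below (our own illustration, with an arbitrary function name).

```python
import numpy as np

def sample_cumulant_coeffs(x):
    """Skewness and excess-kurtosis estimates (16) from sample central moments (17)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2, m3, m4 = np.mean(d**2), np.mean(d**3), np.mean(d**4)
    return m3 / m2**1.5, m4 / m2**2 - 3.0
```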

The analysis of the data given in Table 1 shows good agreement between the results of the analytical calculations and the statistical modelling. It is evident that, with an increase of the initial sample size n, the discrepancy between the theoretical and experimental values of the variance reduction coefficient decreases (e.g., for n = 20 the difference does not exceed 15%, and for n = 50 it is less than 5%). Generally, these results confirm the asymptotic property (4), which is characteristic of the amount of extracted information J_{rn}(θ) about the estimated parameter. This quantity is given by formula (3) and is used in the calculation of the variance of the polynomial estimates obtained as solutions of equations of the general form (1). Another important result of the statistical modelling is the test of the assumption that, with increasing n, the distribution of the polynomial estimates of parameter θ calculated by formula (11) asymptotically approaches the Gaussian law. The validity of the Gaussian distribution of the polynomial estimates was investigated using the Lilliefors test, which is based on the Kolmogorov-Smirnov statistic [16]. Table 2 presents the test results as a series of output parameters of the Lilliefors test.

Table 2. Results of testing the adequacy of the Gaussian distribution model for the linear (r = 1) and polynomial (r = 2) estimates on the basis of the Lilliefors test (LSTAT values).

Distribution                  | n = 20: r = 1 | r = 2 | n = 50: r = 1 | r = 2 | n = 200: r = 1 | r = 2
Gamma (shape 0.5)             | 0.045 | 0.036 | 0.028 | 0.021 | 0.018 | 0.009
Exponential (Gamma, shape 1)  | 0.034 | 0.027 | 0.023 | 0.013 | 0.011 | 0.008
Gamma (shape 2)               | 0.021 | 0.017 | 0.013 | 0.012 | 0.009 | 0.007
Gamma (shape 4)               | 0.02  | 0.017 | 0.012 | 0.011 | 0.008 | 0.007
Lognormal (σ² = 0.1, μ = 1)   | 0.016 | 0.014 | 0.012 | 0.011 | 0.008 | 0.007
Weibull (a = 1, b = 2)        | 0.013 | 0.017 | 0.011 | 0.011 | 0.006 | 0.004
Critical value CV = 0.009.

LSTAT is the sample value of the test statistic and CV is its critical value. If LSTAT < CV, the null (Gaussian) hypothesis is not rejected at the given significance level.

The results given in Table 2 were obtained for different types of measurement error distributions and sample sizes n, at a fixed significance level 0.05 of the null (Gaussian) hypothesis and M = 10^4 experiments. In addition, Figure 2 presents some examples of the simulation results. These plots show the distributions of the experimental values of the linear (r = 1) and polynomial (r = 2) estimates of θ̂. In these examples the input data comprise M = 10^4 samples of different sizes (n = 20, 50, 200) containing measurement results of parameter θ in the additive error model based on a random variable with a lognormal distribution.

Fig. 2. Gaussian probability plots approximating the experimental values of the measured parameter estimates in the lognormal distribution model of errors: a) r = 1; b) r = 2.

Under the specified conditions of normalization, the analytical expressions obtained in Section 5 allow the expanded uncertainty of the measurement results to be calculated. To do this, one must have a priori information not only about the type of distribution, but also about the values of a limited number of moments or cumulants describing the probabilistic properties of the measurement observations. In addition, the additivity of the cumulant description makes it possible to take into account, in a straightforward way, errors generated by multiple sources with various probabilistic properties.

7. Estimates obtained with the 3rd-degree polynomial

A polynomial PMM estimation of measurand parameters for samples of non-Gaussian symmetrically distributed data was described by the authors in [15]. In accordance with the PMM, using polynomials of degree r = 3, the estimate θ̂ is a solution of the following equation, written in moments as (18) or in cumulants as (18a):

h1  m1  1    h2  m2   2    h3  m3   3    ˆ  0 .







 

h1   xv     h2   xv2   2   2  h3   xv3   3  3 2 n

n

v 1

v 1

n

v 1



 ˆ

(18)

 0 (18a)

Where h1   , h2   , h3   optimal coefficients for order of polynomial r  3 .

394

These coefficients minimize the variance of the searched estimate θ̂ of parameter θ. They were found in paper [15] by solving a system of three linear equations of the form (2). Using the expressions for the optimal PMM coefficients in (18) (for the symmetrically distributed model) and the general formula (3), we obtain the amount of information about the estimated parameter θ extracted from a sample of size n using stochastic polynomials of degree r = 3:

$$J_{3n}(\theta) = n\,\chi_2^{2}\,\frac{12 + 24\gamma_4 + 9\gamma_4^{2} + 2\gamma_6 + \gamma_4\gamma_6}{\Delta_3}, \qquad (19)$$

where Δ_3 = χ_2³ (2 + γ_4)(6 + 9γ_4 − γ_4² + γ_6).

The asymptotic variance σ²_{(θ)3} is the inverse of J_{3n}(θ):

$$\sigma^2_{(\theta)3} = \frac{\chi_2}{n}\left(1 - \frac{\gamma_4^{2}}{6 + 9\gamma_4 + \gamma_6}\right). \qquad (20)$$

Thus, the coefficient of variance reduction

$$g(\theta)_3 = 1 - \frac{\gamma_4^{2}}{6 + 9\gamma_4 + \gamma_6} \qquad (21)$$

is a function of the higher-order cumulant coefficients γ_4 and γ_6 and does not depend on the value of χ_2 or the sample size n. For symmetrically distributed random variables, the domain of admissible values of the 4th- and 6th-order cumulant coefficients is limited by two inequalities [2]:

γ_4 ≥ −2  and  γ_6 + 9γ_4 + 6 ≥ γ_4².

With these inequalities, the analysis of (21) shows that the dimensionless coefficient of variance reduction g(θ)_3 has the range [0, 1]. Figure 3 shows graphs built with the above limitations: the coefficient g(θ)_3 is plotted as a function (21) of one of the cumulant coefficients γ_4 or γ_6 for three fixed values of the other. The reduction of the variance of the polynomial estimates θ̂_3 becomes very large — g(θ)_3 tends asymptotically to zero — when the cumulant coefficients approach the left-side (parabolic) boundary of the admissible region. No reduction of the estimate variance is obtained only when the kurtosis coefficient is zero; in all other cases (γ_4 ≠ 0) the parameter θ found by the PMM has a smaller uncertainty than its linear estimate.
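A quick numerical check of (21) is sketched below (our own illustration), assuming the exact 6th-order cumulant coefficient of the uniform distribution, γ_6 = 48/7 ≈ 6.86; the result agrees with the theoretical value 0.3 listed for the uniform case in Table 3.

```python
def g3(gamma4, gamma6):
    """Variance-reduction coefficient g(theta)_3 of eq. (21)."""
    return 1.0 - gamma4**2 / (6.0 + 9.0 * gamma4 + gamma6)

# Uniform errors: gamma4 = -1.2, gamma6 = 48/7  ->  g3 ~ 0.30
print(g3(-1.2, 48 / 7))
```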

Fig. 3. Coefficient of variance reduction g(θ)_3 as a function of the cumulant coefficients γ_4 and γ_6: a), b) sections for fixed values of one of the coefficients.

The size of the gain is determined solely by the values of the higher-order cumulant coefficients. To confirm the effectiveness of the proposed approach, the results of Monte Carlo statistical modelling are presented and final conclusions are given below.

8. Statistical simulation of PMM estimation by 3rd-degree polynomial

The data obtained by the MC simulations are given in Table 3.

Table 3. Results from the Monte Carlo simulation of parameter estimation.

Distribution             | γ4   | γ6  | g(θ)3 = σ²(θ)3/σ²(θ)1 theor. | ĝ_a(θ)3 = s²(θ)3/s²(θ)1: n = 20 | n = 50 | n = 200
Arcsine                  | -1.3 | 8.2 | 0.2  | 0.27 | 0.22 | 0.21
Uniform                  | -1.2 | 6.9 | 0.3  | 0.41 | 0.33 | 0.31
Trapezoidal (β = 0.75)   | -1.1 | 6.4 | 0.36 | 0.47 | 0.39 | 0.37
Trapezoidal (β = 0.5)    | -1   | 5   | 0.55 | 0.63 | 0.58 | 0.55
Trapezoidal (β = 0.25)   | -0.7 | 2.9 | 0.76 | 0.82 | 0.78 | 0.77
Triangular               | -0.6 | 1.7 | 0.84 | 0.89 | 0.86 | 0.85

The analysis of the results from Table 3 shows good agreement between the analytical calculations and the statistical modelling. From the MC simulations it is clear that, even for small samples of n = 20, the PMM3 estimation gives a significantly lower standard deviation s_{(θ)3} than the standard uncertainty u_A = s_{(θ)1} calculated according to the GUM [1]. For the arcsine pdf s_{(θ)3} ≈ 0.52 u_A and for the uniform pdf ≈ 0.64 u_A; for the triangular pdf, which is close to the normal pdf, only s_{(θ)3} ≈ 0.94 u_A is obtained. With the increase of the initial sample size n, the discrepancy between the theoretical and experimental values of the reduction coefficient ĝ_a(θ)_3 decreases (e.g., for n = 50 the difference does not exceed 10% and for n = 200 it is less than 5%). Another important result is the confirmation that, with increasing n, the distribution of the polynomial estimates of parameter θ, calculated according to the Cardano formulas [15], asymptotically approaches the Gaussian law. As an example, Figure 4 shows the distributions of the experimental values of the linear (r = 1) and polynomial (r = 3) estimates of θ̂. In this case the input data comprise M = 10^4 samples of size n = 50, containing the measurement results of parameter θ with an additive error model described by a centred random variable with a trapezoidal distribution (ratio of the shorter upper to the longer bottom base β = 0.5). The graphs presented in Figure 4 show that the results of this experiment conform closely to the Gaussian model.

Fig. 4. Example of experimental estimates of the measured parameter with a trapezoidal distribution (β = 0.5) of observations: a) histograms and Gaussian approximation of the PDF, b) empirical distribution function and Gaussian probability plots.

In addition, the validity of the Gaussian distribution of the polynomial estimates was investigated using the Lilliefors test [16], based on the Kolmogorov-Smirnov statistic. The results given in Table 4 were obtained for different types of measurement distributions and three sample sizes n, at a fixed significance level 0.05 of the null (Gaussian) hypothesis and M = 10^4 experiments.

Table 4. Results of testing the adequacy of the Gaussian distribution model of the polynomial estimates (r = 3) on the basis of the Lilliefors test (LSTAT values).

Distribution             | n = 20 | n = 50 | n = 200
Arcsine                  | 0.03   | 0.01   | 0.007
Uniform                  | 0.02   | 0.01   | 0.007
Trapezoidal (β = 0.75)   | 0.02   | 0.008  | 0.006
Trapezoidal (β = 0.5)    | 0.008  | 0.007  | 0.006
Trapezoidal (β = 0.25)   | 0.007  | 0.007  | 0.006
Triangular               | 0.007  | 0.006  | 0.005
Critical value CV = 0.009.

The analysis of these data and other results of the experimental research shows that, for non-Gaussian models of measurement data, the normalization of the distribution of the polynomial estimates is observed only for sufficiently large initial samples, i.e. n ≥ 100, and their expanded uncertainty can then be estimated as for a normal pdf. For a trapezoidal distribution with the base ratio β = 0.75 normalization comes at n ≥ 50. However, for smaller values β ≤ 0.5 and for the triangular distribution, normalization is observed even for small samples of n = 20. Examples of distributions that differ significantly from the Gaussian model are the arcsine and uniform ones, which have large absolute values of the higher-order cumulant coefficients. It should be noted that, under the specified conditions of normalization, the analytical expressions obtained in Sections 5 and 7, which describe the accuracy properties of the polynomial estimates, allow the expanded uncertainty of the measurement results to be calculated. To do this, one must have a priori information, not about the type of distribution, but only about the values of a limited number of cumulants describing the probabilistic properties of the measurement errors. In addition, the additivity of the cumulant description makes it possible to take into account, in a simple way, errors generated by multiple sources with various probabilistic properties.

9. Conclusions

The analysis of the theoretical and experimental results together allows the general conclusion that the apparatus of stochastic polynomials can be used to construct algorithms for finding nonlinear estimates of the measured parameter in the presence of asymmetrically distributed errors and also of non-Gaussian symmetrically distributed errors, described by models based on moments and cumulants. The research has shown that the polynomial estimate synthesized for r = 2 is characterized by an essentially higher accuracy in comparison with the linear estimate (the arithmetic average), as well as by a faster normalization of its distribution. The reduction of the estimated uncertainty is determined by the degree of non-Gaussianity of the measurement error distribution, expressed numerically by the absolute values of the cumulant coefficients of skewness and kurtosis. The polynomial estimates synthesized for order r = 3 are likewise characterized by a higher accuracy compared with the linear estimate (arithmetic mean). The improvement is determined by the level of non-Gaussianity, again expressed numerically by the absolute values of the coefficients of skewness, kurtosis and the higher-order cumulants. From the MC simulations it is clear that even for small samples, e.g. n = 20, a significantly lower standard deviation s is obtained than the standard uncertainty u_A calculated according to the GUM [1]. The decrease of the standard uncertainty (variance) of the estimates is achieved by taking into account additional information about the probabilistic properties of the measurement errors, given by the values of a specified number of cumulants. Estimating these cumulants, however, seems to be a much simpler task than selecting, and testing the adequacy of, a distribution law chosen to approximate the dispersion of the measurement errors and to describe the uncertainties.

Among the many possible directions of further research one should mention the following:
• analysis of the dependence between the accuracy of determining the parameters of the non-Gaussian model and the stability of the polynomial estimation of the measured parameter;
• synthesis and analysis of recurrent algorithms for PMM estimation of the measurand parameters;
• comparison of the expanded uncertainty obtained by the PMM with that of other methods, e.g. selecting for the sample, after bootstrap resampling, the best (minimum SD) of a few single- and double-element estimators of the measured value [17], or the maximum entropy method based on moments [18].

References

1. Guide to the Expression of Uncertainty in Measurement, GUM (2008), with Supplement 1: Evaluation of measurement data – Propagation of distributions using a Monte Carlo method, JCGM 101:2008. OIML, Geneva, Switzerland.
2. Cramer H. (2016), Mathematical Methods of Statistics (PMS-9) (Vol. 9). Princeton University Press.
3. Schmelling M. (2000), Averaging Measurements with Hidden Correlations and Asymmetric Errors. Mpi 1(1), p. 18. http://arxiv.org/abs/hep-ex/0006004
4. Barlow R. (2004), Asymmetric Statistical Errors. arXiv, p. 14. http://arxiv.org/abs/physics/0406120
5. Danilov A., Shumarova S. A. (2013), On the asymmetry of the probability density function of the error of the results of measurements obtained by means of the complex measurement channels of measurement systems. Measurement Techniques, 55(11), pp. 1316–1318.
6. Levin S. F. (2005), The Identification of Probability Distributions. Measurement Techniques, 48(2), pp. 101–111.
7. Mendel J. M. (1991), Tutorial on higher-order statistics (spectra) in signal processing and system theory: theoretical results and some applications. Proc. IEEE, 79(3), pp. 278–305.
8. Cumulant. Wolfram MathWorld, http://mathworld.wolfram.com/Cumulant.html
9. De Carlo L. T. (1997), On the meaning and use of kurtosis. Psychological Methods, 2(3), pp. 292–307.
10. Beregun V. S., Garmash O. V., Krasilnikov A. I. (2014), Mean square error of estimates of cumulative coefficients of the fifth and sixth order. Electronic Modelling, 36(1), pp. 17–28.
11. Kunchenko Y. (2002), Polynomial Parameter Estimations of Close to Gaussian Random Variables. Aachen: Shaker Verlag.
12. Chertov O., Slipets T. (2014), Epileptic Seizures Diagnose Using Kunchenko's Polynomials Template Matching, in Fontes M., Günther M., Marheineke N. (eds), Progress in Industrial Mathematics at ECMI 2012. Springer International Publishing, pp. 245–248. doi: 10.1007/978-3-319-05365-3_33.
13. Zabolotnii S. V., Warsza Z. L. (2015), Semi-parametric polynomial method for retrospective estimation of the change-point of parameters of non-Gaussian sequences. In: F. Pavese et al. (eds), Advanced Mathematical and Computational Tools in Metrology and Testing X, Series on Advances in Mathematics for Applied Sciences, vol. 86, World Scientific, Singapore, ISBN: 978-981-4678-61-2, pp. 400–408.
14. Palahin V., Juhár J. (2016), Joint Signal Parameter Estimation in Non-Gaussian Noise by the Method of Polynomial Maximization. Journal of Electrical Engineering, 67(3), pp. 217–221. doi: 10.1515/jee-2016-0031.
15. Zabolotnii S. V., Warsza Z. L. (2017), A polynomial estimation of measurand parameters for samples of non-Gaussian symmetrically distributed data. In: R. Szewczyk et al. (eds), Automation 2017: Innovations in Automation, Robotics and Measurement Techniques, Advances in Intelligent Systems and Computing, vol. 550, Springer International Publishing, pp. 468–480. doi: 10.1007/978-3-319-54042-9-45.
16. Lilliefors H. W. (1967), On the Kolmogorov-Smirnov test for normality with mean and variance unknown. Journal of the American Statistical Association, 62(318), pp. 399–402.
17. Warsza Z. L. (2012), Effective Measurand Estimators for Samples of Trapezoidal PDFs. Journal of Automation, Mobile Robotics and Intelligent Systems (JAMRIS), 6(1), pp. 35–41.
18. Rajan A., Kuang Y. C., Ooi M. P. L., Demidenko S. N. (2017), Moments and maximum entropy method for expanded uncertainty estimation in measurements. In: Instrumentation and Measurement Technology Conference (I2MTC), 2017 IEEE International, pp. 1–6.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 401–408)

EIV calibration model of thermocouples Gejza Wimmer Mathematical Institute, Slovak Academy of Sciences, Bratislava, Slovakia Faculty of Natural Sciences, Matej Bel University, Bansk´ a Bystrica, Slovakia E-mail: [email protected] ˇ s and Rudolf Palenˇ Stanislav Duriˇ c´ ar Faculty of Mechanical Engineering, Slovak University of Technology, Bratislava, Slovakia E-mail: [email protected], [email protected] Viktor Witkovsk´ y Institute of Measurement Science, Slovak Academy of Sciences, Bratislava, Slovakia E-mail: [email protected] A new statistical procedure for evaluating the calibration of thermocouples is presented. This procedure describes the calibration model as a multivariate errors-in-variables (EIV) model which is linearized and the resulting solution is found by iterations. Keywords: Errors-in-variables model; calibration; continuous temperature scale; uncertainty evaluation.

1. Linear regression model with type-II constraints

Here we consider the problem of comparative calibration of type S thermocouples, according to ITS-90, using the errors-in-variables (EIV) approach [1], represented here by a linear regression model with type-II constraints, see [2], which is also known as the incomplete direct measurements model with conditions. Let X = (X_1, ..., X_n)′ be a vector of measurements with mean value E(X) = β_1 (β_1 denotes a vector of unknown parameters) and a known non-singular covariance matrix cov(X) = Σ_1. Further, let Y = (Y_1, ..., Y_n)′ be another vector of measurements with mean value E(Y) = β_2 (β_2 denotes another vector of unknown parameters) and a known non-singular covariance matrix cov(Y) = Σ_2. In matrix notation, we have a statistical model of direct measurements,

$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim \left( \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}, \begin{pmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{pmatrix} \right). \qquad (1)$$

We add into this model, however, also another (m×1)-vector of parameters (the parameters of primary interest), say γ, such that the following linear system of conditions holds true,

$$B_1 \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} + B_2\,\gamma + b = 0, \qquad (2)$$

with known (q×1)-vector b, known (q×2n)-matrix B_1, and known (q×m)-matrix B_2, such that rank([B_1, B_2]) = q < 2n and rank(B_2) = m < q. We say that the measurements (X′, Y′)′ follow the regular linear regression model with type-II constraints, i.e. the model (1)–(2). Let us denote β = (β_1′, β_2′)′ and Σ = diag(Σ_1, Σ_2). Then, according to [2] and [3], the best linear unbiased estimator (BLUE) of the vector parameter (β′, γ′)′ in the model (1)–(2) is given by

$$\begin{pmatrix} \hat{\beta} \\ \hat{\gamma} \end{pmatrix} = \begin{pmatrix} I - \Sigma B_1' Q_{11} B_1 \\ -Q_{21} B_1 \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix} - \begin{pmatrix} \Sigma B_1' Q_{11} \\ Q_{21} \end{pmatrix} b, \qquad (3)$$

where

$$\begin{pmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{pmatrix} = \begin{pmatrix} B_1 \Sigma B_1' & B_2 \\ B_2' & 0 \end{pmatrix}^{-1}.$$

Further, the covariance matrix of the BLUE (β̂′, γ̂′)′ is

$$\operatorname{cov}\begin{pmatrix} \hat{\beta} \\ \hat{\gamma} \end{pmatrix} = \begin{pmatrix} \Sigma - \Sigma B_1' Q_{11} B_1 \Sigma & -\Sigma B_1' Q_{12} \\ -Q_{21} B_1 \Sigma & -Q_{22} \end{pmatrix}. \qquad (4)$$

2. Calibration of a thermocouple

We consider the following calibration of thermocouples. The calibration is carried out by comparing the type S thermocouple under test against a standard type S thermocouple calibrated at defined fixed points according to ITS-90. The calibration is represented as a curve fitted to the measured values of the deviation e_i − E_ref(t_i), generally assumed to be a function (a polynomial of degree 4) of temperature (t_i represents the true value of the measurand in units of the standard thermocouple and e_i the true value of the measurand in units of the calibrated thermocouple). The reference function is

$$E_{\mathrm{ref}}(t_i) = \sum_{k=0}^{8} b_k\,t_i^{k}, \qquad i = 1, 2, \dots, n. \qquad (5)$$

The coefficients b0 , b1 , . . . , b8 are given by Reference standard IEC 584.2. The theoretical measurement model for standard thermocouple is Ti = tSi + cSi δESIH + cSi δESRV + cSi δESK + cSi δESD + δtS + δtSD + δtSR0 + δtSF ,

(6)

where the considered quantities are • tSi — i-th temperature measured by the standard thermocouple, i = 1, 2, . . . , n; realization of a normally distributed random variable with known standard uncertainty Type A uA,tSi ; • δESIH — correction linked to the reading of voltmeter (in connection with the standard thermocouple); uniformly distributed random variable with zero mean and known standard uncertainty Type B uB,δESIH ; • δESRV — correction linked to the resolution of the voltmeter (in connection with the standard thermocouple); uniformly distributed random variable with zero mean and known standard uncertainty Type B uB,δESRV ; • δESK — correction obtained from the voltmeter calibration (in connection with the standard thermocouple); normally distributed random variable with zero mean and known standard uncertainty Type B uB,δESK ; • δESD — correction linked to the drift of voltmeter (in connection with the standard thermocouple); normally distributed random variable with zero mean and known standard uncertainty Type B uB,δESD ; • δtS — correction due to the calibration of the standard thermocouple; normally distributed random variable with zero mean and known standard uncertainty Type B uB,δtS ; • δtSD — correction linked to the drift of the standard thermocouple; normally distributed random variable with zero mean and known standard uncertainty Type B uB,δtSD ;


• δtSR0 — correction due to the deviation of the ice bath temperature at 0 ◦ C (in connection with the standard thermocouple); normally distributed random variable with zero mean and known standard uncertainty Type B uB,δtSR0 ; • δtSF — correction linked to the non-uniformity of the temperature profile (in connection with the standard thermocouple); uniformly distributed random variable with zero mean and known standard uncertainty Type B uB,δtSF . The constants cSi , i = 1, 2, . . . , n are known constants. The theoretical measurement model for the calibrated thermocouple is Ei = eKi + δEKIH + δEKRV + δEKK + δEKD + δEKC + δEKN + cKi δtKR0 + cKi δtKF + cKi δtRF ,

(7)

where the considered quantities are • eKi — the emf value on the calibrated thermocouple at i−th measured (errorless) temperature; realization of a normally distributed random variable with known standard uncertainty Type A uA,eKi ; • δEKIH — correction linked to the reading of voltmeter (in connection with the calibrated thermocouple); uniformly distributed random variable with known standard uncertainty Type B uB,δEKIH ; • δEKRV — correction linked to the resolution of the voltmeter (in connection with the calibrated thermocouple); uniformly distributed random variable with known standard uncertainty Type B uB,δEKRV ; • δEKK — correction obtained from the voltmeter calibration (in connection with the calibrated thermocouple); normally distributed random variable with known standard uncertainty Type B uB,δEKK ; • δEKD — correction linked to the drift of the voltmeter (in connection with the calibrated thermocouple); normally distributed random variable with known standard uncertainty Type B uB,δEKD ; • δEKC — correction due to extension cable (constant); uniformly distributed random variable with known standard uncertainty Type B uB,δEKC ; • δEKN — correction due to inhomogeneity of the thermocouple wires; uniformly distributed random variable with known standard uncertainty Type B uB,δEKN ;


• δtKR0 — correction due to the deviation of the ice bath temperature at 0 ◦ C (in connection with the calibrated thermocouple); normally distributed random variable with known standard uncertainty Type B uB,δtKR0 ; • δtKF — correction linked to the non-uniformity of the temperature profile (in connection with the calibrated thermocouple); normally distributed random variable with known standard uncertainty Type B uB,δtKF ; • δtRF — error of the reference function; normally distributed random variable with known standard uncertainty Type B uB,δtRF . The constants cKi , i = 1, 2, . . . , n are known constants. 3. EIV model as mathematical-statistical model of linear calibration Let Ti and Ei be the random variables representing our current state-ofknowledge (probability distributions) about the values which could be reasonably attributed to the measurand at the i-th calibration point, say ti and ei , based on measurements by two measuring devices (here thermocouples) and using the expert judgment (ti represents the value of measurand in units of the standard thermocouple and ei represents the value of measurand in units of the calibrated thermocouple). Further, let tSi = E(Ti ) and eKi = E(Ei ) denote the best estimates [4] of the measurand at the i-th experimental point, and ξi and ηi represent the random variables with a known centered (zero-mean) probability distribution based on our current state-of-knowledge, such that Ti = tSi + ξi and Ei = eKi + ηi , i = 1, . . . , n. The assumed functional relationship, namely ei − Eref (ti ) = a0 + a1 ti + a2 t2i + a3 t3i + a4 t4i ,

(8)

brings additional information in the form of constraints, which should be used to improve our knowledge about the measurand values t_i and e_i, and consequently about the calibration curve parameters a_0, a_1, a_2, a_3 and a_4. Let ξ_t and η_e represent the random variables modeling the (real) measurement process, with observed values (realizations) ξ_t^(real) = (t_S1, ..., t_Sn)′ and η_e^(real) = (e_K1, ..., e_Kn)′,

$$\xi_t = \begin{pmatrix}\xi_{t_1}\\ \xi_{t_2}\\ \vdots\\ \xi_{t_n}\end{pmatrix} = \begin{pmatrix}t_1\\ t_2\\ \vdots\\ t_n\end{pmatrix} + \begin{pmatrix}\xi_1\\ \xi_2\\ \vdots\\ \xi_n\end{pmatrix} = t + \xi, \qquad (9)$$

$$\eta_e = \begin{pmatrix}\eta_{e_1}\\ \eta_{e_2}\\ \vdots\\ \eta_{e_n}\end{pmatrix} = \begin{pmatrix}e_1\\ e_2\\ \vdots\\ e_n\end{pmatrix} + \begin{pmatrix}\eta_1\\ \eta_2\\ \vdots\\ \eta_n\end{pmatrix} = e + \eta, \qquad (10)$$


where the expectations E(ξ t ) = t and E(η e ) = e represent the true unknown values of the measurands, expressed in units of the two measuring devices, and the known covariance matrix is     ξt Σt 0 cov = . (11) ηe 0 Σe Finally, we suppose that the deviation function for the calibrated thermocouple is in the observed temperature region a polynomial of degree 4 (of course, we can choose any reasonable degree of that polynomial) as given by (8), which can be alternatively expressed as ei = a0 + b0 + (a1 + b1 )ti + · · · + (a4 + b4 )t4i + b5 t5i + · · · + b8 t8i (12) = c0 + c1 ti + · · · + c4 t4i + b5 t5i + · · · + b8 t8i , for i = 1, 2, . . . , n, with unknown vector parameter c = (c0 , c1 , c2 , c3 , c4 )′ . Hence, the (unknown) calibration model parameters e, t and c are interconnected with a system of nonlinear conditions. Now, we shall linearize the constraints given in (12) about the values ti = tSi , for all i = 1, 2, . . . , n, and about proper values c(0) = (0) (0) (0) (c0 , c1 , . . . , c4 )′ , and neglect terms of higher order than one including (0) (ti − tSi )(cj − cj ), i = 1, . . . , n, j = 0, . . . , 4. So, we obtain (0)

(0)

(0)

ei = c0  + ci tSi+ · · · + c4 t4Si +b5 t5Si + · · · +b8 t8Si  (0) (0) (0) + c0 − c0 + tSi c1 − c1 + · · · + t4Si c4 − c4  (0) (0) (0) (0) + c1 + 2tSi c2 + 3t2Si c3 + 4t3Si c4 +  + 5t4Si b5 + · · · + 8t7Si b8 (ti − tSi ) , i = 1, . . . , n.

(13)

By denoting (0)

ei (0)

(0)

(0)

(0)

= c0 + c1 tSi + · · · + c4 t4Si + b5 t5Si + · · · + b8 t8Si , (0)

(0)

(0)

(0)

(14) (0)

e(0) = (e0 , e1 , . . . , en )′ , c = (c0 , c1 , . . . , c4 )′ , c(0) = (c0 , c1 , . . . , c4 )′ ,

407

tS = (tS1 , . . . , tSn )′ , ∆e(0) = e − e(0) , ∆t(0) = t − tS , ∆c(0) = c − c(0) ,

(0)

B2

 1 tS1 1 tS2  = 

 t2S1 t3S1 t4S1 t2S2 t3S2 t4S2    ..  .

(15)

1 tSn t2Sn t3Sn t4Sn

(0)

and B1 the diagonal (n × n)-matrix with the (i, i)-th diagonal element n o (0) (0) (0) (0) (0) B1 = c1 + 2c2 tSi + 3c3 t2Si + 4c4 t3Si + 5t4Si b5 + · · · + 8t7Si b8 i,i

for i = 1, . . . , n, the constraints (13) can be rewritten in matrix form as    . (0) ∆e(0) (0) + B2 ∆c(0) = 0. (16) −I .. B1 ∆t(0) The shifted measurements, ξ t − tS ,

and

η e − e(0) ,

(17)

(0) (0) (0) with  themeans E(ξ t −tS ) = ∆t , E(η e −e ) = ∆e , covariance matrix Σt 0 , and with linear restrictions (16) on parameters, meet the regular 0 Σe linear regression model with type-II constraints. \ \ (0) , ∆t (0) , and ∆c, d together The BLUE of the unknown parameters ∆e with the is covariance matrix of the estimators, is obtained using results of Section 1. It remains to stipulate location of the vector c(0) . Proposed is   P8 eK1 − u=5 bu tuS1 P8    −1 eK2 − u=5 bu tuS2  (0) ′ (0) (0) ′  (0) . c = B2 B2 B2  ..   .   P8 eKn − u=5 bu tuSn

\ \ (0) , t(1) = t + ∆t (0) and Now we proceed iteratively. Put c(1) = c(0) + ∆c S (1) (1) obtain B2 , B1 , etc. Usually after r iterations (our experience shows that r is about 4-5) the process meets the required convergence criteria, and we get the final (r) and the covariance matrix cov(ˆ ˆ = cd estimates c c). Then the required estimates of the calibration curve coefficients ai are derived from the BLUEs, aˆi = cˆi − bi for i = 0, 1, . . . , 4.

408

4. Conclusion We have suggested a new approach for obtaining the locally best linear unbiased estimators (BLUEs) of the calibration curve parameters, fitted to the deviations ei −Eref (ti ) measured by the calibrated thermocouple. If the random variables ξ1 , . . . , ξn , η1 , . . . , ηn are independent and fully specified, the marginal state-of-knowledge distributions about the parameters ai can be derived, e.g., by using the Monte Carlo methods or by computing the numerical inversion of its characteristic function, as suggested in [5, 6]. Otherwise, alternative methods should be used, see e.g. [7]. Acknowledgements The work was supported by the Slovak Research and Development Agency, projects APVV-15-0295 and APVV-15-0164, and the Scientific Grant Agency of the Ministry of the Ministry of Education of the Slovak Republic and the Slovak Academy of Sciences, projects VEGA 2/0054/18, VEGA 2/0011/16, and VEGA 1/0748/15. References [1] G. Casella and R. L. Berger, Statistical Inference (Duxbury Pacific Grove, CA, 2002). [2] L. Kub´ aˇcek, Foundations of Estimation Theory (Elsevier, 2012). [3] E. Fiˇserov´a, L. Kub´ aˇcek and P. Kunderov´a, Linear Statistical Models: Regularity and Singularities (Academia, 2007). [4] JCGM100:2008, Evaluation of measurement data – Guide to the expression of uncertainty in measurement (GUM 1995 with minor corrections), in JCGM - Joint Committee for Guides in Metrology, (ISO, BIPM, IEC, IFCC, ILAC, IUPAC, IUPAP and OIML, 2008). [5] V. Witkovsk´ y, Numerical inversion of a characteristic function: An alternative tool to form the probability distribution of output quantity in linear measurement models, Acta IMEKO 5, 32 (2016). [6] V. Witkovsk´ y, CharFunTool: The characteristic functions toolbox for MATLAB (2017), https://github.com/witkovsky/CharFunTool. ˇ sov´a, S. Duriˇ ˇ s, R. Palenˇc´ar [7] V. Witkovsk´ y, V. G. Wimmer, Z. Duriˇ and J. Palenˇc´ar, Modeling and evaluating the distribution of the output quantity in measurement models with copula dependent input quantities, in International Conference on Advanced Mathematical and Computational Tools in Metrology and Testing XI, (University of Strathclyde, Glasgow, Scotland, 2017).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 409–416)

Modeling and evaluating the distribution of the output quantity in measurement models with copula dependent input quantities Viktor Witkovsk´ y Institute of Measurement Science, Slovak Academy of Sciences, Bratislava, Slovakia E-mail: [email protected] Gejza Wimmer Mathematical Institute, Slovak Academy of Sciences, Bratislava, Slovakia Faculty of Natural Sciences, Matej Bel University, Bansk´ a Bystrica, Slovakia E-mail: [email protected] ˇ sov´ Zuzana Duriˇ a Slovak Institute of Metrology, Bratislava, Slovakia E-mail: [email protected] ˇ s, Rudolf Palenˇ Stanislav Duriˇ c´ ar and Jakub Palenˇ c´ ar Faculty of Mechanical Engineering, Slovak University of Technology, Bratislava, Slovakia E-mail: [email protected], [email protected], [email protected] Proper uncertainty analysis in metrology frequently requires using model with a nonlinear measurement equation Y = f (X), where X = (X1 , . . . , XN ), and non-trivial joint multivariate distribution of the input quantities X. This is exactly the situation suitable for application of the approach suggested in the GUM Supplements 1 and 2. In this paper we emphasize the advantage of using the Characteristic Function Approach for sampling from the distribution of the output quantity Y , based on the joint copula-type distribution of the inputs X, with given convolution-type marginal distributions of the input quantities Xi specified by using and combining the Type A and/or Type B evaluation methods. Keywords: Copula; Monte Carlo Method; Characteristic Function Approach.

1. Introduction Based on the GUM 1 and its Supplements 2,3 , evaluation of the measurement uncertainty should be based on a correct measurement model specification 409

410

and the state-of-knowledge distribution of the input quantities. Proper evaluation of the measurement uncertainty and/or the coverage intervals typically requires evaluation of the probability density function (PDF), the cumulative distribution function (CDF) and/or the quantile function (QF) of a random variable Y associated with the measurand (here, the random variable Y represents distribution of possible values attributed to the measurand, based on current knowledge). A mathematical model of measurement of a single (scalar) quantity can be expressed as a functional relationship, Y = f (X),

(1)

where Y is the scalar output quantity and X represents the vector of N input quantities (X1 , . . . , XN ). Here, each Xi is regarded as a random variable with possible values ξi , reasonably attributed to the i-th input quantity based on current knowledge, and Y is a random variable with possible values η, consequently attributed to the measurand. The joint PDF for X is denoted by gX (ξ) and CDF is denoted by GX (ξ), where ξ = (ξ1 , . . . , ξN ) is a vector variable describing the possible values of the vector quantity X. PDF for Y is denoted by gY (η) and the CDF by GY (η). Marginal distribution functions (PDFs/CDFs) for Xi are denoted by gXi (ξi ) and GXi (ξi ), respectively. Frequently, it is adequate to assume that the functional relationship Y = f (X) is linear in Xi and the input random variables Xi are mutually independent, leading to the convolution-type distribution of Y . However, this is a strong assumption which could be inadequate in many important situations, and usage of the nonlinear functional relation f and/or the non-trivial joint multivariate distribution of X becomes an indispensable requirement. This is exactly the situation suitable for application of the approach suggested in the GUM Supplements. Here we focus on the problem of sampling from a fully specified (by an expert knowledge) joint multivariate copula-type distributions with dependence structure given by specific copula and with known continuous marginal distributions. In particular, we emphasize the advantage of using the characteristic function approach (CFA) for sampling from the distribution of the output quantity Y , based on the joint copula-type distribution of the inputs X, with convolution-type marginal distributions assigned to the input quantities Xi specified by using and combining the Type A and/or Type B evaluation methods 4 .

411

2. The copula distributions Copulas are being increasingly used to model multivariate distributions with specified continuous margins 5 . Their modeling flexibility and conceptual simplicity makes it a natural choice also for modeling multivariate distributions in metrology and measurement science 6 . The copula approach to dependence modeling is rooted in a representation theorem due to Sklar 7 . Let X = (X1 , . . . , XN ) be a random vector with joint distribution function GX and with continuous marginal CDFs GXi , for i = 1, . . . , N . If the marginal distributions are continuous, then the CDF GX of X can be uniquely represented as GX (ξ) = C (GX1 (ξ1 ), . . . , GXN (ξN )) ,

(2)

where C(ω) = C(ω1 , . . . , ωN ) is the N -dimensional joint distribution function with uniform marginals on the interval (0, 1), called a copula. Alternatively,  −1 (3) C(ω1 , . . . , ωN ) = GX G−1 X1 (ω1 ), . . . , GXN (ωN ) ,

where G−1 Xi denote the inverse distribution function (the quantile function) of the marginal distribution GXi . Equation (3) can be used to specify particular copula distributions. In particular, the Gaussian copula specified by the correlation matrix R is given by  G CR (ω1 , . . . , ωN ) = ΦR Φ−1 (ω1 ), . . . , Φ−1 (ωN ) , (4)

where Φ−1 is the inverse cumulative distribution function of a standard normal and ΦR is the joint CDF of a multivariate normal distribution with mean vector zero and covariance matrix equal to the correlation matrix R. Similarly, the Student’s t copula specified by the correlation matrix R is given by  t −1 Cν,R (ω1 , . . . , ωN ) = tν,R t−1 (5) ν (ω1 ), . . . , tν (ωN ) ,

where tν be a univariate t distribution with ν degrees of freedom and tν,R is the multivariate Student’s t distribution with a correlation matrix R and ν degrees of freedom. Further, the Archimedean copulas are a prominent class of copulas, able to capture complicated tail dependencies, specified by its generator ψ,  Cψ (ω1 , . . . , ωN ) = ψ ψ −1 (ω1 ) + · · · + ψ −1 (ωN ) . (6)

Common families of Archimedean copulas defined by their specific generator functions are illustrated in Table 1.

412 Table 1.

Specific families of Archimedean copulas.

Archimedean copula

Parameter space

Generator function ψ(t)

Clayton

θ>0

ψ(t) = (1 +t)− θ 

1

1 θ

Gumbel

θ>1

ψ(t) = exp −t

Frank

θ ∈ (−∞, ∞)\{0} for N = 2 and θ > 0 for N ≥3

ψ(t) = − θ1 log (1 − exp(−t)[1 − exp(−θ)])

2.1. Sampling from the specific copulas For the Gaussian copula the input parameter is the correlation matrix R. Let U = (U1 , . . . , UN ) denotes one random draw from the copula: (1) Generate a multivariate normal vector Z ∼ N (0, R) where R is an N -dimensional correlation matrix. This can be achieved by Cholesky decomposition of the correlation matrix R = LLT where L is a lower triangular matrix with positive ˜ ∼ N (0, I), then LZ ˜ ∼ N (0, R). elements on the diagonal. If Z (2) Transform the vector Z into U = (Φ(Z1 ), . . . , Φ(ZN )), where Φ is the CDF of univariate standard normal. For the Student’s t copula the input parameters are the degrees of freedom ν and the correlation matrix R. To generate one random draw from the copula: (1) Generate a multivariate vector T ∼ tν,R following the centered t distribution with ν degrees of freedom and p correlation matrix R. This can be achieved by using T = ν/QZ, where Z ∼ N (0, R) is a multivariate normal vector and Q ∼ χ2ν is independent univariate chi-square distributed random variable with ν degrees of freedom. (2) Transform the vector T into U = (tν (T1 ), . . . , tν (Tm ))T , where tν is the CDF of univariate t distribution with ν degrees of freedom. Archimedean copulas are usually sampled with the algorithm of Marshall and Olkin 8 . To generate one random draw from the copula: (1) Generate a random variable V with the specific distribution function F , defined by the inverse Laplace transform of the copula generator function ψ(t):  • For Clayton copula generate V ∼ Gamma θ1 , 1 .

413

 • For Gumbel copula generate V ∼ Stable θ1 , 1, γ, 0 with γ =  π , where by Stable (α, β, γ, µ) we denote the α-stable discosθ 2θ tribution function with the parameters α (stability), β (skewness), γ (scale), and µ (location). • For Frank copula generate V ∼ Logarithmic (1 − exp(−θ)). (2) Generate independent uniform random variables (R1 , . . . , RN ), with Ri ∼ R(0, 1) . T (3) Return U = (ψ(− log(R1 )/V ), . . . ψ(− log(RN )/V )) . 3. Sampling from the multivariate copula-type distributions For metrological applications it is important to sample realizations of X = (X1 , . . . , XN ) from the joint multivariate distribution GX specified by the given copula C(ω) and the marginal distributions GX1 (ξ1 ), . . . , GXN (ξN ). Based on that, we can further evaluate realization of the output quantity defined by the measurement equation Y = f (X) and use the methods of GUM Suplements for proper determination of the measurement uncertainty. If U = (U1 , . . . , UN ) is a random vector with given copula distribution and uniformly distributed marginals Ui ∈ (0, 1), i.e. U ∼ C, then  −1 X = (X1 , . . . , XN ) = G−1 (7) X1 (U1 ), . . . , GXN (UN ) ,

has the required joint multivariate distribution, X ∼ GX , as defined in (2). Hence, sampling many realizations of X = (X1 , . . . , XN ) requires many evaluations of the quantiles G−1 Xi (Ui ) of the marginal distributions GXi , i = 1, . . . , N , for specific probabilities (U1 , . . . , UN ) generated from the specified copula distribution. Here we shall consider situation motivated by the hierarchical modeling approach, that each input variable Xi is modeled as a linear combination P ni of simple and independent inputs Xij , i.e. Xi = j=1 cij Xij . Such distribution functions and their quantile functions can be evaluated efficiently by using the characteristic function approach 9 . CFA was suggested to form the state-of-knowledge probability distribution of the output quantity in linear measurement model, based on the numerical inversion of its CF, which is defined as a Fourier transform of its PDF. Table 2 presents CFs of selected univariate distributions frequently used in metrological applications for modeling marginal distributions of the input quantities. CF of a weighted sum of independent random variable is simple to derive if the measurement model is linear and the input quantities are independent.

414 Table 2. Characteristic functions of selected distributions used in metrological applications. Here, Γ(a) denotes the gamma function, Jν (z) is the Bessel function of the first kind and Kν (z) is the modified Bessel function of the second kind. Probability distribution

Student’s tν

Characteristic function ϕ(t)  2 ϕ(t) = exp − t2  1 ν  1  2 ν 2 |t| K ν ν 2 |t| , ν > 0 ϕ(t) = ν −11 ν

Rectangular R(−1, 1) Triangular T (−1, 1) Arcsine U (−1, 1)

ϕ(t) = t 2−2 cos(t) ϕ(t) = t2 ϕ(t) = J0 (t)

Gaussian N (0, 1)

Γ( 2 ) 22 sin(t)

2

P ni In particular, let Xi = j=1 cij Xij with coefficients cij and independent Xij . In such situation, CF of the input quantity Xi is given by ϕXi (t) = ϕXi1 (ci1 t) × · · · × ϕXini (cini t),

(8)

where by ϕXij (t) we denote the (known) CFs of the input quantities Xij . Then, the distribution function GXi of the quantity Xi can be evaluated by numerical inversion of its CF ϕXi (t) by a simple trapezoidal quadrature: GXi (ξi ) ≈

  −itk ξi K ϕXi (tk ) δX e 1 − , wk ℑ 2 π tk

(9)

k=0

where K is large integer, wk are the trapezoidal quadrature weights (w0 = wK = 21 , and wk = 1 for k = 1, . . . , K − 1), tk are equidistant nodes from T the interval (0, T ) for sufficiently large T , and δ = K . Here, by ℑ(z) we denote the imaginary part of the complex value z. The quantile function G−1 Xi is evaluated at Ui by using the barycentric interpolation from the distribution function GXi evaluated at small number of Chebyshev points from the support of the distribution, ξiCheb ∈ Supp(Xi ). This is a highly efficient method which can be used for generating multiple realizations of Xi . Figure 1 illustrates the random sample from the joint bivariate distribution specified by Student’s t copula and convolution-type marginal distributions of the input quantities X1 and X2 , where X1 ∼ XN + 4XR and X2 ∼ 2XT + 3XU , with independent input random variables: XN standard normal, XR rectangular on (−1, 1), XT triangular on (−1, 1), and XU arcsine on (−1, 1).

415

Bivariate Students t copula with given marginals 6

4

X2

2

0

-2

-4

-6 -8

-6

-4

-2

0

2

4

6

8

X1

Fig. 1. Random sample of size n = 10000 from the joint bivariate copula-type distribution, specified by Student’s t copula with ν = 3 degrees of freedom and given correlation matrix R (with ̺ = 0.8), and the convolution-type marginal distributions of the input quantities X1 and X2 , where X1 ∼ XN + 4XR and X2 ∼ 2XT + 3XU , with independent input random variables: XN standard normal, XR rectangular on (−1, 1), XT triangular on (−1, 1), and XU arcsine on (−1, 1).

4. Conclusions Sampling from the multivariate copula-type distributions is important for many applications in measurement science and metrology. Here we have presented some basic properties of the copula distributions together with the algorithms for sampling from specific copulas. We have emphasized the advantage of using the characteristic function approach, i.e. the method for computing the required marginal distribution functions and the quantile functions by numerical inversion of the associated characteristic functions.

416

Acknowledgements The work was supported by the Slovak Research and Development Agency, project APVV-15-0295, and by the Scientific Grant Agency of the Ministry of Education of the Slovak Republic and the Slovak Academy of Sciences, projects VEGA 1/0748/15, VEGA 2/0011/16, and VEGA 2/0054/18. References 1. JCGM100:2008, Evaluation of measurement data – Guide to the expression of uncertainty in measurement (GUM 1995 with minor corrections), in JCGM - Joint Committee for Guides in Metrology, (ISO, BIPM, IEC, IFCC, ILAC, IUPAC, IUPAP and OIML, 2008). 2. JCGM101:2008, Evaluation of measurement data – Supplement 1 to the Guide to the expression of uncertainty in measurement – Propagation of distributions using a Monte Carlo method, in JCGM - Joint Committee for Guides in Metrology, (ISO, BIPM, IEC, IFCC, ILAC, IUPAC, IUPAP and OIML, 2008). 3. JCGM102:2011, Evaluation of measurement data – Supplement 2 to the Guide to the expression of uncertainty in measurement – Extension to any number of output quantities, in JCGM - Joint Committee for Guides in Metrology, (ISO, BIPM, IEC, IFCC, ILAC, IUPAC, IUPAP and OIML, 2011). 4. V. Witkovsk´ y, Numerical inversion of a characteristic function: An alternative tool to form the probability distribution of output quantity in linear measurement models, Acta IMEKO 5, 32 (2016). 5. C. Genest and A.-C. Favre, Everything you always wanted to know about copula modeling but were afraid to ask, Journal of Hydrologic Engineering 12, 347 (2007). 6. A. Possolo, Copulas for uncertainty analysis, Metrologia 47, p. 262 (2010). 7. A. Sklar, Fonctions de r´epartition `a n dimensions et leurs marges, Publications de l’Institut de Statistique de l’Universit´e de Paris 8, 229 (1959). 8. A. W. Marshall and I. Olkin, Families of multivariate distributions, Journal of the American Statistical Association 83, 834 (1988). 9. V. Witkovsk´ y, CharFunTool: The characteristic functions toolbox for MATLAB (2017), https://github.com/witkovsky/CharFunTool.

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 417–426)

Bayesian estimation of a polynomial calibration function associated to a flow meter C. Yardin*, S. Amar and N. Fischer Mathematics and Statistics Department, Laboratoire national de métrologie et d'essais, Paris, France *E-mail: [email protected] www.lne.fr M. Sancandi Commissariat à l'énergie atomique et aux énergies alternatives (CEA), CEA/CESTA, Bordeaux, France M. Keller Electricité de France, recherche et développement (EDF), EDF R&D, Chatou, France The calibration function of a measurement device, denoted by Y = f(X, β), enables us to obtain a measurand value X corresponding to an indication Y delivered by the device. The function f is estimated with some pairs (Xiobs, Yiobs) of values resulting from measurements realized with standards and the device. So far, f is evaluated with frequentist methods and in particular, GLS (Generalized Least Squares) which take into account data uncertainty. However, GLS does not exploit previous knowledge on the device, except during a later step of results validation. On the contrary, Bayesian inference utilizes directly this information and recently a Bayesian calibration model has been developed. We apply this model to a second degree polynomial with observed data related to both standard values and device’s indications. The posterior density of the unknowns is approximated with a “tailored” Metropolis-Hastings algorithm whose proposal distribution parameters are estimated with Laplace’s approximation. Keywords: Polynomial calibration function, Bayesian inference, Errors-in-variables model, Uncertainty, Laplace’s method, Metropolis-Hastings algorithm.

417

418

1. Calibration function of the flow meter 1.1. Calibration process The device is a Molbloc laminar flow meter which is calibrated in the range 2 mg/s to 20 mg/s with nitrogen gas. The calibration process uses the gravimetric dynamic method in which the flow is continously measured as a loss of mas over a period of time. The principle is generated by the gravimetric bench composed of a gas bottle under pressure which loses gas, a balance to measure the depleted mass and a computer for counting the passing time. The run is stopped when the depleted mass corresponding to a target flow is reached. The gravimetric bench and the calibration method are described in [1]. The flow meter is calibrated in five points of the range and the data are converted into volume flow (ml/min). An example of a calibration data is presented in the Table 1. Table 1. Example of Calibration data. Response Yobs Uncertainty

Standard Value

1000.02

0.00289

999.93

1.4434

i=2

2501.23

0.00289

2504.4

2.4925

i=3

5001.93

0.00289

5008.87

4.3345

i=4

7502.86

0.00289

7508.8

6.4995

i=5

10 010.895

0.00289

10 010.7

8.6646

Points

Flow meter Indication

i= 1

Xobs (ml/min) Uncertainty

As the data are observations registered during the calibration process, we use the superscript “obs” to denote Yiobs an indication of the flow meter and Xiobs a value of the standard flow. These associated uncertainties are denoted respectively u(Yiobs) and u(Xiobs). When it exists some covariances between two Xobs values, they are denoted u(Xiobs, Xjobs). The overall data with their associated uncertainties are used to estimate the calibration function of the flow meter. 1.2. GLS estimation of the calibration function Historical calibration and knowkledge on the device lead us to postulate a second degree polynomial function. So far, this function has been estimated in a frequentist framework in using the errors-in-variables model (EIV) [2]. In the flow meter case, the model is expressed as follows:

419

Yi =  0 + 1 X i +  2 X i2 = f ( X ,  )

(1)

Yi obs = Yi + Ei X iobs = X i + Li ,

where the unknowns of the system are X, Y and the β parameters. As Y is defined by the f function, only X and β are to be evaluated. As uncertainties (u(Xiobs), u(Yiobs)) varie in the data set and Xiobs values are correlated, the GLS method is generally used [3]. This method takes into account data uncertainty structure, and then presents two benefits: assign to data some weights which are inversely proportional to the size of the associated uncertainty, propagate this uncertainty to the covariance matrix of the estimates themselves. The GLS estimator in this case is given by this system of equations:



S( BGLS ) = DT VD-1 D

 obs D = L =  Xobs - X 2 E  Y - (  0 1n +  1 X +  2 X

 Σ obs VD =  X  0

0  , ΣY obs 

 ) 

(2)

and the symetric covariance matrix of the estimates is expressed in Eq. (3):  ΣX VBGLS = (JP -1 P -T J)-1 =   Σ X ,

Σ X , Σ

 , 

(3)

where J is the matrix of partial derivatives of D evaluated at the solution BGLS, and P is the Choleski decomposition of VD. GLS estimates are presented with Bayesian estimates in Table 3. 2. Bayesian calibration model 2.1. Equations of the model The bayesian calibration model is expressed as follows in the case of the flow meter:

420

Y   0 1n  1 X   2 X 2  f ( X ,  ) Y obs / Y   ( f ( X ,  ),  Y obs ) X obs / X  N ( X ,  X obs )

 ( )  N ( ,   )

(4)

 ( X )  N (  X ,  X ). A DAG representation of the model is included in Appendix. Compared to EIV (Eq. (2)), we affect the same errors on the observations data and we added a probability distribution on these errors. A notable difference concerns the probability distributions associated with the unknown parameters and true value X to take into account prior knowledge about them. This point is discussed on the next subsection. 2.2. Choice of the priors In bayesian inference, the priors for the unknowns form an important part of the model. While the method allows for the choice of non-informative priors (e.g. Jeffrey’s priors), there is usually available knowledge in the field of metrology. This knowledge could come from a variety of sources and enables us to assign an informative prior distribution to all the parameters of the model. So it seems reasonable to affect standard distributions to all parameters of the model. 2.2.1. X values When the standard X values are not observed during the calibration process, the choice of the prior is simple. In general, X values are obtained from the calibration certificate. This document refers to a Gaussian distribution as N(µX, ∑X) where the mean value µX is given with an associated standard deviation u(X), with or without covariances u(Xi, Xj) in an included table. In the flow meter example, this process is not possible because the standard value is evaluated during the calibration process and the result of this evaluation is registered in the observed standard value. So the question of choosing the priors should be examined. The prior distribution for the X values is assigned a Gausian distribution with mean and standard uncertainty taken to be the nominal value obtained from the calibration process and the specified tolerance respectively. The prior means and standard deviations for X are presented in Table 2.

421 Table 2. Parameters of the Gaussian Prior distribution associated to X “true” values. X value

mean

Standard deviation

X1

999.99

1.44

X2

2505

2.89

X3

5005

4.33

X4

7505

6.50

X5

10005

8.66

2.2.2. Parameters β In general, the knowledge about calibration function may come from three sources: measurement device manufacturer, calibration laboratory and end-user. In this case of the flow meter, the values retained are given by the laboratory himself, the one that both calibrates and uses the device. To simplify the task, we use values evaluated during the previous calibration. These values are expressed by a Gaussian distribution with the parameters indicated in Table 3. To study the robustness of the model, a uniform distribution was also tested. Gaussian central values are conserved. Lower and upper bounds of the uniform distribution are chosen to be sufficiently large in order to cover computational difficulties. The bounds of these distributions are presented in Table 3. Table 3. Prior distribution parameters of coefficients β. Gaussian distribution

Uniform distribution

Parameters

mean

standard deviation

lower bound

upper bound

β0

3.00

1.73E-01

2

4

β1

0.99

5.70E-02

0

2

β2

3.0E-07

1.73E-08

-1

1

3. Bayesian inference The joint posterior density deduced from Bayes’ theorem is generally expressed proportionally to a constant:

 ( X ,  / X obs ,Y obs )  l( X obs ,Y obs / X ,  )   ( X )   (  ).

(5)

This posterior pdf of parameters is not easy to estimate due to the complexity of the model: high number of unknown parameters, non-linear calibration function and non-conjugate priors-likelihood.

422

So, we use Metropolis-Hastings algorithm with a tailored proposal distribution based on Laplace’s approximation. The estimation process runs in two steps: build up proposal distribution, sampling in this distribution. 3.1. Laplace’s approximation to build the proposal distribution Laplace’s method is used in Bayesian inference to get approximatively and analytically the pdf of the unknowns [4,5]. The approximate pdf is Gaussian whose parameters are evaluated as follows:  Step 1: form the logarithm of the pdf denoted log  ( X ,  / X obs , Y obs ) log  ( X ,  / X obs , Y obs )  log l ( X obs / X ,  )  log l (Y obs / X ,  )  log  ( X )  log  (  ).



(6)

Step 2: compute the MAP (Maximum A posteriori) estimates of the parameters which are the values that maximise the posterior or its logarithm

( X M AP , 

  

M AP

)  M ax log  ( X ,  / X obs , Y obs ). X ,

(7)

Step 3: evaluate the Hessian matrix on the MAP denoted by

H (X

MAP

,

M AP

).

(8)

Step 4: calculate the Hessian inverse. Step 5: express the Gaussian pdf with the MAP corresponding to the mean and the minus inverse of the Hessian corresponding to the variance-covariance matrix

 ( X ,  / X obs ,Y obs )  N (( X MAP ,  MAP ),  H 1 ( X MAP ,  MAP )).

(9)

3.2. Tailored Metropolis-Hastings algorithm The principle of Metropolis-Hastings algorithm is to approximate the target distribution by a proposal distribution that is easy to draw samples from and that satisfy some conditions [4]. Draws are generated from the proposal distribution and are accepted as part of the sample from the target distribution with an acceptance probability. This algorithm produces a series of values that converge to the target distribution. The Tailored Metropolis-Hastings algorithm is characterized by:

423

-

use a proposal distribution near to the target distribution. For example, distribution given by Eq. (9) enlarge this drawing region so the algorithm can get out of lowprobability regions to reach the solution. The covariance matrix is then multiplied by a factor τ equal 1.1 to 1.3 and the correlation structure remains the same. These numerical values for τ result from empirical considerations.

4. X and β estimates Tailored M-H is applied to data with prior means as initial values and N = 105 simulations. Gaussian and uniform β priors are considered. They are compared to GLS and Laplace methods. The first derivatives of the logarithm of the posterior in Eq. (6) with respect to the parameters are not linear so MAP and Hessian matrix are evaluated using numerical methods. These calculations and Tailored M-H method are implemented in a R software developed in our laboratory. 4.1. Gaussian β prior case In this case, Tailored M-H is applied with factor τ equal to 1.1 and we obtain an acceptance rate of 0.25. Means and standard deviations of the marginal posterior distributions of the parameters are presented in Table 4–Table 7. GLS estimates are also given. Table 4. Mean of β given by three methods (Gaussian β prior). Parameter

GLS

Laplace

Tailored M-H

β0

3.470

3.063

3.068

β1

0.996

0.997

0.997

β2

3.32E-07

2.99E-07

2.98E-07

Table 5. Standard deviation of β given by three methods (Gaussian β prior). Parameter

GLS

Laplace

Tailored M-H

β0

5.4E-01

1.5E-01

1.1E-01

β1

1.0E-03

4.1E-04

2.7E-04

β2

6.1E-08

1.5E-08

1.1E-08

424 Table 6. Mean of X given by three methods (Gaussian β prior). Variable

GLS

Laplace

Tailored M-H

X1

999.92

999.68

999.67

X2

2504.94

2503.86

2503.85

X3

5008.66

5006.51

5006.51

X4

7508.48

7505.64

7505.67

X5

10011.20

10008.14

10008.21

Table 7. Standard deviation of X values given by three methods (Gaussian β prior). Variable

GLS

Laplace

Tailored M-H

X1

0.92

0.41

0.27

X2

2.30

1.00

0.65

X3

4.57

1.98

1.27

X4

6.82

2.96

1.89

X5

9.54

3.98

2.54

Estimates are equivalent within associated uncertainties. Bayesian estimates are closer with associated uncertainties smaller than GLS results. Tailored M-H allows both to confirm adequacy of Laplace approximation and to reduce uncertainty. 4.2. Uniform β prior case Means and standard deviations of the marginal posterior distributions of the parameters obtained using a uniform prior for β are presented in Table 8 and Table 9. Table 8. Statistics for β estimates given by Bayesian methods (Uniform β prior). Parameter β0

Laplace

Tailored M-H

Mean

Standard deviation

Mean

Standard deviation

2.999

5.29E-01

3.432

3.19E-01

β1

0.997

6.29E-04

0.997

4.23E-04

β2

2.84E-07

5.94E4-08

3.25E-07

3.66E-08

As priors have larger standard deviation than Gaussian case, Tailored M-H estimates of β are increased towards GLS estimates. Laplace’s approximation which is suitable in the case of Gaussian β prior performs poorly with uniform β

425

prior: means of the β posterior distribution depart from those obtained with tailored M-H and standard deviations are larger. Compared to the Gaussian β prior case, uncertainties associated with estimates are larger when using the uniform β prior above. X estimates are less modified. Table 9. Statistics for X values given by Bayesian methods (Uniform β prior). Variable

Laplace

Tailored M-H

mean

standard deviation

mean

standard deviation

X1

999.651

0.44

999.542

0.29

X2

2503.75

1.06

2503.973

0.73

X3

5006.413

2.03

5006.77

1.35

X4

7505.756

2.97

7505.728

1.88

X5

10008.65

4.26

10007.72

2.54

5. Conclusion The Bayesian calibration model allows one to both incorporate prior information and estimate a function more complex than the straight line analysed so far. The second degree polynomial studied here implies more coefficients to be determined and perhaps other difficulties like correlation between the powers of X. The tailored Metropolis-Hastings performed well and demonstrated robustness to different types of prior. As the standard values are observed during the calibration process, knowledge on them is not originate from a calibration certificate. In this study, the set values displayed in the calibration process are chosen. Appendix

Fig. 1. DAG of the Bayesian calibration model.

426

The unknown of this model are the β parameters of the function and the “true” values of both the standards (X) and the response (Y). As Y is given by the function it is not estimated. Under this heart of the model, are presented the observed variables Xobs and Yobs which are generated by the true variables X and Y. References 1.

2. 3. 4. 5.

J. Barbe, C. Yardin and T. Macé, New national standard for the calibration of Molbloc laminar flowmeters from 0,2 mg.s-1 to 200 mg.s-1, Revue française de métrologie, Volume n°42 (2016). C.L. Cheng and J. W. Ness, Kendall’s library of statistics 6, Statistical regression with measurement error, (London, 1999). Determination and use of straight-line calibration functions, ISO/TS 28037:2010. Gelman A., Carlin J.B., Stern H.S. and Rubin D.B., Bayesian data analysis, second edition, Chapman & Hall/CRC Texts in Statistical Science, (2003). L. Thierney and J. B. Kadane, Accurate Approximations for Posterior Moments and Marginal Densities, Journal of the American Statistical Association, Volume 81, Issue 393 (Mar 1986).

A B Forbes, N F Zhang, A Chunovkina, S Eichstädt, F Pavese (eds.): Advanced Mathematical and Computational Tools in Metrology and Testing XI Series on Advances in Mathematics for Applied Sciences, Vol. 89 © 2018 World Scientific Publishing Company (pp. 427–438)

Dynamic measurement errors correction adaptive to noises of a sensor* E. V. Yurasova† and A. S. Volosnikov Information and Measurement Technology Department, School of Electrical Engineering and Computer Science, South Ural State University (National Research University), Chelyabinsk, Russian Federation †E-mail: [email protected] www.susu.ru/en The paper considers the data measuring system structure containing a model of a primary measuring transducer and implementing the modal control of dynamic characteristic. On the basis of this structure, the method for optimized adjusting of measuring channel dynamic parameters to the noise condition on the measuring transducer output has been developed. This method provides an internal validation of the results obtained. The computational simulation confirming the efficiency of the dynamic measurement errors correction method has been performed. Keywords: Dynamic Measurement Error, Dynamically Distorted Signal Recovery, Model of a Primary Measuring Transducer, Modal Control, Adaptive Measuring Systems.

1. Introduction Measurements performed in the dynamic mode are characterized by dynamic errors caused mostly by the inertia of the primary measuring transducer and its output random noises. These measurement error components exceed significantly all the other components of overall error and, consequently, require correction. To date, the most developed methods of a dynamically distorted signal recovery have been based either on the A. N. Tikhonov regularization method [1], [2], with the need of using the inverse Fourier transformation, or on the numerical solution of the convolution integral equation with the introduction of the regularization parameter [3], [4]. However, in practical measuring problems a noise signal at the sensor output is not Gaussian, and the spectral density of this noise is often unknown. The work was supported by Act 211 Government of the Russian Federation, contract № 02.A03.21.0011.

*

427

428

Practically, there have been no results of assessing the DME according to the measuring device output signal and its dynamic characteristic data. The questions of effective correction of the DME with a low sensitivity to the noises of the primary measuring transducer have not been raised specially. This limits the accuracy of the measuring systems by the characteristics of the equipment and does not allows for using the computing capabilities of these systems to improve their metrological characteristics significantly. The idea of dynamic measuring systems with modal control of dynamic characteristics is developed [5], [6]. The paper presents the DMEs correction method which is optimal in accuracy for real noises of measuring systems in the presence of a priori information about both the maximum frequency of the input signal and the minimum frequency of the noise signal. The a priori information about the properties of the measured signal and the additive noise can be their spectral density. The approach has internal control of the reliability of the results obtained. The method is based on using a basic MS with a model of a primary measuring transducer (sensor) [5] and additional possibility of the dynamic error evaluation [6]. 2. Measuring System Adaptive to Characteristics of Sensor Output Noises The principal difference between control systems and measuring systems is that the latter cannot be covered by feedback from the input, because of the fundamental inaccessibility of the input information signal. It is advisable to use special structures of measuring systems that allows it to implement the principle of modal control and obtain temporal estimates of the DME. Before the emergence of a basic dynamic model of a measuring system with the additional DME estimator [5], such a formulation of the problem and methods for its solution have not been described in the literature. The measuring system (MS) adaptive to the characteristics of noises is proposed to be synthesized by minimizing the total DME caused by slow sensor response as well as by the noises and interferences existing at its output. For this purpose, the basic structure of the MS with a modal control of the dynamic characteristics based on the sensor model is used [5]. The block diagram of the MS shown in Figure 1 includes the complete dynamic model of the sensor, the output of which is connected with the similar complete dynamic model covered by the feedback loop with the measured coefficients k 0 , k1 , ..., k n1 .

429

The presence of the complete dynamic sensor model in the structure of the MS sets the identity of the differential equations describing the dynamic characteristics of the sensor and its model. Therefore, if their output signals are close to each other, then the input signals of the sensor and its model will not differ much from one another in the presence of measuring channel regularization. Consequently, the model input signal, which is available for observation, allows for evaluating the sensor input signal, which is not available for observation. The criterion of the MS coefficients k 0 , k1 , ..., k n1 adjustment is the proximity of the output signals of both sensor and its model.

Fig. 1. Block diagram of the dynamic parameters MS.

Suppose the transfer function of the sensor and the sensor model is as follows: WS  p  

y  p

U  p



bm p m  bm 1 p m 1  ...  b1 p +b0 p n  an 1 p n 1  ...  a1 p  a0

WM  p   WS  p  

yM  p 

UM  p

,

(1)

430

where y ( p ) is the Laplace representation of the sensor output signal; U ( p) is the Laplace representation of the sensor input signal; a n 1 , ..., a1 , a0 and bm , ..., b1 , b0 are constant coefficients ( m  n ); p is a complex frequency variable; yM ( p ) is the Laplace representation of the sensor model output signal; U M ( p ) is the Laplace representation of the sensor model input signal. The sensor output signal representation with considering the real noise of the MS is as follows: yS  p   y  p   V  p  ,

(2)

where V ( p ) is the Laplace representation of the high-frequency noise at the sensor output. Consider the case when control is performed only by the poles of the transfer function of the MS. According to the block diagram shown in Figure 1, the transfer function of the MS in case of the noise absence at the sensor output is as follows: WMS  p  

U *  p U  p V

 p  0

 a0  k0  bm p m  ...  b1 p  b0   p n   an 1  kn 1  p n 1  ...   a1  k1  p   a0  k0  b0

,

(3)

where U * ( p ) is the MS output signal reduced to a single gain coefficient. The transfer function of the MS in the reduced noise component, determined in the absence of the useful input signal, is as follows: WNS  p  



U *  p V  p

U  p 0

WMS  p   an 1  ...  a1 p  a0  p n   an 1  kn 1  p n 1  ...   a1  k1  p   a0  k0  WS  p  pn

p n 1

.

(4)

The analysis of (3) shows that changing the adjustable parameters k0 , k1 , ..., k n 1 makes it possible to obtain any desired transfer function of the MS, wherein each adjustable parameter influences one coefficient of the transfer function. Changing these parameters leads to variation of the transfer function in the reduced noise component, which results in the noise increase in the output signal of the MS. The total DME evaluation of the measuring transducer is based on the structure of the additional channel of the DME evaluation shown in Figure 2 [6].

431

A signal sent to the channel input has a structure similar to the signal at the input of the MS correcting unit: a0  k 0 b0 , *  U ( p )  U ( p )  WS  p   V ( p )  eMS ( p )WS  p   V ( p ) e0 ( p )  yS ( p )  yM ( p )

(5)

where eMS ( p )  U ( p)  U * ( p) is the Laplace representation of the MS error. This makes it possible to correct the DME evaluation in the same way as the sensor signal. The transfer function of the DME evaluation channel under no-noise condition at the sensor output is as follows:

We  p  

e1*  p  eMS  p 

V ( p)  0 , bm a0  k0  ...  b1 p  b0   p n   an 1  kn1  p n 1  ...   a1  k1  p   a0  k0  b0

(6)

pm

where e1*  p  is the DME channel output signal.

Fig. 2. Block diagram of the DME evaluation channel.

The following transfer function of the DME evaluation channel in its reduced noise component is determined when no useful input signal exists:

432

WeNC  p   

V ( p) 

e1  p  eMS  p  U ( p )0

WMS ( p ) .  1  WMS ( p )  WS ( p )  1  WMS ( p ) WMS ( p ) V ( p)  WS ( p )

(7)

The transfer functions (6) and (7) are similar to the transfer functions of the MS and its coefficients are likewise influenced by the adjustable parameters of the DME evaluation channel k 0 , k1 , ..., k n 1 . Therefore, the criterion of the coefficients k0 , k1, ..., kn 1 adjustment to the value optimal for this DME evaluation is considered to be a minimum of the DME evaluation obtained from the additional evaluation channel. The presented above block diagram shows all relations essential for a measuring transducer implementation in the analogue form. Furthermore, it can be regarded as a structural representation of the differential equations that should be numerically integrated when implementing the MS as a program for digital processing of a sensor signal. 3. Method of the Measuring System Adaptation to Sensor Output Noises

Obtaining the optimal transfer functions WMS  p  и WNS  p  requires control

of all ki , i = 0, 1, ..., n  1 adjustable parameters. This is attended with serious

computational problems and is difficult-to-implement in practice for the largeorder systems. Therefore, the procedure of the MS parameters adjustment for the DME evaluation, derived from the additional channel, is implemented with respect to a generalized parameter  , which is associated with all zeros and poles in the transfer function of the MS. To create an adaptive algorithm of the DME correction for the arbitraryorder MS, the following method can be used. The transfer function of the MS is represented as follows:

WMS  p  

where

1 , 2 , ..., n

 p  1  p  2   ...  p  m    p  1  p  2   ...  p  n 

 ...  2 ... 

1 2

n

1

m

are poles of the MS transfer function;

  1nm ,

1 , 2 , ..., m

(8) are zeros

of the MS transfer function. Meanwhile, some poles and zeros can be complex. Consequently, the transfer function of the sensor is

433

WS  p   where

 p  11  p  12   ...   p  1m    p  11  p  12   ...   p  1n 

 ...  12  ... 

11 12

1n

11

1m

  1n  m ,

are poles of the sensor transfer function;

11 , 12 , ..., 1n

(9)

11 , 12 , ..., 1m

are zeros of the sensor transfer function. Suppose the MS is described by the transfer function with dominant poles lying in the complex plane substantially closer to the imaginary axis than the others. There is no loss of generality in supposing that corresponding pairs of dominant poles are complex conjugate as follows: 1

 1  jβ1,

2

 1  jβ1 .

(10)

Suppose respectively that corresponding pairs of dominant zeros are also complex conjugate: 1



1

j 1,

2



1

 j 1.

(11)

Limiting the search area of parameters, let us relate them with a variable  by linear dependencies:  1  a1 , β1  b1 , 1  m1 , 1  n1 , where a1 , b1 ,

m1 , n1 are constants. Taking into account (10) and (11), the equations for the transfer functions of the MS and the sensor from (8) and (9) are presented as follows: WS  p  

 12 1  a12i    p 2  21 pm1i  12  m12i  n12i   n 2

i 1 m 2

 12  m12i  n12i  i 1

WMS  p  

m 2



i 1

  p 2  21 p  12 1  a12i   n 2

n 2

  2  mi2  ni2  i 1

(12)

.

(13)

i 1

  2 1  ai2    p 2  2 pmi   2  mi2  ni2  

i 1 m 2

,

m 2



i 1 n 2

  p 2  2 p   2 1  ai2   i 1

Having substituted the transfer functions from (12) and (13) to the equation for the MS DME evaluation, we find out that the process of the MS parameters adjustment to the DME evaluation, obtained from the additional channel, can be performed by varying only one adjustable parameter  of the MS. This greatly simplifies the adaptation algorithm of the dynamic parameters for the MS with the system order of n  2 . The adjustable parameter  variation produces changing the time constants of the whole MS. On the one hand, this results in the reduction of the inherent

434

DME of the MS due to the bandwidth extension and on the other hand, there is a noticeable noise increase in the output signal of the MS. Thus, determining of the adjustable parameter value that is optimal for the MS DME evaluation, which, in its turn, depends on the noise parameters reduced to the sensor output, should be built as a search of extremum (minimum) of the unimodal function e1 t ,   that is the output signal of the DME evaluation channel. The value of the adjustable parameter, at which the minimal standard deviation of the DME evaluation signal is achieved, is taken as optimal. In order to find the minimum of the function, the optimal sequential algorithm of the function extremum search is applied. The best solution to this class of problems is the Fibonacci search technique [7], but it is not acceptable in this case since it requires the exact knowledge of the number N of the supposed observations on the minimized function values. The modified goldensection search method [7] does not depend on the intended number N and, at sufficiently large N , is as effective as the Fibonacci plan. Its use in the algorithm of searching an optimal value of the adjustable parameter provides the required accuracy and the rate of the algorithm convergence to the best value of  , at which the MS DME evaluation is minimal. An essential point in the search method is the problem of the dynamic error e1 t ,   evaluation reliability. This is provided by the total DME minimization during the MS adjustable parameters optimization. Based on the DME evaluation, a decision about the optimal accuracy of the MS is made. In order to evaluate the reliably in relation to the true error of the MS, the equation for the DME evaluation derived from the block diagram shown in Figure 2 should be written as the respective signals Fourier transformations: e1  j   U  js   WMS  js   1  WMS  js   V  jn 

WMS  jn   1  WMS  jn   WS  jn 

.

(14)

where s is the angular frequency of the measured input signal; n is the angular frequency of the noise signal reduced to the sensor output. The first term in (14) is the inherent dynamic component of the error, the second term is the noise component of the DME. The Fourier transformation of the true DME signal decomposed into its inherent component and the noise component is defined as eMS  j   U  js   1  WMS  js    V  jn 

WMS  jn  WS  jn 

.

(15)

435

In order to determine the limits of the method applicability, the difference between the Fourier transformation of the true DME value and its evaluation obtained from the additional DME evaluation channel is restricted as follows: eMS  j   e1  j    ,

(16)

where  is the maximum allowed difference between the true MS DME value and the DME evaluation derived from the DME evaluation channel. The last equation is a condition for an adequate assessment of the dynamic measurement error. The left part of (16) is expanded as follows: eMS  j   e1  j   U  js   1  WMS  js    V  jn   2

WMS  jn  WS  jn 

2

. (17)

Based on (17), the condition of an adequate DME evaluation at the harmonic input signal is defined as follows:

1  1  WMS  js   1  1 ,

(18)

where ε1 is small quantity. The value of the second term in (17) limited by  2 defines the condition of an adequate evaluation of the DME noise component at the harmonic noise signal as follows: WMS  j n    2 .

(19)

The fulfilment of the conditions (18) and (19) guarantees the reliability of the DME evaluation obtained from the DME evaluation channel. However, these conditions impose restrictions on the limit frequencies of the input measured signal and the noise signal, thus defining the scope of the optimal parameter search algorithm and requiring a priori information about the measured and noise signals. Nevertheless, the simulation results have shown that even if the noise frequency at the output of the measuring transducer differs by an order of magnitude from the calculated value (19), but, at the same time, the condition (18) is strictly satisfied, the absolute value of the mismatch between the true error and its evaluation of the DME evaluation channel does not exceed 30% of the measuring transducer true DME value. This is due to the fact that, when the total DME is intentionally decreased, the great increase in the noise component of the error is impossible in both the MS as a whole and the DME evaluation channel. This results in the inequality (18) satisfaction.

436

4. Simulation Study In order to illustrate the potential of the considered method for the DMEs correction, a computational simulation of the second-order MS was performed. The harmonic signal U t   1.00 sin 8.5t  , which provides strict satisfying the condition (18), was sent to the sensor input with the time constant T  0.01 s and the damping ratio ξ  0.3 . The noise signal V t   0.05 sin 100t  was added to the sensor output. A point without additional correction of the DME was chosen as a starting point for the algorithm. As a result of applying the algorithm, the bandwidth of the measuring transducer was expanded. The total time constant became T  0.0045 s , the resulting damping ratio became   0.7 . The DME decreased by about 50% compared with the measurement without additional correction. The values of the DME in the search starting point and the final point (at the values of the adjustable parameters optimal for the given noise level) are shown in Figure 3. At the given noise level, both upward and downward deviations of the MS adjustable parameter  from the optimum by 10%, resulted in the DME increase by 4% when simulating. This confirms the optimality of the parameters found.

Fig. 3. Simulation results of the second-order MS, where 1 is the signal of the measuring transducer dynamic error without correction and 2 is the same after correction.

437

5. Conclusions The examined MS with the measuring transducer model and the additional DME evaluation channel has the feedback factors of the model as adjustable parameters. These coefficients remained constant during the experimental data processing, and their selection is carried out in accordance with the MS error obtained from the additional DME evaluation channel. However, in the real data MSs, the characteristics of the noises in the output signal of the measuring transducer are known very approximately and may vary during the measurement. Moreover, the parameters of the sensor transfer function itself can be unstable or known with inaccuracy. This significantly reduces the accuracy of the input signal recovery of the MS with the constant adjustable parameters. On the other hand, one of the most promising trends in the development of the modern information-measuring technology is its intellectualization [8]. The characteristic features of this process are the use of special hardware for performing complex measuring procedures and the development of MSs capable of the processing method individualization including that by means of the adaptive changes in their own structure based on the measurement data accumulated a priori or obtained [9]. The intelligent MSs adaptive to the DME evaluation are of practical interest. It is possible in principle to adjust the feedback coefficients of a measuring transducer model directly during the measurement process or the measurement data processing. The parameters adjusting criterion in this case may be not only a convex functional of the DME, but also a specially formed signal of the DME evaluation, that is a continuous time function dependent on the adjustable parameters. It was this approach which produced the effective results in theory of self-adaptive control systems [10] and can be used in future research for MSs operating in the dynamic mode. References 1. 2.

A. N. Tikhonov and V. J. Arsenin, Solution of Ill-posed Problems (V.H. Winston & Sons, Washington, 1977). V. A. Granovskii, Models and methods of dynamic measurements: results presented by St. Petersburg metrologists, in Advanced Mathematical and Computational Tools in Metrology and Testing X, eds. F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A.B. Forbes, (World Scientific Publishing Company, Singapore, 2015), pp. 29–37.

438

3.

D. Dichev, H. Koev, T. Bakalova and P. Louda, A Measuring System with an Additional Channel for Eliminating the Dynamic Error, Journal of Theoretical and Applied Mechanics 44 (1), 3 (2014). 4. T. J. Esward, Investigating dynamic measurement applications through modelling and simulation, Technisches Messen 83 (10), 557 (2016). 5. A. L. Shestakov, Dynamic error correction method, IEEE Transactions on Instrumentation and Measurement 45 (1), 250 (1996). 6. A. L. Shestakov, Dynamic measurements based on automatic control theory approach, in Advanced Mathematical and Computational Tools in Metrology and Testing X, eds. F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A.B. Forbes, (World Scientific Publishing Company, Singapore, 2015), pp. 66–77. 7. J. Gregory, Numerical-Method for Extremal Problems in the Calculus of Variations and Optimal-control Theory, Bulletin of the American Mathematical Society 18 (1), 31 (1988). 8. Y. Fu, W. Yang, O. Xu, L. Zhou and J. Wang, Soft Sensor Modelling by Time Difference, Recursive Partial Least Squares and Adaptive Model Updating, Measurement Science and Technology 28 (4), (2017). 9. V. N. Ivanov and G. I. Kavalerov, Theoretical Aspects of the Design of Smart Measurement Systems. Measurement Techniques 34 (10), 978 (1991). 10. E. V. Yurasova, M. N. Bizyaev and A. S. Volosnikov, General Approaches to Dynamic Measurements Error Correction Based on the Sensor Model, Bulletin of the South Ural State University. Ser. Computer Technologies, Automatic Control, Radio Electronics 16 (1), 64 (2015).

Author index Amar S 417

Heißelmann D 192 Honda K 357 Hornby R 128 Hutzschenreuter D 203

Batista E 186 Bect J 349 Beskachko V P 294 Bizyaev M N 153 Bodnar O 1 Bremser W 229

Infantino I 241 Jagan K 211 Jiang X 315

Chetvertakova E S 162 Chimitova E V 162 Chrétien S 128 Chunovkina A G 9 Cooper P 128 Cox M G 324

Kano H 357 Keller F 220 Keller M V 417 Kern S 229 Kistner T 192 Kniel K 220

Demeyer S 349 Dovica M 170 Ďuriš S 170, 401, 409 Ďurišová Z 170, 409

Lira I 38 Lucas P 186 Maisano D 235 Maiwald M 229 Maniscalco U 241 Mastrogiacomo L 235 Mazurek P 249 Meyer K 229 Morawski R Z 249, 375 Morimoto Y 357

Elster C 1 Ferreira M C 186 Fischer N 349, 417 Forbes A B 178, 211 Franke M 192 Frazer R 128 Fujita K 357 Fukushima S 357 Furtado A 186

Neumann C 307 Numano T 357

Gamo M 357 Godinho I 186 Granovskii V A 20 Grientschnig D 38 Guhl S 229

Ono J 357 Oshima Y 357 Pagani L 48 Palenčár R 401, 409 Palenčár J 409 439

440

Paul A 229 Pavese F 58 Pearce J V 257 Pellegrino M S 365 Philonenko P 265 Possolo A 70, 273 Postovalov S 265

Stroh R 349 Sun W 128 Suzuki T 357

Qi Q 315

Vanacore A 365 Vazquez E 349 Volosnikov A S 153, 427

Rieutord T 286 Romanov V A 294 Rost K 192 Rottner L 286 Ruhm K H 85 Rusby R L 257 Sancanti M 417 Sayanca I L 307 Schwehn C 192 Scott P J 48, 119, 315 Shinohara N 357 Shirono K 324 Siraya T N 332 Smith T O M 341 Stein M 220 Stepanov A 9

Takehara H 357 Takeshita J 357 Trampert K 307

Wagner J 375 Wander L 229 Warsza Z L 383 Wendt K 192 Wimmer G 170, 401, 409 Witkovský V 401, 409 Yamamoto K 357 Yardin C 417 Yurasova E S 427 Zabolotnii S V 383 Zhang J 128 Zhang N F 138

Keyword index accuracy 20 adjoint functions 119 air quality 178 Allan variance 332 analysis 1 analysis, data 178 ANOVA, one-way 357 Bayesian, inference 38, 211, 417, 249 Bayesian, model 349 best linear unbiased estimator 138 Bezier, triangular 48 Calibration 178, 186, 401 calibration, linear 170 calibration, pitch 220 calibration, polynomial function 417 Category Ontology Language (COL) 315 category theory 315 CCM.M-K7 1 Characteristic Function Approach 409 Chebyshev 203 classification 58 cognitive 241 comparison, interlaboratory 357 computer simulation 294 consistency check 324 convection 38 coordinate metrology 203 copula 409 Coriolis flowmeter 294 critical ill-conditioning 20 data, right-censored 265 distribution

measurement 307 dynamic measurement error 153, 427 dynamically distorted signal recovery 153, 427 effect, random 58, 162 effect, systematic 58 emission monitoring 349 EN 14181:2014. 341 349 equation, approximate 20 ergodic theorem 138 error, dynamic 153, 427 error, random 58 error, separating 220 error, systematic 58 errors-in-variables 38, 401, 417 estimator 383 estimator, best linear unbiased 138 experimental 20 failure, probability 349 fallure, detection. 375 filtering technique 286 first principles 229 fitting software 203 fitting, hole pattern 203 flow 186 fluid–structure interaction 294 F-test 357 fuzzy variable 20 gas mixture 170 gears 220 Geometrical Product Specifications (GPS) 315 healthcare 249 ill-conditioning, critical 20 image merging 307 441

442

Indirect Hard Modellino 229 indirect light 307 integrating hyperparameter 349 intensity distribution 307 inter-laboratory comparison 357 interval variable 20 intratracheal administration testing 357 inverse problem 119 ISO 5725 357 ITS-90 257 Key Comparison 9, 324 knowledge modelling 315 k-type coefficient 365 kurtosis 383 Laplace method 417 large-volume metrology 235 least squares, generalized 235 least squares, partial 229 linking laboratory 324 liquidus slopes 257 localization, real-time 235 lock point 203 lung 357 mathematics 20, 85 maximum likelihood estimation 162 mean value 383 measurement 70, 85, 273 measurement error, dynamic 153, 427 measurement, equation 273 measurement, distribution 307 measurement, mass flow 294 measurement, plan 315 measurement, uncertainty 192, 273 measurement, wind 286 measuring machine, virtual coordinate 192 measuring system 153

measuring systems, adaptive 427 medical infusion instruments 186 metrology 20, 85, 119 metrology, coordinate 203 metrology, large-volume 235 Metropolis-Hastings algorithm 417 MIN3 test 265 mixed effects model 273 mixture model, Gaussian 128 modal control 427 model 20, 85, 273, 332, 427 model, extended random effects 1 model, mixed effects 273 model, mixture Gaussian 128 model, numerical 294 model, selection 273 model, statistical 273 Monte Carlo, Markov chain 211, 249 Monte Carlo, method 265, 409 Monte Carlo, simulation 192, 349, 365 multifidelity 178, 349 multi-target probe 235 near field 220 NMR Spectroscopy, online 229 non-Gaussian model 383 non-linear least-squares regression 138, 211 non-stationary process 138 numerical differentiation 375 observation 85, 273 outliers 9 parameter estimation 211 pitch calibration 228 polynomial calibration function 417 primary measuring transducer 427

443

probabilistic benchmarking procedures problem, inverse problem, two-sample process, control process, degradation process, stationary process, uncertainty assessment property property, dissipative qualitative quantitative quantity quantum mechanics radar impulse random effects model, extended random, effect random, error random, functions rater agreement real-time localization reference reference value regional comparison regression regularisation relation reliability repeatability reproducibility right-censored data robotics scattering transform Seebeck coefficient sensitivity analysis sensor, depth signal Signal and System Theory signal recovery, dynamically distorted

365 119 265 229 162 138 349 70 294 70 70 20 229 249 1 58, 162 58 332 365 235 1 324 324 38 375 85 162 357 357 265 241 128 257 286 375 85 85 153, 427

skewness 383 sliding mode 153 soft sensors 241 software testing, metrological 119 software, fitting 203 software, test 203 somatosensory system 241 stationary increment 332 statistical model 273 statistics 85 stochastic 85 stochastic, polynomial 383 structure 85 surface, metrology 128 surface, reconstruction 48 surface, texture 48 survival analysis 265 system 85 system, somatosensory 241 systematic effect 58 systematic error 58 temperature scale, continuous 401 test, data generator 203 test, goodness-of-fit 162 test, power 265 theorem, ergodic 138 thermocouples 257 thermometers, resistance 257 thermometry. contact 257 traceability 257 transfer standards 1 uncertainty 58, 70, 186, 417 uncertainty, confirmation 9 uncertainty, evaluation 170, 401 variance 383 virtual coordinate measuring machine 192 Wiener degradation model 162 wind measurement 286

This page intentionally left blank

Series on Advances in Mathematics for Applied Sciences Editorial Board N. Bellomo Editor-in-Charge Department of Mathematics Politecnico di Torino Corso Duca degli Abruzzi 24 10129 Torino Italy E-mail: [email protected]

M. A. J. Chaplain Department of Mathematics University of Dundee Dundee DD1 4HN Scotland C. M. Dafermos Lefschetz Center for Dynamical Systems Brown University Providence, RI 02912 USA J. Felcman Department of Numerical Mathematics Faculty of Mathematics and Physics Charles University in Prague Sokolovska 83 18675 Praha 8 The Czech Republic M. A. Herrero Departamento de Matematica Aplicada Facultad de Matemáticas Universidad Complutense Ciudad Universitaria s/n 28040 Madrid Spain S. Kawashima Department of Applied Sciences Engineering Faculty Kyushu University 36 Fukuoka 812 Japan

F. Brezzi Editor-in-Charge IMATI - CNR Via Ferrata 5 27100 Pavia Italy E-mail: [email protected]

M. Lachowicz Department of Mathematics University of Warsaw Ul. Banacha 2 PL-02097 Warsaw Poland S. Lenhart Mathematics Department University of Tennessee Knoxville, TN 37996–1300 USA P. L. Lions University Paris XI-Dauphine Place du Marechal de Lattre de Tassigny Paris Cedex 16 France B. Perthame Laboratoire J.-L. Lions Université P. et M. Curie (Paris 6) BC 187 4, Place Jussieu F-75252 Paris cedex 05, France K. R. Rajagopal Department of Mechanical Engrg. Texas A&M University College Station, TX 77843-3123 USA R. Russo Dipartimento di Matematica II University Napoli Via Vivaldi 43 81100 Caserta Italy

Series on Advances in Mathematics for Applied Sciences Aims and Scope This Series reports on new developments in mathematical research relating to methods, qualitative and numerical analysis, mathematical modeling in the applied and the technological sciences. Contributions rlated to constitutive theories, luid dynamics, kinetic and transport theories, solid mechanics, system theory and mathematical methods for the applications are welcomed. This Series includes books, lecture notes, proceedings, collections of research papers. Monograph collections on specialized topics of current interest are particularly encouraged. Both the proceedings and monograph collections will generally be edited by a Guest editor. High quality, novelty of the content and potential for the applications to modern problems in applied science will be the guidelines for the selection of the content of this series.

Instructions for Authors Submission of proposals should be addressed to the editors-in-charge or to any member of the editorial board. In the latter, the authors should also notify the proposal to one of the editors-in-charge. Acceptance of books and lecture notes will generally be based on the description of the general content and scope of the book or lecture notes as well as on sample of the parts judged to be more signiicantly by the authors. Acceptance of proceedings will be based on relevance of the topics and of the lecturers contributing to the volume. Acceptance of monograph collections will be based on relevance of the subject and of the authors contributing to the volume. Authors are urged, in order to avoid re-typing, not to begin the inal preparation of the text until they received the publisher’s guidelines. They will receive from World Scientiic the instructions for preparing camera-ready manuscript.

Series on Advances in Mathematics for Applied Sciences Published*: Vol. 74

Wavelet and Wave Analysis as Applied to Materials with Micro or Nanostructure by C. Cattani and J. Rushchitsky

Vol. 75

Applied and Industrial Mathematics in Italy II eds. V. Cutello et al.

Vol. 76

Geometric Control and Nonsmooth Analysis eds. F. Ancona et al.

Vol. 77

Continuum Thermodynamics by K. Wilmanski

Vol. 78

Advanced Mathematical and Computational Tools in Metrology and Testing eds. F. Pavese et al.

Vol. 79

From Genetics to Mathematics eds. M. Lachowicz and J. Miękisz

Vol. 80

Inelasticity of Materials: An Engineering Approach and a Practical Guide by A. R. Srinivasa and S. M. Srinivasan

Vol. 81

Stability Criteria for Fluid Flows by A. Georgescu and L. Palese

Vol. 82

Applied and Industrial Mathematics in Italy III eds. E. De Bernardis, R. Spigler and V. Valente

Vol. 83

Linear Inverse Problems: The Maximum Entropy Connection by H. Gzyl and Y. Velásquez

Vol. 84

Advanced Mathematical and Computational Tools in Metrology and Texting IX eds. F. Pavese et al.

Vol. 85

Continuum Thermodynamics Part II: Applications and Examples by B. Albers and K. Wilmanski

Vol. 86

Advanced Mathematical and Computational Tools in Metrology and Testing X eds. F. Pavese et al.

Vol. 87

Mathematical Methods for the Natural and Engineering Sciences (2nd Edition) by R. E. Mickens

Vol. 88

Road Vehicle Dynamics: Fundamentals of Modeling and Simulation by G. Genta and A. Genta

Vol. 89

Advanced Mathematical and Computational Tools in Metrology and Testing XI eds. A. B. Forbes et al.

*To view the complete list of the published volumes in the series, please visit: https://www.worldscibooks.com/series/samas_series.shtml