World Scientific
Published by World Scientific Publishing Co. Pte. Ltd., 5 Toh Tuck Link, Singapore 596224. USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601. UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE.
Library of Congress Cataloging-in-Publication Data Advanced mathematical and computational tools in metrology and testing X / edited by Franco Pavese (Istituto Nazionale di Ricerca Metrologica, Italy) [and four others]. pages cm. -- (Series on advances in mathematics for applied sciences ; volume 86) Includes bibliographical references and index. ISBN 978-9814678612 (hardcover : alk. paper) 1. Metrology. 2. Statistics. I. Pavese, Franco, editor. QC88.A38 2015 389'.1015195--dc23 2015008632
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
Copyright © 2015 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
Printed in Singapore
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (p. v)
Foreword

This volume contains original refereed worldwide contributions. They were prompted by presentations made at the tenth Conference, held in St. Petersburg, Russia, in September 2014, on the theme of advanced mathematical and computational tools in metrology and also, as in the title of this book series, in testing. The aims of the IMEKO Committee TC21 "Mathematical Tools for Measurements" (http://www.imeko.org/index.php/tc21-homepage), which supports the activities in this field and this book series, were:

• To present and promote reliable and effective mathematical and computational tools in metrology and testing.
• To understand better the modelling, statistical and computational requirements in metrology and testing.
• To provide a forum for metrologists, mathematicians, software and IT engineers that will encourage a more effective synthesis of skills, capabilities and resources.
• To promote collaboration in the context of EU and International Programmes, Projects of EURAMET, EMRP, EA and of other world Regions, and MRA requirements.
• To support young researchers in metrology, testing and related fields.
• To address industrial and societal requirements.
The themes in this volume reflect the importance of mathematical, statistical and numerical tools and techniques in metrology and testing, keeping in view the challenge promoted by the Metre Convention of achieving mutual recognition of measurement standards.

Torino, February 2015
The Editors
Contents Foreword
v
Fostering Diversity of Thought in Measurement Science F. Pavese and P. De Bièvre
1
Polynomial Calibration Functions Revisited: Numerical and Statistical Issues M.G. Cox and P. Harris
9
Empirical Functions with Pre-Assigned Correlation Behaviour A.B. Forbes
17
Models and Methods of Dynamic Measurements: Results Presented by St. Petersburg Metrologists V.A. Granovskii
29
Interval Computations and Interval-Related Statistical Techniques: Estimating Uncertainty of the Results of Data Processing and Indirect Measurements V.Ya. Kreinovich
38
Classification, Modeling and Quantification of Human Errors in Chemical Analysis I. Kuselman
50
Application of Nonparametric Goodness-of-Fit Tests: Problems and Solution B.Yu. Lemeshko
54
Dynamic Measurements Based on Automatic Control Theory Approach A.L. Shestakov
66
Models for the Treatment of Apparently Inconsistent Data R. Willink
78
Model for Emotion Measurements in Acoustic Signals and Its Analysis Y. Baksheeva, K. Sapozhnikova and R. Taymanov
90
Uncertainty Calculation in Gravimetric Microflow Measurements E. Batista, N. Almeida, I. Godinho and E. Filipe
98
Uncertainties Propagation from Published Experimental Data to Uncertainties of Model Parameters Adjusted by the Least Squares V.I. Belousov, V.V. Ezhela, Y.V. Kuyanov, S.B. Lugovsky, K.S. Lugovsky and N.P. Tkachenko
105
A New Approach for the Mathematical Alignment Machine Tool-Paths on a Five-Axis Machine and Its Effect on Surface Roughness S. Boukebbab, J. Chaves-Jacob, J.-M. Linares and N. Azzam
116
Goodness-of-Fit Tests for One-Shot Device Testing Data E.V. Chimitova and N. Balakrishnan
124
Calculation of Coverage Intervals: Some Study Cases A. Stepanov, A. Chunovkina and N. Burmistrova
132
Application of Numerical Methods in Metrology of Electromagnetic Quantities M. Cundeva-Blajer
140
Calibration Method of Measuring Instruments in Operating Conditions A.A. Danilov, Yu.V. Kucherenko, M.V. Berzhinskaya and N.P. Ordinartseva
149
Statistical Methods for Conformity Assessment When Dealing with Computationally Expensive Systems: Application to a Fire Engineering Case Study S. Demeyer, N. Fischer, F. Didieux and M. Binacchi
156
Overview of EMRP Joint Research Project NEW06 “Traceability for Computationally-Intensive Metrology” A.B. Forbes, I.M. Smith, F. Härtig and K. Wendt
164
Stable Units of Account for Economic Value Correct Measuring N. Hovanov
171
A Novel Approach for Uncertainty Evaluation Using Characteristic Function Theory A.B. Ionov, N.S. Chernysheva and B.P. Ionov
179
Estimation of Test Uncertainty for TraCIM Reference Pairs F. Keller, K. Wendt and F. Härtig
187
Approaches for Assigning Numerical Uncertainty to Reference Data Pairs for Software Validation G.J.P. Kok and I.M. Smith
195
Uncertainty Evaluation for a Computationally Expensive Model of a Sonic Nozzle G.J.P. Kok and N. Pelevic
203
EllipseFit4HC: A MATLAB Algorithm for Demodulation and Uncertainty Evaluation of the Quadrature Interferometer Signals R. Köning, G. Wimmer and V. Witkovský
211
Considerations on the Influence of Test Equipment Instability and Calibration Methods on Measurement Uncertainty of the Test Laboratory A.S. Krivov, S.V. Marinko and I.G. Boyko
219
A Cartesian Method to Improve the Results and Save Computation Time in Bayesian Signal Analysis G.A. Kyriazis
229
The Definition of the Reliability of Identification of Complex Organic Compounds Using HPLC and Base Chromatographic and Spectral Data E.V. Kulyabina and Yu.A. Kudeyarov
241
Uncertainty Evaluation of Fluid Dynamic Simulation with One-Dimensional Riser Model by Means of Stochastic Differential Equations E.A.O. Lima, S.B. Melo, C.C. Dantas, F.A.S. Teles and S. Soares Bandiera
247
Simulation Method to Estimate the Uncertainties of ISO Specifications J.-M. Linares and J.M. Sprauel
252
Adding a Virtual Layer in a Sensor Network to Improve Measurement Reliability U. Maniscalco and R. Rizzo
260
Calibration Analysis of a Computational Optical System Applied in the Dimensional Monitoring of a Suspension Bridge L.L. Martins, J.M. Rebordão and A.S. Ribeiro
265
Determination of Numerical Uncertainty Associated with Numerical Artefacts for Validating Coordinate Metrology Software H.D. Minh, I.M. Smith and A.B. Forbes
273
Least-Squares Method and Type B Evaluation of Standard Uncertainty R. Palenčár, S. Ďuriš, P. Pavlásek, M. Dovica, S. Slosarčík and G. Wimmer
279
Optimising Measurement Processes Using Automated Planning S. Parkinson, A. Crampton and A.P. Longstaff
285
Software Tool for Conversion of Historical Temperature Scales P. Pavlásek, S. Ďuriš, R. Palenčár and A. Merlone
293
Few Measurements, Non-Normality: A Statement on the Expanded Uncertainty J. Petry, B. De Boeck, M. Dobre and A. Peruzzi
301
Quantifying Uncertainty in Accelerometer Sensitivity Studies A.L. Rukhin and D.J. Evans
310
Metrological Aspects of Stopping Iterative Procedures in Inverse Problems for Static-Mode Measurements K.K. Semenov
320
Inverse Problems in Theory and Practice of Measurements and Metrology K.K. Semenov, G.N. Solopchenko and V.Ya. Kreinovich
330
Fuzzy Intervals as Foundation of Metrological Support for Computations with Inaccurate Data K.K. Semenov, G.N. Solopchenko and V.Ya. Kreinovich
340
Testing Statistical Hypotheses for Generalized Semiparametric Proportional Hazards Models with Cross-Effect of Survival Functions M.A. Semenova and E.V. Chimitova
350
Novel Reference Value and DOE Determination by Model Selection and Posterior Predictive Checking K. Shirono, H. Tanaka, M. Shiro and K. Ehara
358
Certification of Algorithms for Constructing Calibration Curves of Measuring Instruments T. Siraya
368
Discrete and Fuzzy Encoding of the ECG-Signal for Multidisease Diagnostic System V. Uspenskiy, K. Vorontsov, V. Tselykh and V. Bunakov
377
Application of Two Robust Methods in Inter-Laboratory Comparisons with Small Samples E.T. Volodarsky and Z.L. Warsza
385
Validation of CMM Evaluation Software Using TraCIM K. Wendt, M. Franke and F. Härtig
392
Semi-Parametric Polynomial Method for Retrospective Estimation of the Change-Point of Parameters of Non-Gaussian Sequences S.V. Zabolotnii and Z.L. Warsza
400
Use of a Bayesian Approach to Improve Uncertainty of Model-Based Measurements by Hybrid Multi-Tool Metrology N.-F. Zhang, B.M. Barnes, R.M. Silver and H. Zhou
409
Application of Effective Number of Observations and Effective Degrees of Freedom for Analysis of Autocorrelated Observations A. Zieba
417
Author Index
425
Keywords Index
427
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. 1–8)
FOSTERING DIVERSITY OF THOUGHT IN MEASUREMENT SCIENCE

FRANCO PAVESE, Torino, Italy
PAUL DE BIÈVRE, Kasterlee, Belgium

The contrast between single thought and diversity has long been inherent in the search for ‘truth’ in science, and beyond. This paper aims at summarizing the reasons why scientists should be humble when contending about methods for expressing experimental knowledge. We suppose, however, that there must be reasons for the present trend toward selecting a single direction of thinking rather than using diversity as the approach to increase confidence that we are heading for correct answers: some examples are listed. Concern is expressed that this trend could lead to ‘political’ decisions, hindering rather than promoting scientific understanding.
1. Introduction

In many fields of science we think we see increasing symptoms of an attitude that seems to be fostered either by the anxiety to take a decision, or by the intention to ‘force’ a conclusion upon the reader. Limiting ourselves to a field where we have some competence, measurement science, a few sparse examples of exclusive choices have been selected, in no particular order, including two documents that are widely assumed to master this field:

− The Guide for the Expression of Uncertainty in Measurement (GUM) [1], which is now in favour of a single framework, the ‘uncertainty approach’, discontinuing the ‘error approach’ [2, 3], and which now seems to be heading for a total ‘Bayesian approach’ replacing all ‘frequentist’ approaches [4–6].
− The International System of Measurement Units (SI) [7], for which a fundamental change now seems to be proposed by the Consultative Committee for Units (CCU) to the CIPM and CGPM [8, 9], with the ‘fundamental’ or ‘reference constants’ replacing ‘physical states’ or ‘conditions’ in the definitions of the units.
− The VIM, with the basic change from “basic and general terms” to “basic and general concepts and associated terms”.
− The “recommended values” of the numerical values of fundamental constants, atomic masses, differences in scales, etc., e.g. specific data from CODATA [8, 10], being restricted to one single ‘official’ set.
− The stipulation of numerical values in specific frames, claimed to have universal and permanent validity.
− The traditional classification of errors/effects into random and systematic, with the concept of “correction” associated with the latter, claimed to be exclusive.
This paper does not intend to discuss any specific example, since its focus is not on the validity of any specific choice, but on the importance of creating the choice. The two issues should not be confused with each other. Parts of the paper may look ‘philosophical’, but they are only intended to concern the philosophy of science, i.e. not to be extraneous to scientists’ interests: any concerned scientist should be aware of the difference between ‘truth’ and ‘belief’. Accepting diversity in thinking is a mental attitude, which should never be ignored or considered a mere option for scientists. It is a discriminating issue. The paper is devoted only to this science divide: either one goes for single thought, or one picks up from diversity a wider view of solutions to scientific problems. We think that disappointment with this position can be expected from single-thought advocates only.

2. Truth in philosophy and science

The gnoseological (i.e. concerning the philosophy of knowledge and the human faculties for learning) issue of truth is itself a dilemma, since different fundamental aspects can be attributed to this concept: one can have truth by correspondence, by revelation (disclosure), by conformity to a rule, by consistency (coherence), or by benefit. They are not reciprocally alternative; they are diverse and not reducible to each other [11]. Several of them are appropriate in science. With particular respect to consistency, it is relevant to recall a sentence of Steven G. Vick: “Consistency is indifferent to truth. One can be entirely consistent and still be entirely wrong” [12]. In the search for truth, the history of thinking shows that general principles are typically subject to irresolvable criticism, leading to contrasting positions (typically two): it is the epistemological dilemma, long since recognized
(e.g., David Hume (1711-1776): “Reason alone is incapable of resolving the various philosophical problems”) and has generated several ‘schools of thinking’: pragmatism, realism, relativism, empirism, …. Modern science, as basically founded on one of the two extreme viewpoints—empiric, as opposed to metaphysical—is usually considered exempt from the above weakness. Considering doubt as a shortcoming, scientific reasoning aims at reaching, if not truth, at least certainties, and many scientists tend to believe that this goal can be fulfilled in their field of competence. Instead, they should remember the Francis Bacon (1605) memento: “If we begin with certainties, we shall end in doubts; but if we begin with doubts, and are patient with them, we shall end with certainties” … still an optimistic one. 3. Does certainty imply objectivity? The rise of the concept of uncertainty as a remedy in science As alerted by philosophers, the belief in certainty simply arises from the illusion of science being able to attain objectivity as a consequence of being based on information drawn from the observation of natural phenomena, and considered as ‘facts’. A fact, as defined in English dictionaries, means: “A thing that is known or proven to be true” [Oxford Dictionary] “A piece of information presented as having objective reality” [MerriamWebster Dictionary]. Objectivity and cause-effect-cause chain are the pillars of single-ended scientific reasoning. Should this be the case, the theories developed for systematically interlocking the empirical experience would similarly consist of a single building block, with the occasional addition of ancillary building blocks accommodating specific new knowledge. This is a ‘static’ vision of science (and of knowledge in general). In that case “Verification” [13]” would become unnecessary, ‘Falsification’ [14] a paradox, and the road toward any “Paradigm change” or “Scientific revolution” [15] prevented. On the contrary, confronted with the evidence available since long, and reconfirmed everyday that the objectivity scenario does not normally apply, the concept of uncertainty came in. To be noted that, strictly speaking, it applies only if the object of the uncertain observations is the same (the ‘measurand’ in measurement science), hence the issue is not resolved, the problem is simply shifted to another concept, the uniqueness of the measurand, a concept of nonrandom nature, leading to “imprecision”. This term is used here in the sense indicated in [16]: “Concerning non-precise data, uncertainty is called
imprecision … is not of stochastic nature … can be modelled by the so-called non-precise numbers”. 4. From uncertainty to chance: the focus on decision in science Confronted with the evidence of diverse results of observations, modern science way-out was to introduce the concept of ‘chance’—replacing ‘certainty’. This was done with the illusion of reaching firmer conclusions by establishing a hierarchy in measurement results (e.g. based on the frequency of occurrence), in order to take a ‘decision’ (i.e. for choosing from various measurement results). The chance concept initiated the framework of ‘probability’, but expanded later into several other streams of thinking, e.g., possibility, fuzzy, cause-effect, interval, non-parametric, … reasoning frames depending on the type of information available or on the approach to it. Notice that philosophers of science warned us about the logical weakness of the probability approach: “With the idol of certainty (including that of degrees of imperfect certainty or probability) there falls one of the defences of obscurantism which bars the way of scientific advance” [14] (emphasis added). Limiting ourselves to the probability frame, any decision strategy requires the choice of an expected value as well of the limits of the dispersion interval of the observations. The choice of the expected value (‘expectation’: “a strong belief that something will happen or be the case” [Oxford Dictionary]) is not unequivocal, since several location parameters are offered by probability theory—with a ‘true value’ still standing in the shade, deviations from which are called ‘errors’. As to data dispersion, most theoretical frameworks tend to lack general reasons for bounding a probability distribution, whose tails thus extend without limits to infinitum. However, without a limit, no decision is possible; and, the wider the limit, the less meaningful a decision is. Stating a limit becomes itself a decision, assumed on the fitness of the intended use of the data. In fact, the terms used in this frame clearly indicate the difficulty and the meaning that is applicable in this context: ‘confidence level’ (confidence: “the feeling or belief that one can have faith in or rely on someone or something” [from Oxford Dictionary]), or ‘degree of belief’ (belief: “trust, faith, or confidence in (someone or something)” or “an acceptance that something exists or is true, especially one without proof” [ibidem])
Still about data dispersion: one can believe in using truncated (finite tailwidth) distributions. However, reasons for truncation are generally supported by uncertain information. In rare cases it may be justified by theory, e.g. a bound to zero –itself not normally reachable exactly (experimental limit of detection). Again, stating limits becomes itself a decision, also in this case, on the fitness for the intended use of the data. 5. The fuzziness of decision and the concept of risk in science The ultimate common goal of any branch of science is to communicate measurement results and to perform robust prediction. Prediction is necessary to forecast, i.e. to decision. However, what about the key term ‘decision’? When (objective) reasoning is replaced by choice, a decision can only be based on (i) a priori assumptions (for hypotheses), or (ii) inter-subjectively accepted conventions (predictive for subsequent action). However, hypotheses cannot be proved, and inter-subjective agreements are strictly relative to a community and for a given period of time. The loss of certainty resulted in the loss of uniqueness of decisions, and the concept of ‘risk’ emerged as a remedy. Actually, any parameter chosen to represent a set of observations becomes ‘uncertain’ not because it must be expressed with a dispersion attribute associated to an expected value, but because the choice of both parameters is the result of decisions. Therefore, when expressing an uncertain value the components are not two (best value and its uncertainty), but three, the third being the decision taken for defining the values of the first two components, e.g., the chosen width of the uncertainty interval, the chosen ‘level’ of confidence, …. A decision cannot be ‘exact’ (unequivocal). Any decision is fuzzy. The use of risk does not alleviate the issue: if a decision cannot be exact, the risk cannot be zero. In other words: the association of a risk to a decision, a recent popular issue, does not add any real benefit with respect to the fundamental issue. Risk is only zero for certainty, so zero risk is unreachable. This fact has deep consequence, as already expressed by Karl Popper in 1936: “The relations between probability and experience are also still in need of clarification. In investigating this problem we shall discover what will at first seem an almost insuperable objection to my methodological views. For although probability statements play such a vitally important role in empirical science, they turn out to be in principle impervious to strict falsification.” [14]
6. The influence of the observer in science In conclusion, chance is a bright prescription for working on symptoms of the disease, but is not a therapy for its deep origin, subjectivity. In fact, the very origin of the problem is related to our knowledge interface—human being. It is customary to make a distinction between the ‘outside’ and the ‘inside’ of the observer, ‘the ‘real world’ and the ‘mind’. We are not fostering here a vision of the world as a ‘dream’: there are solid arguments for conceiving a structured and reasonably stable reality outside us (objectivity of the “true value”). However, this distinction is one of the reasons having generated a dichotomy since at least since a couple of centuries, between ‘exact sciences’ and other branches, often called ‘soft’, like psychology, medicine, economy. For ‘soft’ science we are ready to admit that the objects of the observations tend to be dissimilar, because every human individual is dissimilar from any other. In ‘exact science’ we are usually not ready to admit that the human interface between our ‘mind’ and the ‘real world’ is a factor of influence affecting very much our knowledge. Mathematics stay in between, not being based on the ‘real world’ but on an ‘exact’ construction of concepts based in our mind. 7. Towards an expected appropriate behaviour of the scientist All the above should suggest scientists to be humble about contending on methods for expressing experimental knowledge—apart from obvious mistakes (“blunders”). Different from the theoretical context, experience can be shared to a certain degree, but leads, at best, to a shared decision. The association of a ‘risk’ to a decision, a relatively recent very popular issue, does not add any real benefit with respect to the fundamental issue, and this new concept basically is merely the complement to one of the concept of chance: it is zero for certainty, zero risk being an unreachable asymptote. For the same reason, one cannot expect that a single decision be valid in all cases, i.e. without exceptions. In consequence, no single frame of reasoning leading to a specific type of decision can be expected to be valid in all cases. The logical consequence of the above should be, in most cases, that not all decisions (hence all frames of reasoning) are necessarily mutually exclusive. Should this be the case, diversity rather becomes richness by deserving a higher degree of confidence in that we are pointing to the correct answers. Also in science, ‘diversity’ is not always a synonym of ‘confusion’, a popular way to
contrast it; rather, it is an invaluable additional resource leading to better understanding. This fact is already well understood in experimental science, where the main way to detect systematic effects is to diversify the experimental methods. Why should the diversifying methodology not extend also to principles?

It might be argued that the metrological traceability requirement, a fundamental one not only in metrology but in measurement in general, may conflict with diversity, since metrological traceability requires metrological criteria as given in [3], potentially creating a conflict between diversity and uniformity by invoking the principle of (decision) efficiency in measurement science. Based on the meaning of “metrological traceability” and of “measurement result” involved in it, as defined in [3], we do not see a possible conflict in allowing for diversity. Take, for example, the well-known issue of the frequentist versus Bayesian treatments: both are used depending on the decision of the single scientist, without ever having been considered, to our knowledge, to have affected the validity of any metrological traceability assessment.

The origin of the trend indicated may be due to an incorrect assignment to a scientific Commission asked to reach a single ‘consensus’ outcome instead of a rationally compounded digest of the best information and knowledge available. However, the consequence would be politics (needing decisions) leaking into science (seeking understanding); a potential trend also threatening scientific honesty.

References
1. Guide for the expression of uncertainty in measurement, JCGM 100:2008, ISO, Geneva, at http://www.bipm.org/en/publications/guides/gum.html
2. GUM Anniversary Issue, Metrologia Special Issue, 51 (2014) S141–S244.
3. International vocabulary of metrology – Basic and general concepts and associated terms – VIM, 3rd edition, 2012 (2008 with minor corrections), ISO, Geneva, at http://jcgm.bipm.org/vim
4. B. Efron, Bayesians, Frequentists, and Scientists, Technical Report No. 2005-1B/230, January 2005, Division of Biostatistics, Stanford, California 94305-4065.
5. R. Willink and R. White, Disentangling Classical and Bayesian Approaches to Uncertainty Analysis, Doc. CCT/12-07, Comité Consultatif de Thermométrie, BIPM, Sèvres (2012).
6. Stanford Encyclopedia of Philosophy, Interpretations of Probability, http://plato.stanford.edu/entries/probability-interpret/, pp 40.
7. http://www.bipm.org/en/measurement-units/
8. http://www.bipm.org/en/measurement-units/new-si/
9. F. Pavese, How much does the SI, namely the proposed ‘new SI’, conform to the spirit of the Metre Treaty?, ACQUAL, 19 (2014) 307–314.
10. F. Pavese, Some problems concerning the use of the CODATA fundamental constants in the definition of measurement units, Letter to the Editor, Metrologia 51 (2014) L1–L4.
11. N. Abbagnano, Dictionary of Philosophy (in Italian), UTET, Torino (1971).
12. S.G. Vick, Degrees of Belief: Subjective Probability and Engineering Judgment, ASCE Publications (2002).
13. L. Wittgenstein, Philosophical Investigations (translated by G. E. M. Anscombe), Basil Blackwell, Oxford, 1st edition (1953).
14. K. Popper, The Logic of Scientific Discovery, Routledge / Taylor & Francis e-Library, London and New York (2005).
15. T.S. Kuhn, The Structure of Scientific Revolutions, 3rd ed., University of Chicago Press, Chicago, IL (1996).
16. R. Viertl, Statistical Inference with Imprecise Data, in Probability and Statistics, Encyclopedia of Life Support Systems (EOLSS), developed under the auspices of UNESCO, Eolss Publishers, Oxford, UK, http://www.eolss.net (2003).
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. –)
POLYNOMIAL CALIBRATION FUNCTIONS REVISITED: NUMERICAL AND STATISTICAL ISSUES

MAURICE COX AND PETER HARRIS
National Physical Laboratory, Teddington, Middlesex TW11 0LW, UK
E-mail: [email protected]

The problem of constructing a polynomial calibration function is revisited, paying attention to the representation of polynomials and the selection of an appropriate degree. It is noted that the monomial representation (powers of the ‘natural’ variable) is inferior to the use of monomials in a normalized variable, which in turn is bettered by a Chebyshev representation, use of which also gives stability and insight. Traditional methods of selecting a degree do not take fully into account the mutual dependence of the statistical tests involved. We discuss degree-selection principles that are more appropriate.

Keywords: calibration, polynomial representation, uncertainty, degree selection
1. Introduction

Calibration consists of two stages.1 In stage 1 a relation is established between values provided by measurement standards and corresponding instrument response values. In stage 2 this relation is used to obtain measurement results from further response values. We consider polynomial calibration functions that describe the response variable y in terms of the stimulus variable x. Polynomials of various degrees, determined by least squares, are extensively used as empirical calibration functions in metrology. A polynomial of degree n has n + 1 coefficients or parameters b. An estimate b̂ of b is to be determined given calibration data (x_i, y_i), i = 1, . . . , m, provided by a measuring system. For a further response value y_0, the polynomial is then used inversely to predict the corresponding stimulus value x_0. The x_i and y_i are assumed to be realizations of random variables having Gaussian distributions (not necessarily independent). Section 2 considers uncertainty and model validity, Sect. 3 the representation of polynomials, Sect. 4 measures of consistency, Sect. 5 an example of thermocouple calibration and Sect. 6 our conclusions.
2. Uncertainty and model validity

Calibration data invariably have associated measurement uncertainty (uncertainties associated with the x_i or the y_i or both), which means that in the first stage there will be uncertainty associated with b̂ in the form of a covariance matrix U_b̂. In turn, in the second stage, U_b̂ and the standard uncertainty associated with y_0 contribute to the standard uncertainty u(x_0) associated with x_0. Given the uncertainties associated with the calibration data (most generally in the form of a covariance matrix), an appropriate numerical algorithm2 is used to produce b̂ and U_b̂. Once a candidate polynomial model has been fitted to the data, it is necessary to determine the extent to which the model explains the data, ideally in a parsimonious way. Only when the model is acceptable in this regard should it be used to predict x_0 given y_0 and to evaluate u(x_0).

3. Polynomial representation

Whilst the traditional representation of a polynomial in x is the monomial form p_n(x) = c_0 + c_1 x + · · · + c_n x^n, its use can lead to numerical problems.3 Representing p_n(x) in Chebyshev form generally overcomes such difficulties, and has advantages mathematically and computationally.4 First, consider x varying within the finite interval [x_min, x_max] and transforming it to a normalized variable t ∈ I = [−1, 1]:

    t = (2x − x_min − x_max)/(x_max − x_min).    (1)
This normalization avoids working with numbers that are possibly very large or very small in magnitude for high or even modest polynomial degree. Second, the Chebyshev-polynomial representation

    p_n(x) ≡ (1/2) a_0 T_0(t) + a_1 T_1(t) + · · · + a_n T_n(t)    (2)

is beneficial since polynomial functions expressed in this manner facilitate working with them in a numerically stable way.5 The T_j(t), which lie between −1 and 1 for t ∈ I, are generated for any t ∈ I using

    T_0(t) = 1,    T_1(t) = t,    T_j(t) = 2t T_{j−1}(t) − T_{j−2}(t),    j ≥ 2.
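As an illustration of how the normalized variable (1) and the three-term recurrence above can be used, the following minimal Python sketch evaluates a polynomial given in Chebyshev form. It is only a sketch: the function name and coefficient values are ours and purely illustrative, and a library routine (for instance from the NAG Library mentioned below, or numpy.polynomial.chebyshev) would normally be preferred.

```python
import numpy as np

def cheb_eval(a, x, xmin, xmax):
    """Evaluate p(x) = a0/2*T0(t) + a1*T1(t) + ... + an*Tn(t),
    with t the variable x normalized to [-1, 1] as in Eq. (1)."""
    t = (2.0 * x - xmin - xmax) / (xmax - xmin)   # Eq. (1)
    T_prev, T_curr = np.ones_like(t), t           # T0(t), T1(t)
    p = 0.5 * a[0] * T_prev + (a[1] * T_curr if len(a) > 1 else 0.0)
    for aj in a[2:]:
        # three-term recurrence T_j = 2 t T_{j-1} - T_{j-2}
        T_prev, T_curr = T_curr, 2.0 * t * T_curr - T_prev
        p += aj * T_curr
    return p

# Illustrative coefficients only (not a reference function)
a = np.array([9.2782, 5.3711, 0.3706, -0.0729])
x = np.linspace(-50.0, 1064.18, 5)
print(cheb_eval(a, x, -50.0, 1064.18))
```

Clenshaw's recurrence gives an equivalent and slightly more economical evaluation; the direct recurrence is shown here because it matches the formulas above.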
Algorithms based on Chebyshev polynomials appear in the NAG Library6 and other software libraries and have been used successfully for decades. For many polynomial calibration functions the polynomial degree is modest, often no larger than three or four. For such cases, the use of monomials in a normalized (rather than the raw) variable generally presents few
numerical difficulties. There are cases, such as the International Temperature Scale ITS-90,7 where the reference functions involved take relatively high degrees such as 12 or 15. For such functions, working with a normalized variable offers numerical advantages and the Chebyshev form confers even more, not only numerically, but also in terms of giving a manageable and sometimes a more compact representation. For instance, the monomial representation of thermoelectric voltage

    E = \sum_{r=0}^{n} c_r x^r
in the reference function for Type S thermocouples, for Celsius temperatures x in the interval [−50, 1 064.18] °C, is given in a NIST database.8 There is a factor of some 10^21 between the non-zero coefficients of largest and smallest magnitude, which are held to 12 significant decimal digits (12S); presumably it was considered that care is needed in working with this representation. The c_r are given in Table 1 (column “Raw”).

Table 1. Polynomial coefficients for a Type S thermocouple

    Coeff   Raw                        Scaled      Normalized   Chebyshev
    0       0                          0           4.303 6      9.278 2
    1       5.403 133 086 31×10−3      5.749 9     5.527 8      5.371 1
    2       1.259 342 897 40×10−5      14.261 8    0.478 4      0.370 6
    3       −2.324 779 686 89×10−8     −28.017 4   −0.054 3     −0.072 9
    4       3.220 288 230 36×10−11     41.300 5    0.220 6      0.037 1
    5       −3.314 651 963 89×10−14    −45.239 0   −0.163 7     −0.013 0
    6       2.557 442 517 86×10−17     37.144 7    0.021 6      0.002 2
    7       −1.250 688 713 93×10−20    −19.331 0   −0.024 9     −0.000 4
    8       2.714 431 761 45×10−24     4.464 8     0.025 2      0.000 2
A scaled variable q = x/B has been used in work on ITS-90 in recent years, where B = 1 064.18 °C is the upper interval endpoint. Then E = \sum_{r=0}^{n} d_r q^r, with d_r = B^r c_r. The scaling implies that the contribution from the rth term in the sum is bounded in magnitude by |d_r|. Values of E in mV are typically required to 3D (three decimal places). Accordingly, the coefficients d_r are given in Table 1 (column “Scaled”) to 4D (including a guard digit) and are much more manageable. Alternatively, the variable can be normalized to the interval I (not done in ITS-90) using expression (1) with x_min = −50 °C and x_max = B. The corresponding coefficients are given in column “Normalized” and the Chebyshev coefficients in column “Chebyshev”, both to 4D, obtained using Refs. 4 and 9.
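The scaling step just described is simple to reproduce. The following hedged sketch forms d_r = B^r c_r from the raw coefficients transcribed from Table 1 and checks that the two representations agree at an arbitrary temperature; the check value x = 500 °C and the function-free style are ours.

```python
import numpy as np

# Raw monomial coefficients c_r (Table 1, column "Raw")
c = np.array([0.0, 5.40313308631e-3, 1.25934289740e-5, -2.32477968689e-8,
              3.22028823036e-11, -3.31465196389e-14, 2.55744251786e-17,
              -1.25068871393e-20, 2.71443176145e-24])
B = 1064.18                       # upper interval endpoint in deg C
r = np.arange(len(c))
d = (B ** r) * c                  # scaled coefficients d_r = B^r c_r

x = 500.0                         # illustrative check point
E_raw = np.polyval(c[::-1], x)          # sum of c_r x^r
E_scaled = np.polyval(d[::-1], x / B)   # sum of d_r q^r, q = x/B
print(d.round(4))                 # should reproduce the "Scaled" column to 4D
print(E_raw, E_scaled)            # the two evaluations agree
```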
Figure 1 depicts the reference function. It is basically linear, but the non-linearity present cannot be ignored. The coefficients in the monomial representation in terms of the raw or scaled variable in Table 1 give no indication of the fundamentally linear form. That the first two coefficients are dominant is strongly evident, though, in the normalized and Chebyshev forms. The Chebyshev coefficients for degree 8 and arguably for degree 7 could be dropped, since to 3D they make little or no contribution. Such reasoning could not be applied to the other polynomial representations. Further remarks on the benefits of the Chebyshev form appear in Sec. 5.
Fig. 1. Relationship between temperature and thermoelectric voltage
4. Measures of consistency

Usually the degree n is initially unknown. It is traditionally chosen by fitting polynomials of increasing degree, for each polynomial forming a goodness-of-fit measure, and, as soon as that measure demonstrates that the model explains the data, accepting that polynomial. A common measure is the chi-squared statistic χ²_obs, the sum of squares of the deviations of the polynomial from the y_i, weighted inversely by the squared standard uncertainties associated with the y_i-values (or a modified measure when x_i-uncertainties or covariances are present). The statistic is compared with a quantile of the chi-squared distribution, with m − n − 1 degrees of freedom, that corresponds to a stipulated probability 1 − α (α is often taken to be 0.05). A value of χ²_obs that exceeds the quantile is considered significant at the 100(1 − α) % level, and therefore that the polynomial does not explain the data.
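A minimal sketch of this traditional test is given below, assuming negligible x_i-uncertainty and uncorrelated standard uncertainties u(y_i); the more general covariance treatment referred to above is not reproduced, and the function names are ours.

```python
import numpy as np
from scipy.stats import chi2

def chi2_obs(x, y, u_y, degree):
    """Weighted least-squares polynomial fit and its chi-squared statistic."""
    coeffs = np.polyfit(x, y, degree, w=1.0 / u_y)     # weights 1/u(y_i)
    resid = (y - np.polyval(coeffs, x)) / u_y
    return np.sum(resid ** 2)

def explains_data(x, y, u_y, degree, alpha=0.05):
    """True if chi2_obs does not exceed the (1 - alpha) chi-squared quantile."""
    nu = len(x) - degree - 1                           # m - n - 1 degrees of freedom
    return chi2_obs(x, y, u_y, degree) <= chi2.ppf(1.0 - alpha, nu)
```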
This approach is not statistically rigorous because the sequence of tests does not form an independent set: the successive values of χ²_obs depend on the same data and hence are statistically interrelated. Bonferroni10 regarded such a process as a multiple hypothesis test: whilst a given α may be appropriate for each individual test, it is not for the set of tests. To control the number of hypotheses that are falsely rejected, α is reduced in proportion to the number of tests. (Less conservative tests of this type are also available.11) A problem is that α depends on the number of tests. If polynomials of all degrees lying between two given degrees are considered, that set of degrees has to be decided a priori, with the results of the tests depending on that decision. (In practice, the number of data is often very limited, reducing somewhat the impact of this difficulty.) We consider an approach that is independent of the number of tests to be performed. Such an approach makes use of generally accepted model-selection criteria, specifically Akaike’s Information Criterion (AIC) and the Bayesian Information Criterion (BIC).12 For a model with n + 1 parameters, these criteria are

    AIC = χ²_obs + 2(n + 1),    BIC = χ²_obs + (n + 1) ln m.
Both criteria are designed to balance goodness of fit and parsimony. AIC can possibly choose too high a degree regardless of the value of m. BIC tends to penalize high degrees more heavily. Recent considerations of AIC and BIC are available together with detailed simulations.13 Given a number of candidate models, the model with the lowest value of AIC (or BIC) would be selected. Some experiments14 with these and other criteria for polynomial modelling found that AIC and BIC performed similarly, although a corrected AIC was better for small data sets. Considerably more experience is needed before strong conclusions can be drawn. Related to the χ²_obs statistic is the root-mean-square residual (RMSR) for unit weight, defined as [χ²_obs/(m − n − 1)]^{1/2}, which we use subsequently.

5. Example: thermocouple calibration

A thermocouple is to be calibrated, with temperature in °C as the stimulus variable and voltage in mV as the response variable. The data consist of temperature values 10 °C apart in the interval [500, 1 100] °C and the corresponding measured voltage values, 61 data pairs in all. The temperature values are regarded as having negligible uncertainty, and standard uncertainties associated with the voltage values are based on knowledge of the display resolution (4S). The coefficients of weighted least-squares polynomial models of degrees from zero to ten were determined. Corresponding values of RMSR appear in Fig. 2.

Fig. 2. RMSR values for thermocouple calibration data

This statistic shows a clear decrease until
degree 6, at which point it saturates at 0.9 mV. [The RMSR-values would decrease once more for higher degrees when the polynomial model follows more closely the noise in the data (corresponding to over-fitting).] One way of selecting a degree is to identify the saturation level visually, the polynomial of smallest degree with RMSR value at this level being accepted.4 This approach works well for a sufficient quantity of data such as in this example, but the saturation level may not be obvious for the small data sets often arising in calibration work. We thus suggest that AIC or BIC could be used instead to provide a satisfactory degree for large or small calibration data sets. Applying these criteria to this example, degree 6 (marked in Table 2), as given by visual inspection, would be selected. Note that once a degree has been reached at which χ²_obs saturates, the other terms in these criteria cause their values to rise (as in Fig. 3), potentially more clearly indicating an acceptable degree than does the RMSR saturation level.

Table 2. Information criteria as a function of polynomial degree for a thermocouple relationship

    Degree    AIC          BIC
    3         7 585 937    7 585 946
    4         2 632        2 643
    5         806          819
    6*        56           71
    7         58           75
    8         60           79
    9         62           83
    10        63           87

    * Selected degree.
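The selection illustrated in Table 2 is easily automated. The sketch below, with function names of our own choosing, assumes the observed chi-squared values of the candidate fits have already been computed (for example with a weighted fit such as the one sketched in Sect. 4) and returns the criteria, the RMSR, and the degrees minimising AIC and BIC.

```python
import numpy as np

def information_criteria(chi2_values, m):
    """AIC, BIC and RMSR for polynomial models of degree n = 0, 1, 2, ...
    chi2_values[n] is the observed chi-squared of the degree-n fit; m is the
    number of data points."""
    rows = []
    for n, c2 in enumerate(chi2_values):
        aic = c2 + 2 * (n + 1)
        bic = c2 + (n + 1) * np.log(m)
        rmsr = np.sqrt(c2 / (m - n - 1))   # root-mean-square residual, unit weight
        rows.append((n, aic, bic, rmsr))
    best_aic = min(rows, key=lambda row: row[1])[0]
    best_bic = min(rows, key=lambda row: row[2])[0]
    return rows, best_aic, best_bic
```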
An advantage of the Chebyshev polynomials, due to their near-orthogonality properties,9 is that the Chebyshev coefficients of successive polynomials also tend to stabilize once the RMSR-value has saturated. For instance, the Chebyshev coefficients for degrees 3 to 7 are given in Table 3. Coefficients in other representations do not have this property.

Fig. 3. Information criteria AIC and BIC versus polynomial degree

Table 3. Chebyshev coefficients in polynomial models for a thermocouple relationship

    Coeff   Degree 3    Degree 4    Degree 5    Degree 6    Degree 7
    0       6 245.08    6 262.43    6 262.33    6 261.92    6 261.93
    1       4 530.90    4 575.62    4 575.17    4 574.94    4 574.94
    2       1 756.70    1 854.03    1 852.99    1 852.92    1 852.92
    3         309.11      407.25      405.85      405.48      405.49
    4                      39.31       38.05       37.21       37.23
    5                                  −0.55       −1.40       −1.37
    6                                              −0.37       −0.35
    7                                                           0.01
6. Conclusions Polynomials form a nested family of empirical functions often used to express a relation underlying calibration data. We advocate the Chebyshev representation of polynomials. We have considered the selection of an appropriate polynomial degree when these functions are used to represent calibration data. After having used polynomial regression software (NAG Library6 routine E02AD, say) to provide polynomials of several degrees for a data set, it is straightforward to carry out tests to select a particular polynomial model. Information criteria AIC and BIC are easily implemented
and appear to work satisfactorily, but more evidence needs to be gathered. Based on limited experience, these criteria select a polynomial degree that is identical or close to that chosen by visual inspection.

Acknowledgments

The National Measurement Office of the UK’s Department for Business, Innovation and Skills supported this work as part of its Materials and Modelling programme. Clare Matthews (NPL) reviewed a draft of this paper.

References
1. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML, International Vocabulary of Metrology – Basic and General Concepts and Associated Terms, Joint Committee for Guides in Metrology, JCGM 200:2012 (2012).
2. M. G. Cox, A. B. Forbes, P. M. Harris and I. M. Smith, The classification and solution of regression problems for calibration, Tech. Rep. CMSC 24/03, National Physical Laboratory (Teddington, UK, 2003).
3. M. G. Cox, A survey of numerical methods for data and function approximation, in The State of the Art in Numerical Analysis, ed. D. A. H. Jacobs (Academic Press, London, 1977).
4. C. W. Clenshaw and J. G. Hayes, J. Inst. Math. Appl. 1, 164 (1965).
5. R. M. Barker, M. G. Cox, A. B. Forbes and P. M. Harris, SSfM Best Practice Guide No. 4: Discrete modelling and experimental data analysis, tech. rep., National Physical Laboratory (Teddington, UK, 2007).
6. The NAG Library (2013), The Numerical Algorithms Group (NAG), Oxford, United Kingdom, www.nag.com.
7. L. Crovini, H. J. Jung, R. C. Kemp, S. K. Ling, B. W. Mangum and H. Sakurai, Metrologia 28, p. 317 (1991).
8. http://srdata.nist.gov/its90/download/allcoeff.tab
9. L. N. Trefethen, Approximation Theory and Approximation Practice (SIAM, Philadelphia, 2013).
10. L. Comtet, Bonferroni Inequalities – Advanced Combinatorics: The Art of Finite and Infinite Expansions (Reidel, Dordrecht, Netherlands, 1974).
11. Y. Benjamini and Y. Hochberg, Journal of the Royal Statistical Society, Series B (Methodological), 289 (1995).
12. K. P. Burnham and D. R. Anderson, Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd edn (Springer, New York, 2002).
13. J. J. Dziak, D. L. Coffman, S. T. Lanza and R. Li, Sensitivity and specificity of information criteria, tech. rep., Pennsylvania State University, PA, USA (2012).
14. X.-S. Yang and A. Forbes, Model and feature selection in metrology data approximation, in Approximation Algorithms for Complex Systems, eds. E. H. Georgoulis, A. Iske and J. Levesley, Springer Proceedings in Mathematics, Vol. 3 (Springer, Berlin Heidelberg, 2011) pp. 293–307.
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. –)
EMPIRICAL FUNCTIONS WITH PRE-ASSIGNED CORRELATION BEHAVIOUR

ALISTAIR B. FORBES
National Physical Laboratory, Teddington, Middlesex, UK
E-mail: [email protected]

Many model-fitting problems in metrology involve fitting a function f(x) to data points (x_i, y_i). The response of an ideal system may be known from physical theory, so that the shape of f(x) = f(x, a) is specified in terms of parameters a of a model. However, for many practical systems there may be other systematic effects, for which there is no accepted model, that modify the response of the actual system in a smooth and repeatable way. Gaussian process (GP) models can be used to account for these unknown systematic effects. GP models have the form y = f(x, a) + e, where f(x, a) is a function describing the response due to the known effects and e represents an effect drawn from a Gaussian distribution. These effects are correlated, so that if x is close to x′ then e is similar to e′. An alternative is to regard e(x, b) as described by an empirical function such as a polynomial, spline, radial basis function, etc., that also reflects a correlation structure imposed by assigning a Gaussian prior to b, the parameters of the empirical model. The advantage of this approach is that the empirical models can provide essentially the same flexible response as a GP model but with much less computational expense. In this paper, we describe how a suitable Gaussian prior for b can be determined and discuss applications that involve such empirical models with a correlation structure.

Keywords: Empirical function, Gaussian processes
1. Introduction

Many physical systems respond in ways that are only partially understood, and empirical models such as polynomials or splines are used to capture the observed behaviour. Here, the choice of the empirical model can be important to the success of the representation. For example, we may decide to use a polynomial to describe the behaviour, but we have to choose the order (degree + 1) of the polynomial. A model selection approach2,6,12 is to try all plausible models and then choose the best of them using a criterion, such as the Akaike Information Criterion1 or the Bayes Information Criterion,10 that balances the goodness of fit with compactness of representation. The compactness of representation is usually measured in terms of the number of data points and the number of degrees of freedom associated with the model, i.e., the number of free parameters to be fitted, e.g., the order of the polynomial model. This approach can be expected to work well if the underlying system can in fact be described accurately by one of the proposed models. In this case, the model selection amounts to identifying the model with the correct number of degrees of freedom. If the set of plausible models does not contain one that describes the system, then the model selection process could well lead to a model that poorly describes the underlying behaviour and is otherwise unsuitable for basing inferences about the underlying system.

Another approach is represented by smoothing splines.6,11 Here, the number of parameters associated with the model is chosen to match exactly the number of data points, but an additional smoothing term is introduced to penalise unwanted variation in the fitted function, with the degree of penalty determined by a smoothing parameter. For the smoothing spline, the penalty term is defined in terms of the curvature of the fitting function, which can then be re-expressed in terms of the fitted parameters. The larger the smoothing parameter, the smoother the fitted function and, in the limit, the fitted function coincides with a straight-line fit. The effect of increasing the smoothing parameter is to reduce the effective number of degrees of freedom associated with the model. Thus, the fitted function can be regarded as a linear function augmented by an empirical function whose number of degrees of freedom is determined by the smoothing parameter.

Kennedy and O’Hagan8 suggest that model inadequacies can be compensated for by augmenting the model with a Gaussian process (GP) model9 that assumes the underlying system shows a smooth response, so that nearby inputs will give rise to similar responses. Again, the degree of smoothness is determined by one or more smoothing parameters. Here again, implicitly, is the notion that the model is augmented by a model with potentially a large (or even infinite) number of degrees of freedom, but these degrees of freedom are effectively reduced by adding a prior penalty term. While calculations to determine a smoothing spline can be implemented efficiently by exploiting the banded nature of spline approximation,3 the Gaussian process approach involves variance matrices whose size depends on the number m of data points. This can be computationally expensive for large data sets, since the number of calculations required is O(m³). In this paper, we show how augmenting a model using a Gaussian process model that has a modest effective number of degrees of freedom can be
implemented using an empirical model also depending on a modest number of parameters, so that the computational requirement is greatly reduced. The remainder of this paper is organised as follows. In section 2 we overview linear regression for standard and Gauss-Markov models, the latter of interest to us since models derived from Gaussian processes lead to a Gauss-Markov regression problem. In section 3, we show how a Gaussian process model can be approximated by an empirical model with a pre-assigned correlation structure. Example applications are given in section 4 and our concluding remarks in section 5.

2. Linear regression

2.1. Ordinary least squares regression

Consider the standard model

    y_i = f(x_i, a) + ε_i,    f(x_i, a) = \sum_{j=1}^{n} a_j f_j(x_i),    ε_i ∈ N(0, σ_M²),    i = 1, . . . , m,

or, in matrix terms, y ∈ N(Ca, σ_M² I), where C_{ij} = f_j(x_i). Given data (x_i, y_i), i = 1, . . . , m, the linear least-squares estimate â of the parameters a is found by solving

    min_a (y − Ca)^T (y − Ca).

If C has QR factorisation5 C = Q_1 R_1, where Q_1 is an m × n orthogonal matrix and R_1 is an n × n upper-triangular matrix, then

    â = (C^T C)^{-1} C^T y = R_1^{-1} Q_1^T y,

and â can be calculated by solving the upper-triangular system of linear equations R_1 â = Q_1^T y. The model predictions ŷ are given by

    ŷ = C â = C (C^T C)^{-1} C^T y = Q_1 Q_1^T y = S y,    S = Q_1 Q_1^T.

The matrix S is a projection, so that S = S^T = S², and projects the data vector y on to the n-dimensional subspace spanned by the columns of C; the columns of Q_1 define an orthogonal axis system for this subspace. In Hastie et al.,6 the sum of the eigenvalues of the matrix S is taken to be the effective number of degrees of freedom associated with the model. In this case, S is a projection and has n eigenvalues equal to 1 and all others 0, so that the effective number of degrees of freedom is n, the number of parameters in the model. The symbol S is chosen to represent ‘smoother’, as it smooths the noisy data vector y to produce the smoother vector of model predictions ŷ = Sy.
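A direct transcription of these formulas into Python/numpy is sketched below; in practice a library least-squares routine would be used, and the example data and function name are invented for illustration.

```python
import numpy as np

def ols_fit(C, y, sigma_M):
    """Ordinary least squares via the thin QR factorisation C = Q1 R1."""
    Q1, R1 = np.linalg.qr(C, mode='reduced')
    a_hat = np.linalg.solve(R1, Q1.T @ y)        # solves R1 a = Q1^T y
    y_hat = Q1 @ (Q1.T @ y)                      # S y with S = Q1 Q1^T
    Va = sigma_M**2 * np.linalg.inv(R1.T @ R1)   # covariance of the fitted parameters
    dof = C.shape[1]                             # trace of S: effective degrees of freedom
    return a_hat, y_hat, Va, dof

# Example: fit a cubic (n = 4 parameters) to noisy data
x = np.linspace(0.0, 1.0, 25)
C = np.vander(x, 4, increasing=True)             # basis functions f_j(x) = x^(j-1)
y = 1.0 + 2.0 * x - 0.5 * x**3 + 0.01 * np.random.randn(x.size)
print(ols_fit(C, y, 0.01)[0])
```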
The variance matrices V_a and V_ŷ associated with the fitted parameters and model predictions are given by

    V_a = σ_M² (C^T C)^{-1} = σ_M² (R_1^T R_1)^{-1},    V_ŷ = σ_M² S S^T = σ_M² S

(recalling that S is a projection).

2.2. Gauss-Markov regression

Now suppose that the data y arise according to the model

    y ∈ N(Ca, V),    (1)

where the variance matrix V reflects correlation due, for example, to common systematic effects associated with the measurement system. The Gauss-Markov estimate of a is found by solving

    min_a (y − Ca)^T V^{-1} (y − Ca).    (2)

If V has a Cholesky-type factorisation5 of the form V = U U^T, where U is upper-triangular, then we can solve the ordinary linear least-squares regression problem

    min_a (ỹ − C̃a)^T (ỹ − C̃a),

involving the transformed observation matrix C̃ and data vector ỹ, where U C̃ = C and U ỹ = y. If C̃ = Q̃_1 R̃_1, then the transformed model predictions are given in terms of the transformed data vector ỹ by S̃ ỹ, where S̃ = Q̃_1 Q̃_1^T. The matrix S̃ is a projection matrix, as discussed above, and has n eigenvalues equal to 1 and all others zero. The unweighted model predictions ŷ are given in terms of the original data vector y by ŷ = S y, where S = U S̃ U^{-1}. The matrix S is not in general a projection, but it is conjugate to a projection matrix and therefore has the same eigenvalues: if S̃v = λv, then S(Uv) = U S̃v = λ(Uv). Thus, the effective number of degrees of freedom associated with the model is n, the number of parameters in the model, as for the case of ordinary least squares regression.

2.3. Explicit parameters for the correlated effects

Suppose now that V in (1) can be decomposed as V = V_0 + σ_M² I, where V_0 is a positive semi-definite symmetric matrix. Here, we are thinking that y = Ca + e + ε, where e ∈ N(0, V_0) represents correlated effects and ε ∈ N(0, σ_M² I) random effects such as measurement noise. If V_0 is factored as V_0 = U_0 U_0^T, we can write this model equivalently as

    y = U_0 e + Ca + ε,    e ∈ N(0, I),    ε ∈ N(0, σ_M² I).
Estimates ê and â of e and a, respectively, are found by solving the augmented least-squares system

    Č ǎ ≈ y̌,    Č = [ U_0/σ_M  C/σ_M ; I  0 ],    ǎ = [ e ; a ],    y̌ = [ y/σ_M ; 0 ].    (3)

The solution â is the same as that determined by solving (2) for V = V_0 + σ_M² I.

If Č has QR factorisation Č = Q̌_1 Ř_1, then the projection Š = Q̌_1 Q̌_1^T calculates the 2m-vector of weighted model predictions Š y̌ from the augmented data vector y̌. It has m + n eigenvalues equal to 1, the same as the number of parameters in ǎ, and all others are zero. The unweighted model predictions ŷ are given by

    ŷ = S y,    S = (1/σ_M²) [U_0  C] (Č^T Č)^{-1} [U_0  C]^T.

If the 2m × 2m matrix Š is partitioned into m × m matrices as

    Š = [ Š_11  Š_12 ; Š_12^T  Š_22 ],

then S = Š_11. As a submatrix of a projection, S has eigenvalues λ_j with 0 ≤ λ_j ≤ 1. Hastie et al.6 use the term shrinkage operator for this kind of matrix. In fact, n of the eigenvalues of S are equal to 1, corresponding to the free parameters a in the model: if y = Ca for some a, then Sy = y. The number of effective degrees of freedom associated with the model is given by the sum of the eigenvalues of S and can be thought of as that fraction of the total number of degrees of freedom, m + n, used to predict y. The sum of the eigenvalues of Š_22 must be at least n. In fact, Š_22 also has n eigenvalues equal to 1: if U_0 e = Cδ for some δ, then Š_22 e = e. The effective number of degrees of freedom of the model can range between n and m (≥ n). If the prior information about the correlated effects U_0 e is strong, then the effective number of degrees of freedom will be closer to n; if it is weak, it will be closer to m.

Note that while the solution â in (3) is the same as that in (2), the vector of model predictions associated with (3) is ŷ = U_0 ê + C â, as opposed to ŷ = C â for model (2). The extra degrees of freedom provided by the parameters e allow y to be approximated better by ŷ.
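The augmented system (3) and the associated shrinkage operator can be assembled directly. The sketch below (our own function name, dense algebra, no attempt at efficiency) returns the estimates together with the effective number of degrees of freedom, computed as the trace of S.

```python
import numpy as np

def augmented_fit(C, U0, y, sigma_M):
    """Solve the augmented system (3): estimates e_hat, a_hat, predictions y_hat,
    and the effective number of degrees of freedom (trace of S)."""
    m, n = C.shape
    C_check = np.block([[U0 / sigma_M, C / sigma_M],
                        [np.eye(m),    np.zeros((m, n))]])
    y_check = np.concatenate([y / sigma_M, np.zeros(m)])
    sol, *_ = np.linalg.lstsq(C_check, y_check, rcond=None)
    e_hat, a_hat = sol[:m], sol[m:]
    G = np.hstack([U0, C])
    S = G @ np.linalg.solve(C_check.T @ C_check, G.T) / sigma_M**2
    y_hat = U0 @ e_hat + C @ a_hat
    return a_hat, e_hat, y_hat, np.trace(S)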
3. Spatially correlated empirical models

Gaussian processes (GP) are typically used to model correlated effects where the strength of the correlations depends on spatial and/or temporal separations. Consider the model

    y_i = f(x_i, a) + e_i + ε_i,    ε_i ∈ N(0, σ_M²),

where e_i represents a spatially correlated random effect. For example, the correlation could be specified by

    V_0(i, q) = cov(e_i, e_q) = k(x_i, x_q) = σ² exp( −(x_i − x_q)²/(2λ²) ).    (4)

The parameter σ governs the likely size of the effect while the parameter λ determines the spatial length scale over which the correlation applies. For linear models, estimates of a are determined from the data using the approaches described in sections 2.2 or 2.3.

While GP models can be very successful in representing spatially correlated effects, the computational effort required to implement them is generally O(m³), where m is the number of data points. If the length scale parameter λ is small relative to the span of the data x, then the matrix V_0 can be approximated well by a banded matrix and the computations can be made much more efficient. For longer length scales, V_0 will be full but will represent a greater degree of prior information, so that the effective number of degrees of freedom associated with the model will be much less than m + n. This suggests augmenting the model using an empirical model involving a modest number of degrees of freedom but retaining the desired spatial correlation.4 Thus, we regard e as described by an empirical function e(x, b) = \sum_{k=1}^{p} b_k e_k(x), expressed as a linear combination of basis functions. We impose a correlation structure by assigning a Gaussian prior to b of the form b ∼ N(0, V_b). The issue is how to choose V_b to impose the correct spatial correlation, in order that cov(e(x, b), e(x′, b)) ≈ k(x, x′). Suppose z = (z_1, . . . , z_m)^T is a dense sampling over the working range for x, let V_0 be the m × m variance matrix with V_0(i, q) = k(z_i, z_q), and let E be the associated observation matrix with E_{ij} = e_j(z_i). If E has QR factorisation E = Q_1 R_1 and e ∈ N(0, V_0), then b = R_1^{-1} Q_1^T e defines the empirical function e(x, b) that fits closest to e in the least squares sense. If e ∼ N(0, V_0), then

    b = R_1^{-1} Q_1^T e ∼ N(0, V_b),    V_b = R_1^{-1} Q_1^T V_0 Q_1 R_1^{-T}.
ˆ = Eb, We use Vb so defined as the prior variance matrix for b. Setting e
23
then for b ∼ N(0, Vb ), we have Veˆ = EVb E T = P1 V0 P1T ,
ˆ ∼ N(0, Veˆ ), e
P1 = Q1 Q T 1.
The matrix P1 is a projection and P1 V0 P1T represents the variance matrix of the form EVb E T that is closest to V0 in some sense. The quality of the approximation can be quantified in terms of the trace Tr(V0 − P1 V0 P1T ),
(5)
for example, where Tr(A) is the sum of the diagonal elements of A. If the dense sampling of points used to generate the variance matrix V0 is regularly spaced, then V0 is a Toeplitz matrix.5 Matrix-vector multiplications using a Toeplitz matix of order p can executed in O(p log p) using the fast Fourier transform and the matrix itself can be represented by p elements. ˆ are determined ˆ and b If Vb is factored as Vb = Ub UbT then estimates a by finding the least squares solution of (3) where now b y/σM EUb /σM C/σM ˇ ˇ= ˇ= , a , y . (6) C= I a 0 The matrix Cˇ above is an (m + p) × (p + n) matrix whereas in (3) it is an 2m × (m + n) matrix (and m is likely to be much larger than p). The ˆ are given by unweighted model predictions y ˆ = Sy, y
S=
1 ˇ T ˇ −1 [EU C]T , 2 [EUb C](C C) b σM
(7)
where Cˇ is given by (6). The shrinkage operator S above has n eigenvalues equal to 1, a further p eigenvalues between 0 and 1 and all other eigenvalues equal to 0. 4. Example applications 4.1. Instrument drift In the calibration of a 2-dimensional optical standard using a coordinate measuring machine (CMM), it was noticed that the origin of the measurement system drifted by a few micrometres over time, possibly due to thermal effects arising from the CMM’s internal heat sources due to its motion. The drift in x and y is modelled as a quadratic polynomial (n = 3) augmented by an order p = 10 polynomial with a preassigned correlation derived from kernel k in (4) with λ = 0.25. The time units are scaled so that 0 represents that start time of the measurements and 1 the end time
24
(in reality, a number of hours later). Figure 1 shows the measured data and the fitted functions for the x- and y-coordinate drift. The shrinkage operator S defined by (7) has n + p nonzero eigenvalues with n of them equal 1, the rest between 1 and 0. For this model, there are in fact only p nonzero eigenvalues because the order p augmenting polynomial e(x, b) has n basis functions in common with the polynomial representing the drift. Table 1 shows those eigenvalues of the shrinkage operator S that are not necessarily 1 or 0 for different values of the length scale parameter λ. Increasing λ decreases the effective number of degrees of freedom from a maximum of p = 10 to a minimum of n = 3.
x coordinate drift/mm
−4
2
p = 10, lambda = 0.25
x 10
0 fitted polynomial data
−2 −4
0
0.2
0.4 0.6 time/arbitrary units
0.8
1
y coordinate drift/mm
−4
6
x 10
4 2 fitted polynomial data
0 −2
0
0.2
0.4 0.6 time/arbitrary units
0.8
1
Fig. 1. The fits of quadratic drift functions augmented with the spatially correlated polynomials of order n = 10 corresponding to λ = 0.25 to data measuring instrument drift in x- and y-coordinates, top and bottom, respectively.
4.2. Trends in oxygen data Since the early 1990s, the Scripps Institute7 has monitored the change in the ratio of O2 to N2 , relative to a reference established in the 1980s, at 9 remote locations around the globe. Figure 2 shows the record at two sites, Alert in Canada, latitude 82 degrees North, and Cape Grim, Australia, latitude 41 degrees South. All the records i) have missing data, ii) show a yearly cyclical variation, and iii) show an approximately linear decrease. The units
25 Table 1. Modelling drift: non-unit and non-zero eigenvalues (rows 2 to 8) associated with the shrinkage operator S for different values of λ (row 1), for p = 10 and n = 3. The number of effective degrees of freedom are given in row 9. 0.10
0.15
0.20
0.25
0.30
0.35
0.40
0.99 0.99 0.99 0.98 0.97 0.92 0.81
0.99 0.99 0.98 0.95 0.83 0.59 0.25
0.99 0.98 0.93 0.80 0.40 0.13 0.02
0.99 0.97 0.79 0.45 0.08 0.02 0.00
0.98 0.93 0.53 0.15 0.02 0.00 0.00
0.97 0.86 0.26 0.04 0.00 0.00 0.00
0.95 0.74 0.11 0.01 0.00 0.00 0.00
9.65
8.59
7.24
6.29
5.61
5.14
4.82
associated with the vertical axis in figure 2 are such that a decrease of 100 units represents a 0.01 % decrease in the ratio of oxygen to nitrogen.
δ (O2/N02)/10−6
δ (O2/N02)/10−6
Alert, Canada 0 −200 −400 −600 1990
1995
2000 2005 time/year Cape Grim, Australia
2010
2015
1995
2000
2010
2015
0 −200 −400 −600 1990
2005 time/year
Fig. 2.
Oxygen data gathered by the Scripps Institute for two sites.
The data is analysed using a model of the form y = f1 (t, a1 ) + f2 (t, a2 ) + e1 (t, b) + e + ǫ, where f1 (t, a1 ) represents a linear trend, f2 (t, a2 ) a Fourier series to model cyclical variation, e1 (t, b) a temporally correlated polynomial to model long term trend with a time constant λ2 equal to approximately 5 years, e temporally correlated effect to model short term seasonal variations with a
26
time constant λ2 equal to approximately 1 month, and ǫ represents random noise associated with short term variations and measurement effects. It would be possible to use a temporally correlated polynomial e2 (t, b2 ) to represent the shorter term variations. However, in order to deliver the appropriate effective number of degrees of freedom (of the order of 60) or, in other terms, approximate the variance matrix V0 well, at least an order p = 100 polynomial would be required. This does not pose any real problem (if orthogonal polynomials are used) but it is computationally more efficient to exploit the fact that the variance matrix V0 is effectively banded with a bandwidth of about 25 for the type of data considered here. If the extent of the spatial/temporal correlation length is small relative to the span of the data, then the variance matrix can be approximated well by a banded matrix (and there is large number of degrees of freedom in the model) while if the spatial/temporal correlation length is comparable with data span, the variance matrix can be approximated well using a correlated empirical model (and there are a modest number of degrees of freedom). In either case, the computations can be made efficiently. Figures 3 and 4 show the results of calculations on the data in figure 2. The units in these graphs are the same as that for figure 2. Figure 3 shows the fitted model for the time period between 2000 and 2004. The Fourier model included terms of up to order 4, so that 8 Fourier components are present. Note that the northern hemisphere fit (top) is out of phase with the southern hemisphere fit (bottom) by approximately half a year. The figure also shows the uncertainty band representing ± 2 u(ˆ yi ), where u(ˆ yi ) is the standard uncertainty of the model prediction at the ith data point. Figure 3 relates to a linear trend function f1 (t, a1 ). We can also perform the same fitting but with a quadratic trend function. Figure 4 shows the differences in the combined trend functions (polynomial plus augmented polynomial) for the two datasets in figure 2 along with the estimate uncertainty associated with the fitted trend functions. It can be seen that both sets of trend functions agree with each other well, relative to the associated uncertainties. Thus, the effect of a choice of linear or quadratic trend function is minimised by the use of an augmented model that can compensate for the mismatch between the model and the data. The invariance with respect to such model choices is one of the benefits in employing an augmented model. 5. Concluding remarks In this paper we have been concerned with fitting data that is subject to systematic effects that are only partially understood. We use a model that
27
50
0
−50 2000
2000.5
2001
2001.5
2002
2002.5
2003
2003.5
2004
2000.5
2001
2001.5
2002
2002.5
2003
2003.5
2004
50
0
−50 2000
Fig. 3.
Fitted model to oxygen data in figure 2 shown for the period 2000 to 2004.
10 5 0 −5
1990
1995
2000
2005
2010
2015
1990
1995
2000
2005
2010
2015
1990
1995
2000
2005
2010
2015
10 5 0 −5 6 4 2
Fig. 4. Differences in the combined trend functions f1 (t, a1 ) + e1 (t, b) determined for the datasets in figure 2 for the cases of linear and quadratic f1 (t, a1 ). The bottom graph shows the estimated uncertainties associated with the trend functions (these uncertainties are virtually the same for the linear and quadratic functions). Uncertainties are larger at the ends of the data record due to the fact that the temporally correlated models have only future or past data to determine model estimates.
reflects what we believe about the system response but augmented by a model to account for our incomplete knowledge. Gaussian processes (GP)
28
models can be used to provide these augmentations but can be computationally expensive for large datasets. A more computationally efficient approach can be found by replacing the Gaussian process model with an empirical model that provides almost the same functionality as the GP model. The correlation structure in the GP model is translated to a correlation structure applying to the parameters associated with the empirical model and acts as a regularisation term. Acknowledgements This work was funded by the UK’s National Measurement Office Innovation, Research and Development programme. I thank my colleague Dr Dale Partridge for his comments of an earlier draft of this paper. The support of the AMCTM programme committee is gratefully acknowledged. References 1. H. Akaike. A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19:716–723, 1974. 2. H. Chipman, E. I. George, and R. E. McCulloch. The practical implementation of Bayesian model selection. Institute of Mathematical Statistics, Beachwood, Ohio, 2001. 3. M. G. Cox. The least squares solution of overdetermined linear equations having band or augmented band structure. IMA J. Numer. Anal., 1:3 – 22, 1981. 4. A. B. Forbes and H. D. Minh. Design of linear calibration experiments. Measurement, 46(9):3730–3736, 2013. 5. G. H. Golub and C. F. Van Loan. Matrix Computations. John Hopkins University Press, Baltimore, third edition, 1996. 6. T. Hastie, R. Tibshirani, and J. Friedman. Elements of Statistical Learning. Springer, New York, 2nd edition, 2011. 7. R. Keeling. http://scrippsO2.ucsd.edu/ accessed 12 November, 2014. 8. M. C. Kennedy and A. O’Hagan. Bayesian calibration of computer models. J. Roy. Sat. Soc. B, 64(3):425–464, 2001. 9. C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, Cambridge, Mass., 2006. 10. G. Schwarz. Estimating the dimension of a model. Annals of Statistics, 6:461– 464, 1978. 11. G. Wahba. Spline models for observational data. SIAM, Philadelphia, 1990. 12. X.-S. Yang and A. B. Forbes. Model and feature selection in metrology data approximation. In E. H. Georgoulis, A. Iske, and J. Levesley, editors, Approximation Algorithms for Complex Systems, pages 293–307, Heidelberg, 2010. Springer-Verlag.
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. 29–37)
MODELS AND METHODS OF DYNAMIC MEASUREMENTS: RESULTS PRESENTED BY ST. PETERSBURG METROLOGISTS* V.A. GRANOVSKII Concern CSRI Elektropribor, JSC 30, Malaya Posadskaya Str., 197046, Saint Petersburg, Russia E-mail:[email protected] The paper reviews the results of St. Petersburg metrologists work on the development of dynamic measurement and instrument models, as well as algorithms for measurement data processing. The results were obtained in the 60-ies – 80-ies of the past century within the framework of three generalized formal problems, two of which are related to ill-posed inverse problems. The developed methods for dynamic measurement instrument identification are presented. The general characteristic is given for the problem of an input signal recovery of the dynamic measurement instrument. The topicality of the obtained results is pointed out. Keywords: Dynamic measurements; Instrument; Dynamic characteristic, Inverse problem, Correctness, Regularity.
1. Introduction The memoir [1] shall be considered as the first work on the theory of dynamic measurements in St. Petersburg. The regular development of the problem started in the 1960-ies in two research centers: the Mendeleev Institute of Metrology (VNIIM) and the Research Institute of Electrical Instruments (VNIIEP). This work has passed two stages. Publication of the books [2, 3], respectively, can be considered the origin of each stage. The paper is aimed at reviewing the work of St. Petersburg metrologists, the results of which seem to be actual nowadays. 1.1. Dynamic measurements [3-5] The idea of dynamic measurements is usually associated with the presence of a substantial component of a measurement error, caused by the discrepancy *
This work was supported by the Russian Foundation for Basic Research (project 13-08-00688) 29
30
between inertial (dynamic) properties of an instrument and the rate of measured process change (frequency content). Such an interpretation determines the range of problems for solving formal tasks of modeling and algorithm elaboration. Concrete definition of problems first of all requires the analysis of relation between measurement and the inverse problem. 1.1.1. Measurement and inverse mathematical physics problem Being an instrument of knowledge, the measurement is aimed at recovery of the phenomenon under investigation by the measured response of an object to the controllable effect. The measurement itself acts as recovery of a measured attribute by the result of its influence on an instrument, in the context of the object model. Processing of measurement data is the recovery of actual effect on the instrument, disturbed by a chain of physical measurement transformations. 1.1.2. Formal model of dynamic measurements A model of direct dynamic measurements [3] is considered in order to find basic features of dynamic measurements (Fig. 1).
Figure 1. Block diagram of dynamic measurement error origin
A true signal xt(t) contains the information about the property of the object under investigation OI. A measurand in general is defined as a result of functional transformation of signal xt(t): Q=
n{Bn[xt(t)]},
(1)
31
where Bn – a required transformation operator; n – a required functional. Due to the measuring instrument MI influence on the object, the real output signal of the target of research xout(t) differs from xt(t). The device input is also affected by disturbance (t). Hence, the real input signal xin(t) differs from xout(t). The instrument transforms xin(t) into the output signal y(t). The real transformation operator Br.t, expressing the properties of the type of instruments, differs from Bn because of imperfect principle of operation and device design. The properties of a certain instrument, expressed by operator Br.c, differ from the typical ones. Besides, the instrument is affected by influence quantities v1, …, vl. The combined action of these quantities may be expressed by an influence signal (t), disturbing the operator Br.c. As a result, the operator Br, realized during measurements, differs from Bn. Calculation of the functional (y) is included either in the instrument operation algorithm, or in the algorithm of measurement data processing MP, particularly, processing the output signal values, read out from the device scale. In the latter case the readout error and the calculation error should be taken into account separately. The following parameter should be taken as a measurement result: =
r[
(t)] =
r{Br[xr(t)]
+ (t)}.
(2)
An error =
–Q=
r{Br
[xr(t)] + (t)} –
n{Bn[xt(t)]}.
(3)
On the assumption of linearity of operators and functionals: =
n[
Br.c(x )] +
n[
Br.t(xin)] +
n[
Bn(xin)] + [Br(xin)],
where Br.c = Br-Br.c; Br.t = Br.c-Br.t; Br.n = Br.t-Bn;
n[
=
(t)] + r-
n;
n[Bn(
x)] + (4)
x = xt+ .
Linear operator B takes, in the time and frequency (complex) domain, the different forms, to which the following total dynamic characteristics of the instrument correspond: (a) set of differential equation structure and coefficients; (b) impulse response g(t); (c) transient performance h(t); (d) complex frequency characteristic W(j ) and its two components – amplitude- and phase-frequency characteristics; (e) transfer function W(p). 1.1.3. Three typical problems of dynamic measurements Metrological support of dynamic measurements is represented by a set of one direct and two inverse problems. The direct problem is to determine the response y of an instrument with the known dynamic properties (operator B) to the given effect x:
32
x, B → y = Bx.
(5)
The first inverse problem is to determine the dynamic properties of the instrument by the known test influence x and the instrument response to it y:
x, y → B = y * ( x ) −1 .
(6)
Expression (6) is certainly symbolical. The second inverse problem consists in the recovery of the input effect by the known dynamic properties of the instrument and its response to the desired effect: (7) B, y → x = ( B )−1 y. Generally, inverse problems are related to ill-posed ones from Hadamard’s viewpoint, and regularization methods should be used to solve them [6]. 2. Results of solving typical problems of dynamic measurements 2.1. Direct problem of dynamic measurement [7-10] 2.1.1. Metrological statement of direct problem The direct problem of dynamic measurements is concerned with the dynamic measurement error estimation, or dynamic error of measurement. From the expression (4) it follows that the basic contribution to the dynamic error is induced by the difference between function g(t) and -function, or, what is the same, by the difference between h(t) and the unit step. In [10] the matrix of typical direct problem statements is analyzed. The matrix is defined by varieties of dynamic characteristics of the instrument and input signals. It contains above hundred specified tasks, from which only one third have a solution; note that most of unsolved problems are those concerned with particular dynamic characteristics of the instrument. 2.1.2. Direct problem solution The work [9] presents the results of transformation error analysis for a variable signal modeled by the stationary random process. The expression was obtained for autocorrelation function of an error, as well as for error variance in the steady-state mode, when the transformation operator is exactly known (in one form or another). As for the process, it is assumed that we know either the autocorrelation function, or the spectrum density of the process, or the generating stochastic differential equation. Besides, the influence of the real operator divergence from the nominal one on error estimation is studied.
33
The authors in paper [8] suggest estimating a transformation error by using an inverse operator in the complex domain, the expression for which is derived from the direct operator represented by the Taylor expansion. The estimates obtained are reliable only when the input signal can be approximated by the loworder polynomial. 2.2. The first inverse problem of dynamic measurements [3, 11-29] 2.2.1. Metrological statement of the problem of dynamic measurement instrument identification The first inverse problem of the dynamic measurement, or a problem of determining the total dynamic characteristics, includes, in addition to the incorrectness problem solution, also accounts for limited accuracy of forming the required test effects on the instrument. 2.2.2. Identification problem solution for an instrument dynamic properties Determining the total dynamic characteristics of the instrument is defined by the peculiarities of characteristics and the test signals. When characteristical signals are used, which fit the determination of the corresponding dynamic characteristics, we should only compute an error of the characteristic estimate, caused by non-ideal test signal. The expressions for errors are derived as applied to linearly or exponentially increasing test signals, and transient responses of the instrument having linear models of the 1st and 2nd orders. The similar results were obtained for pulse characteristics and frequency responses on the assumption that the real test signal is accordingly a rectangular pulse, and a sum of the first and second harmonics [3]. In the general case of determining characteristic g(t) or h(t) from the convolution equation, by the known input x(t) and output y(t) signals, we have to regularize it. Because of that the problem is ill-posed one and a priori information about the desired dynamic characteristic is very important, adaptive identification methods have become widely used. The work [13] describes the method of adaptive selection of regularization parameter by the statistical criterion based on the given fractile of the 2 distribution. The method is realized by digitalization of the convolution equation and transfer to the matrix equation, which is regularized and converted to the following form:
(A
T
−1
)
A + λ I g = AT
−1
y.
(8)
34
The criterion of selecting an optimal regularization parameter form:
( y − Agλ )T
−1
has the
( y − Ag λ ) ≤ χl2−1 (α ).
(9)
In contrast to [13], the identification method [12] uses the iteration procedure of selecting a vector of the instrument model parameters by the squared criterion of discrepancy between the responses from the real instrument and its model. The iteration method of selecting a dynamic model of the instrument (in the form of transfer function) by the squared criterion of discrepancy, presented in [3], is based on model generation by integration of typical dynamic elements of the 1st nd 2nd orders. Being a dynamic model of the instrument, the differential equation is mostly used for theoretical construction. But coefficients of linear differential equation can be determined by successive integration, if a steady-state signal (or pulse signal) is used as a test input signal [3]. Here the major problem is correct determination of the equation order or the iteration procedure stopping criterion. It may be based on convergence of discrepancy between the responses from the instrument and its model, when its order increases. The methods considered above were related to the instrument with the linear dynamic model, because in this case general solutions are possible. As for nonlinear instruments, their identification requires much a priori information. Usually this means a limited number of model classes, first of all, Hammerstein and Wiener integral equations. The instrument identification methods with such models are considered in [16]. They mean separate determination of characteristics of linear and nonlinear elements of the model. The nonlinear element is identified in static mode, after that the problem of linear element identification is solved. The solution process quickly “branches” into variants based on versions of a particular instrument model. The use of pseudorandom test sequence of pulses makes it possible to ease the restrictions imposed on the model being identified. 2.3. The second inverse problem of dynamic measurements 2.3.1. Metrological statement of the problem of recovering the input signal of dynamic measurement instrument The recovery of the instrument input signal by the known output signal and total dynamic characteristic of the instrument, in terms of metrology, means correction of a dynamic error of input signal transformation or correction of nonideal dynamic characteristic. Put this another way, the signal processing is
35
required to go from real pulse and transient characteristics, and frequency response to ideal ones: g uò = δ (t ); huò = 1(t ); Auò (ω ) = 1; Φuò (ω ) = 0. The problem regularization is very difficult, owing to the nature of the available a priori information. In contrast to identification, during the recovery we rarely come up against the a priori known input signal type. For the same reason it is more difficult to implement the iteration methods. 2.3.2. Solution of the input signal recovery problem for the dynamic measurement instrument The works [3, 30] are devoted to this problem solution. The authors try to analyze the problem peculiarities and to outline the ways of solving the most difficult inverse problem of dynamic measurements. Because of impossibility to regularize the problem without relevant a priori information, we obtain the unlimited number of particular solutions, each having the restricted domain of applicability, instead of general solutions. 2.4. Overall evaluation of results During the past years the European program on dynamic measurement metrology has been implemented [31, 32]. The published results show that our colleagues from six countries of European Community are at the initial stage of the way which was passed by St. Petersburg metrologists in the 70-ies-90-ies of the past century. So, results obtained in the past remain valid and actual. References 1. 2. 3. 4.
5. 6.
. N. rylov, Some remarks on ‘kreshers’ and indicators, Proceedings of the St. Petersburg Academy of Sciences, 1909. . N. Gordov, Temperature Measurement of Gas Streams ( ashgiz, Moskow–Leningrad, 1962). V. . Granovskii, Dynamic measurements: Basics of metrological assurance (Leningrad, Energoatomizdat, 1984). V. . rutyunov, V. . Granovskii, V. S. Pellinets, S. G. Rabinovich, D. F. rtakovskiy, . P. Shirokov, Basic concepts of dynamic measurement theory, Dynamic measurements. Abstracts of 1th All-Union Symposium (VNIIM, 1974). G. I. valerov, S. . ndel’shtam, Introduction in Information Theory of Measurement, (Moscow, Energiya, 1974). . N. ikhonov, V. Ya. rsenin, Solution methods for ill-posed problems (Nauka, Moscow, 1974).
36
7.
8. 9.
10. 11.
12. 13. 14. 15. 16.
17.
18.
19.
20.
21.
Ya. G. Neuymin, I. A. Popova, B. L. Ryvkin, B. A. Schkol’nik, Estimate of measurement dynamic error, Metrology (Measurement Techniques), 1 (1973). . D. Vaysband, Approximate method for dynamic errors calculation of linear conversion, Measurement Techniques, 12 (1975). N. I. Seregina, G. N. Solopchenko, V. M. Chrumalo, Error determination of direct measurement of variable quantities, Measurements, control, automation, 4 (1978). V. . Granovskii, . . udryavtsev, Error estimation of direct dynamic measurements, Metrology (Measurement Techniques), 1 (1981). B. . Shkol’nik, Estimation error of dynamic characteristic by system response on periodic signal, in Transactions of the metrology institutes of the USSR, 137 (197) (1972). M. D. Brener, V. M. Chrumalo, On dynamic parameter estimates of typical inertial measuring instruments, ransactions, VNIIEP, 16 (1973). N. Ch. Ibragimova, G. N. Solopchenko, Estimate of weight and transfer functions by statistical regularization, Transactions, VNIIEP, 16 (1973). N. . Zeludeva, Some numerical transformations of measuring instrument transfer functions, Transactions, VNIIEP, 16 (1973). V. . Zelenyuk, D. F. artakovskiy, Dynamic characteristic of film measuring transducers, Measurement Techniques, 6 (1973). . D. Brener, G. N. Solopchenko, On dynamic characteristic estimates in the process of non-linear measuring instruments calibration, Transactions, VNIIEP, 18 (1973). V. . Granovskii, n dynamic properties normalization of measuring transducers, Problems of theory and designing of information transducers: Records of seminar, Cybernetics Institute of the Ukraine Academy of sciences ( iev, 1973). B. . Schkol’nik, Development of methods and apparatus for dynamic characteristics determination of measuring transducers of continuous time electrical signals, Abstract of a thesis on academic degree (VNIIM, 1974). V. . rutyunov, V. . Granovskii, S. G. Rabinovich, Standardization and determination of measuring instrument dynamic properties, Dynamic measurements. Abstracts of 1th All-Union Symposium (VNIIM, 1974). V. . Granovskii, Yu. S. Etinger, Determination procedure of measuring instrument dynamic properties, trology (Measurement Techniques), 10 (1974). G. N. Solopchenko, Dynamic error of measuring instrument identification, trology (Measurement Techniques), 1 (1975)
37
22. V. . Arutyunov, V. . Granovskii, Standardization and determination of dynamic properties of measuring instruments, Measurement Techniques, 12 (1975). 23. V. . Granovskii, Determination procedure of error determination of full dynamic characteristics of measuring instruments, Measurement Techniques, 7 (1977). 24. V. . Granovskii, Yu. S. Etinger, Transfer function determination of measuring instruments with distributed parameters, Dynamic measurements. Abstracts of 2th All-Union Symposium (VNIIM, 1978). 25. S. M. Mandel’shtam, G. N. Solopchenko, Dynamic properties of measuring and computing aggregates, Measurement Techniques, 22, 4 (1979). 26. G. N. Solopchenko, I. B. Chelpanov, Method for determining and normalizing the measuring-equipment dynamic characteristics, Measurement Techniques, 22, 4 (1979). 27. G. N. Solopchenko, Minimal fractionally rational approximation of the complex frequency characteristic of a means of measurement, Measurement Techniques, 25, 4 (1982). 28. V. . Granovskii, . B. inz, Yu. S. Etinger, Dynamic characteristics determination of sulphide-cadmium photoresistors, Metrology (Measurement Techniques), 10 (1982). 29. V. Ya. Kreinovich, G. N. Solopchenko, Canonical-parameter estimation for instrument complex frequency response, Measurement Techniques, 36, 9 (1993). 30. R. . Poluektov, G. N. Solopchenko, Correction methods of dynamic errors, Avtometriya, 5 (1971). 31. T. Esward, C. Elster, J. P. Hessling, Analysis of Dynamic Measurements: New Challenges Require New Solutions, in Proc. XIII International Congress IMEKO (Lisbon, 2009). 32. C. Bartoli, M. F. Beug, T. Bruns, C. Elster, L. Klaus, M. Kobusch, C. Schlegel, T. Esward, A. Knott, S. Saxholm, Traceable Dynamic Measureement of Mechanical Quantities: Objectives and First Results of this European Project, in Proc. XX International Congress IMEKO (Busan, 2012).
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. –)
INTERVAL COMPUTATIONS AND INTERVAL-RELATED STATISTICAL TECHNIQUES: ESTIMATING UNCERTAINTY OF THE RESULTS OF DATA PROCESSING AND INDIRECT MEASUREMENTS V. KREINOVICH Computer Science Department, University of Texas at El Paso, El Paso, Texas 79968, USA E-mail: [email protected] http://www.cs.utep.edu/vladik In many practical situations, we only know the upper bound ∆ on the measurement error: |∆x| ≤ ∆. In other words, we only know that the measurement error is located on the interval [−∆, ∆]. The traditional approach is to assume that ∆x is uniformly distributed on [−∆, ∆]. In some situations, however, this approach underestimates the error of indirect measurements. It is therefore desirable to directly process this interval uncertainty. Such “interval computations” methods have been developed since the 1950s. In this paper, we provide a brief overview of related algorithms and results. Keywords: interval uncertainty, interval computations, interval-related statistical techniques
1. Need for Interval Computations Data processing and indirect measurements. We are often interested in a physical quantity y that is difficult (or impossible) to measure directly: distance to a star, amount of oil in a well. A natural idea is to measure y indirectly: we find easier-to-measure quantities x1 , . . . , xn related to y by a known relation y = f (x1 , . . . , xn ), and then use the results x ei of measuring xi to estimate ye: x e1 ✲ x e2 ✲
f
···
x en ✲
38
ye = f (e x1 , . . . , x en ) ✲
39
This is known as data processing. Estimating uncertainty of the results of indirect measurements: a problem. Measurements are never 100% accurate. The actual value xi of i-th measured quantity can differ from the measurement result x ei ; in other def words, there are measurement errors ∆xi = x ei − xi . Because of that, the result ye = f (e x1 , . . . , x en ) of data processing is, in general, different from the actual value y: ye = f (e x1 , . . . , x en ) 6= f (x1 , . . . , xn ) = y. It is desirable to def
describe the error ∆y = ye − y of the result of data processing. For this, we must have information about the errors of direct measurements.
Uncertainty of direct measurements: need for overall error bounds (i.e., interval uncertainty). Manufacturers of a measuring instrument (MI) usually provide an upper bound ∆i for the measurement error: |∆xi | ≤ ∆i . (If no such bound is provided, then x ei is not a measurement, it is a wild guess.) Once we get the measured value x ei , we can thus guarantee that the def actual (unknown) value of xi is in the interval xi = [e x i − ∆i , x ei + ∆i ]. For example, if x ei = 1.0 and ∆i = 0.1, then xi ∈ [0.9, 1.1]. In many practical situations, we also know the probabilities of different values ∆xi within this interval. It is usually assumed that ∆xi is normally distributed with 0 mean and known standard deviation. In practice, we can determine the desired probabilities by calibration, i.e., by comparing the results x ei of our MI with the results x ei st of measuring the same quantity by a standard (much more accurate) MI. However, there are two cases when calibration is not done: (1) cutting-edge measurements (e.g., in fundamental science), when our MI is the best we have, and (2) measurements on the shop floor, when calibration of MI is too expensive. In both cases, the only information we have is the upper bound on the measurement error. In such cases, we have interval uncertainty about the actual values xi ; see, e.g.,11 . Interval computations: a problem. When the inputs xi of the data processing algorithms are known with interval uncertainty, we face the following problem: • Given: an algorithm y = f (x1 , . . . , xn ) and n intervals xi = [xi , xi ]. • Compute: the corresponding range of y: [y, y] = {f (x1 , . . . , xn ) | x1 ∈ [x1 , x1 ], . . . , xn ∈ [xn , xn ]}.
40
x1 ✲ x2 ✲
f
y = f (x1 , . . . , xn )
✲
···
xn ✲
It is known that this problem is NP-hard even for quadratic f ; see, e.g.,8 . In other words, unless P=NP (which most computer scientists believe to be impossible), no feasible algorithm is possible that would always compute the exact range y. We thus face two major challenges: (1) find situations feasible algorithms are possible, and (2) in situations when the exact computation of y is not feasibly possible, find feasible algorithms for computing a good approximation Y ⊇ y. 2. Alternative Approach: Maximum Entropy (MaxEnt) Idea: a brief reminder. Traditional engineering approach to uncertainty is to use probablistic techniques, based on probability density functions def (pdf) ρ(x) and cumulative distribution functions (cdf) F (x) = P (X ≤ x). As we have mentioned, in many practical applications, it is very difficult to come up with the probabilities. In such applications, many different probability distributions are consistent with the same observations. In such situations, a natural idea is to select one of these distributions – e.g., the R def one with the largest entropy S = − ρ(x) · ln(ρ(x)) dx; see, e.g.,5 . Often, this idea works. This approach often leads to reasonable results. For example, for the case of a single variable x, if all we know is that x ∈ [x, x], then MaxEnt leads to a uniform distribution on [x, x]. For several variables, if we have no information about their dependence, MaxEnt implies that different variables are independently distributed. Sometimes, this idea does not work. Sometimes, the results of MaxEnt are misleading. As an example, let us consider the simplest algorithm y = x1 + . . . + xn , with ∆xi ∈ [−∆, ∆]. In this case, ∆y = ∆x1 + . . . + ∆xn . The worst case is when ∆i = ∆ for all i, then ∆y = n · ∆. What will MaxEnt return here? If all ∆xi are uniformly distributed, then for large n, due to the Central Limit Theorem, ∆y is approximately
41
√ n normal, with σ = ∆ · √ . 3 With confidence 99.9%, we can thus conclude that |∆y| ≤ 3σ; so, we √ get ∆ ∼ n, but, as we mentioned. it is possible that ∆ = n · ∆ ∼ n which, √ for large n, is much larger than n. The conclusion from this example is that using a single distribution can be very misleading, especially if we want guaranteed results – and we do want guaranteed results in high-risk application areas such as space exploration or nuclear engineering. 3. Possibility of Linearization Linearization is usually possible. Each interval has the form [e x i − ∆i , x ei − ∆i ], where x ei is a midpoint and ∆i is half-width. Possible values xi are xi = x ei + ∆xi , with |∆xi | ≤ ∆i , so f (x1 , . . . , xn ) = f (e x1 + ∆x1 , . . . , x en + ∆xn ). The values ∆i are usually reasonable small, hence the values ∆xi are also small. Thus, we can expand f into Taylor series and keep only linear terms in this expansion: f (e x1 + ∆x1 , . . .) = ye +
n X i=1
def
def
ci · ∆xi , where ye = f (e x1 , . . .) and ci =
∂f . ∂xi
Here, max(ci · ∆xi ) = |ci | · ∆i , so the range of f is [e y − ∆, ye + ∆], where n P ∆= |ci | · ∆i . i=1
Towards an algorithm. To compute ∆ =
n P
i=1
|ci | · ∆i , we need to find ci .
If we replace one of x ei with x ei + ∆i , then, due to linearization, we get def
yi = f (e x1 , . . . , x ei−1 , x ei + ∆i , x ei+1 , . . . , x en ) = ye + ci · ∆i . n P Thus, |ci | · ∆i = |yi − ye| and hence ∆ = |yi − ye|. i=1
Resulting algorithm. Compute ye = f (e x1 , . . . , x en ), compute n values n P yi = Pf (e x1 , . . . , x ei−1 , x ei + ∆i , x ei+1 , . . . , x en ), then compute ∆ = |yi − ye| i=1 i h and Pe − ∆, Pe + ∆ . This algorithm requires n + 1 calls to f : to compute ye and n values yi .
Towards a faster algorithm. When the number of inputs n is large, n+1 calls may be too long. To speed up computations, we can use the following
42
δ · π
1
: if ηi are x2 1+ 2 δ n def P independently Cauchy-distributed with parameters ∆i , then η = c i · ηi
property of Cauchy distribution, with density $ρδ (x) =
is Cauchy-distributed with parameter ∆ =
c P
i=1
i=1
|ci | · ∆i .
Once we get simulated Cauchy-distributed values η, we can estimate ∆ by the Maximum Likelihood method. We also need to scale ηi to the interval [−∆i , ∆i ] on which the linear approximation is applicable. Resulting faster algorithm.7 First, we compute ye = f (e x1 , . . . , x en ). For some N (e.g., 200), for k = 1, . . . , N , we repeatedly: (k)
• use the random number generator to compute ri , i = 1, 2, . . . , n, uniformly distributed on [0, 1]; (k) (k) • compute Cauchy distributed values as ci = tan(π · (ri − 0.5)); (k) • compute the largest value K of the values ci ; (k)
∆i · c i (k) • compute simulated “actual values” xi = x ei + ; K (k) (k) • apply f and compute ∆y (k) = K · f x1 , . . . , xn − ye .
(k) by applying the bisection method Then, we compute ∆ ∈ 0, max ∆y
to the equation
1
(1)
k
2 + . . . +
1
(N )
N 2 = 2 . We stop when
∆y ∆y 1+ ∆ ∆ we get ∆ with accuracy ≈ 20% (accuracy 1% and 1.2% is approximately the same). The Cauchy-variate algorithm requires N ≈ 200 calls to f . So, when n ≫ 200, it is much faster than the above linearization-based algorithm. 1+
4. Beyond Linearization, Towards Interval Computations Linearization is sometimes not sufficient. In many application areas, it is sufficient to have an approximate estimate of y. However, sometimes, we need to guarantee that y does not exceed a certain threshold y0 : in nuclear engineering, the temperatures and the neutron flows should not exceed the critical values; a spaceship should land on the planet and does not fly past it, etc.
43
The only way to guarantee this is to have an interval Y = Y , Y for which y ⊆ Y and Y ≤ y0 . Such an interval is called an enclosure. Computing such an enclosure is one of the main tasks of interval computations. Interval computations: a brief history. The origins of interval computations can be traced to the work of Archimedes from Ancient Greece who used intervals to bound values like π; see, e.g.,1 . Its modern revival was boosted by three pioneers: Mieczyslaw Warmus (Poland), Teruo Sunaga (Japan), and Ramon Moore (USA) in 1956–59. The first successful application was taking interval uncertainty into account when planning spaceflights to the Moon. Since then, there were many successful applications: to design of elementary particle colliders (Martin Berz, Kyoko Makino, USA), to checking whether an asteroid will hit the Earth (M. Berz, R. Moore, USA), to robotics (L. Jaulin, France; A. Neumaier, Austria), to chemical engineering (Marc Stadtherr, USA), etc.4,9 Interval arithmetic: foundations of interval techniques. The problem is to compute the range [y, y] = {f (x1 , . . . , xn ) | x1 ∈ [x1 , x1 ], . . . , xn ∈ [xn , xn ]}. For arithmetic operations f (x1 , x2 ) (and for elementary functions), we have explicit formulas for the range. For example, when x1 ∈ x1 = [x1 , x1 ] and x2 ∈ x2 = [x2 , x2 ], then: • The range x1 + x2 for x1 + x2 is [x1 + x2 , x1 + x2 ]. • The range x1 − x2 for x1 − x2 is [x1 − x2 , x1 − x2 ]. • The range x1 · x2 for x1 · x2 is [min(x1 ·x2 , x1 ·x2 , x1 ·x2 , x1 ·x2 ), max(x1 ·x2 , x1 ·x2 , x1 ·x2 , x1 ·x2 )]. The range 1/x1 for 1/x1 is [1/x1 , 1/x1 ] (if 0 6∈ x1 ). Straightforward interval computations. In general, we can parse an algorithm (i.e., represent it as a sequence of elementary operations) and then perform the same operations, but with intervals instead of numbers. For example, to compute f (x) = (x − 2) · (x + 2), the computer first computes r1 := x − 2, then r2 := x + 2, and r3 := r1 · r2 . So, for estimating the range of f (x) for x ∈ [1, 2], we compute r1 := [1, 2] − [2, 2] = [−1, 0], r2 := [1, 2] + [2, 2] = [3, 4], and r3 := [−1, 0] · [3, 4] = [−4, 0]. Here, the actual range is f (x) = [−3, 0]. This example shows that we need more efficient ways of computing an enclosure Y ⊇ y.
44
First idea: use of monotonicity. For arithmetic, we had exact ranges, because +, −, · are monotonic in each variable, and monotonicity helps: if f (x1 , . . . , xn ) is (non-strictly) increasing (f ↑) in each xi , then f (x1 , . . . , xn ) = [f (x1 , . . . , xn ), f (x1 , . . . , xn )]. Similarly, if f ↑ for some xi and f ↓ for other xj . ∂f It is known that f ↑ in xi if ≥ 0. So, to check monotonicity, we can ∂xi ∂f check that the range [r i , ri ] of on xi has r i ≥ 0. Here, differentiation ∂xi can be performed by available Automatic Differentiation (AD) tools, an ∂f can be done by using straightforward interval estimating ranges of ∂xi computations. For example, for f (x) = (x − 2) · (x + 2), the derivatives is 2x, so its range on x = [1, 2] is [2, 4], with 2 ≥ 0. Thus, we get the exact range f ([1, 2]) = [f (1), f (2)] = [−3, 0]. Second idea: centered form. In the general non-monotonic case, we can use the general version of linearization – the Intermediate Value Theorem, according to which n X ∂f f (x1 , . . . , xn ) = f (e x1 , . . . , x en ) + (χ) · (xi − x ei ) ∂x i i=1
for some χi ∈ xi . Because of this theorem, we can conclude that f (x1 , . . . , xn ) ∈ Y, where Y = ye +
n X ∂f (x1 , . . . , xn ) · [−∆i , ∆i ]. ∂x i i=1
Here also, differentiation can be done by Automatic Differentiation (AD) tools, and estimating the ranges of derivatives can be done, if appropriate, by monotonicity, or else by straightforward interval computations, or also by centered form (this will take more time but lead to more accurate results). Third idea: bisection. It is known that the inaccuracy of the first order approximation (like the ones we used) is O(∆2i ). So, when ∆i is too large and the accuracy is low, we can split the corresponding interval in half (reducing the inaccuracy from ∆2i to ∆2i /4), and then take the union of the resulting ranges. For example, the function f (x) = x · (1 − x) is not monotonic for x ∈ x = [0, 1]. So, we take x′ = [0, 0.5] and x′′ = [0.5, 1]; on the 1st subinterval, the range of the derivative is 1 − 2 · x = 1 − 2 · [0, 0.5] = [0, 1], so f ↑
45
and f (x′ ) = [f (0), f (0.5)] = [0, 0.25]. On the 2nd subinterval, we have 1−2·x = 1−2·[0.5, 1] = [−1, 0], so f ↓ and f (x′′ ) = [f (1), f (0.5)] = [0, 0.25]. The resulting estimate is f (x′ ) ∪ f (x′′ ) = [0, 0.25], which is the exact range. These ideas underlie efficient interval computations algorithms and software packages.3,4,6,9 5. Partial Information about Probabilities Formulation of the problem. In the ideal case, we know the probability distributions. In this case, in principle, we can find the distribution for y = f (x1 , . . . , xn ) by using Monte-Carlo simulations. In the previous section, we considered situations when we only know an interval of possible values. In practice, in addition to the intervals, we sometimes also have partial information about the probabilities. How can we take this information into account? How to represent partial information about probabilities. In general, there are many ways to represent a probability distribution; it is desirable to select a representation which is the most appropriate for the corresponding practical problem. In most practical problems, the ultimate objective is to make decisions. According to decision theory, a decision maker should look for an alternative a that maximizes the expected utility Ex [u(x, a)] → max. a
When the utility function u(x) is smooth, we can expand it in Taylor series u(x) = u(x0 ) + (x − x0 ) · u′ (x0 ) + . . .; this shows that, to estimate E[u], we must know moments. In this case, partial information means that we only have interval bounds on moments. There are known algorithms for processing such bounds; see, e.g.,10 . Another case is when we have a threshold-type utility function u(x): e.g., for a chemical plant, drastic penalties start if the pollution level exceeds a certain threshold x0 . In this case, to find the expected utility, we need the know the values of the cdf F (x) = P (ξ ≤ x). Partial information means that, for every x, we only have interval bounds [F (x), F (x)] on the actual (unknown) cdf; such bounds are known as a p-box. There are also known algorithms for processing such boxes; see, e.g.,2,10 . Example of processing p-boxes. Suppose that we know p-boxes [F 1 (x1 ), F 1 (x1 )] and [F 2 (x2 ), F 2 (x2 )] for quantities x1 and x2 , we do not have any information about the relation between x1 and x2 , and we want to find the p-box corresponding F (y), F (y)] corresponding to y = x1 + x2 .
46
It is known that for every two events A and B, P (A ∨ B) = P (A) + P (B) − P (A & B) ≤ P (A) + P (B). In particular, P (¬A ∨ ¬B) ≤ P (¬A) + P (¬B). Here, P (¬A) = 1 − P (A), P (¬B) = 1−P (B), and P (¬A∨¬B) = 1−P (A & B), thus, 1−P (A & B) ≤ (1 − P (A)) + (1 − P (B)) and so, P (A & B) ≥ P (A) + P (B) − 1. We also know that P (A & B) ≥ 0, hence P (A & B) ≥ max(P (A) + P (B) − 1, 0). Let us use this inequality to get the desired bounds for F (y). def
If ξ1 ≤ x1 and ξ2 ≤ x2 , then ξ = ξ1 + ξ2 ≤ x1 + x2 . Thus, if x1 + x2 = y, then F (y) = P (ξ ≤ y) ≥ P (ξ1 ≤ x1 & ξ2 ≤ x2 ). Due to the above inequality, P (ξ1 ≤ x1 & ξ2 ≤ x2 ) ≥ P (ξ ≤ x1 ) + P (ξ2 ≤ x2 ) − 1. Here, P (ξi ≤ xi ≥ F i (xi ), so F (y) ≥ F 1 (x1 ) + F 2 (x2 ) − 1. Thus, as the desired lower bound F (y), we can take the largest of the corresponding right-hand sides: F (y) = max
max
x1 ,x2 :x1 +x2 =y
(F 1 (x1 ) + F 2 (x2 ) − 1), 0 , i.e.,
F (y) = max max(F 1 (x1 ) + F 2 (y − x1 ) − 1), 0 . x1
To find the upper bound for F (y), let us find a similar lower bound for 1 − F (y) = P (ξ > y). If x1 + x2 = y, ξ1 > x1 , and ξ2 > x2 , then ξ = ξ1 + ξ2 > y. Here, P (ξi > xi ) = 1 − P (ξi ≤ xi ) = 1 − Fi (xi ). Thus, 1−F (y) = P (ξ > y) ≥ P (ξ1 > x1 & ξ2 > x2 ) ≥ P (ξ1 > x1 )+P (ξ2 > x2 )−1 = (1 − F1 (x1 )) + (1 − F2 (x2 )) − 1 = 1 − F1 (x1 ) − F2 (x2 ), hence F (y) ≤ F1 (x1 ) + F2 (x2 ). Since Fi (xi ) ≤ F i (xi ), we have F (y) ≤ F 1 (x1 ) + F 2 (x2 ). Thus, as the desired upper bound F (y), we can take the smallest of the corresponding right-hand sides: F (y) = min min (F 1 (x1 ) + F 2 (x2 )), 1 , i.e., x1 ,x2 :x1 +x2 =y
F (y) = min min(F 1 (x1 ) + F 2 (y − x1 )), 1 . x1
Similar formulas can be derived for other elementary operations. How to represent p-boxes. Representing a p-box means representing two cdfs F (x) and F (x). For each cdf F (x), to represent all its values 1 with accuracy , it is sufficient to store n − 1 quantiles x1 < . . . < xn−1 , n
47
i . These values divide the real line into n def def segments [xi , xi+1 ], where x0 = −∞ and xn+1 = +∞. Each real value x belongs to one of these segments [xi , xi+1 ], in which i i+1 case, due to monotonicity of F (x), we have F (xi ) = ≤ F (x) ≤ = n n i 1 F (xi+1 ), hence F (x) − ≤ . n n i.e., values xi for which F (xi ) =
Need to go beyond p-boxes. In many practical situations, we need to maintain the value within a certain interval: e.g., the air conditioning must maintain the temperature within certain bounds, a spaceship must land within a certain region, etc. In such cases, the utility drastically drops if we are outside the interval; thus, the expected utility is proportional to the probability F (a, b) = P (ξ ∈ (a, b]) to be within the corresponding interval (a, b]. In such situations, partial information about probabilities means that for a and b, we only know the interval [F (a, b), F (a, b)] containing the actual (unknown) values F (a, b). When we know the exact cdf F (x), then we can compute F (a, b) as F (a) − F (b). However, in case of partial information, it is not sufficient to only know the p-box. For example, let us assume that x is uniformly distributed on some interval of known width ε > 0, but we do not know on which. In this case, as one can easily see, for every x, F (x) = 0 and F (x) = 1 – irrespective on ε.On the other hand, for any interval [a, b], we b−a have F (a, b) = min , 1 . This bound clearly depends on ε and thus, ε cannot be uniquely determined by the p-box values. How to process this more general information. Good news is that we process this more general information similarly to how we process p-boxes. Specifically, when ξ1 ∈ x1 = (x1 , x1 ] and ξ2 ∈ x2 = (x2 , x2 ], then ξ = ξ1 + ξ2 ∈ x1 + x2 = (x1 + x2 , x1 + x2 ]. Thus, if x1 + x2 ⊆ y = [y, y], we have F (y, y) ≥ P (ξ1 ∈ x1 & ξ2 ∈ x2 ) ≥ P (ξ1 ∈ x1 ) + P (ξ2 ∈ x2 ) − 1 ≥ F 1 (x1 ) + F 2 (x2 ) = 1. So, as the desired lower bound F (y, y), we can take the largest of the corresponding right-hand sides: max (F 1 (x1 ) + F 2 (x2 ) − 1), 0 . F (y, y) = max x1 ,x2 :x1 +x2 ⊆y
48
This formula is very similar to the formula for p-boxes. The formula for the upper bound comes from the fact that F (y, y) = F (y) − F (y), and thus, F (y, y) ≤ F (y) − F (y). We already know the values F (y) − F (y), thus we can take their difference as the desired upper bound F (y, y): F (y, y) = min min(F 1 (x1 ) + F 2 (y − x1 )), 1 − x1
max max(F 1 (x1 ) + F 2 (y − x1 ) − 1), 0 . x1
Similar formulas can be obtained for other elementary operations. How to represent this more general information. Not so good news is that representing such a more general information is much more difficult than representing p-boxes. Indeed, similarly to p-boxes, we would like to represent all the values 1 F (a, b) and F (a, b) with a given accuracy , i.e., we would like to find the n values x1 < . . . < xN for which xi ≤ a ≤ xi+1 and xj ≤ b ≤ xj+1 implies 1 1 |F (a, b) − F (xi , xj ) ≤ and |F (a, b) − F (xi , xj ) ≤ . n n For p-boxes, we could use N = n values xi . Let us show that for the bounds on P (a, b), there is no upper bound on the number of values needed. Namely, we will show that in the above example, when ε → 0, the corresponding number of points N grows indefinitely: N → ∞. Indeed, when j = i, a = xi , and b = xi+1 , then, due to F (xi , xi ) = 0, the above conxi+1 − xi 1 1 ≤ , i.e., dition means F (xi , xi+1 ) ≤ . Thus, we must have n ε n ε xi+1 − xi ≤ . The next point xi+1 is this close to the previous one, so, n n e.g., on the unit interval [0, 1], we need at least N ≥ such points. When ε ε → 0, the number of such points indeed tends to infinity. It is worth mentioning that we can have an upper bound on N if we know an upper bound d on the probability density ρ(x): in this case, F (a, b) ≤ (b− 1 a) · d and thus, to get the desired accuracy , it is sufficient to have xi+1 − n 1 . On an interval of width W , we thus need N = W xi+1 − xi = xi = n·d W · n · d points.
49
Acknowledgments This work was supported in part by the National Science Foundation grants HRD-0734825 and HRD-1242122 (Cyber-ShARE Center of Excellence) and DUE-0926721. The author is greatly thankful to Scott Ferson, to Franco Pavese, and to all the participants of the International Conference on Advanced Mathematical and Computational Tools in Metrology and Testing AMTCM’2014 (St. Petersburg, Russia, September 9–12, 2014) for valuable discussions. References 1. Archimedes, On the measurement of the circle, In: T. L. Heath (ed.), The Works of Archimedes (Dover, New York, 1953). 2. S. Ferson et al., Constructing Probability Boxes and Dempster-Shafer Structures (Sandia Nat’l Labs, Report SAND2002-4015, 2003). 3. Interval computations website http://www.cs.utep.edu/interval-comp 4. L. Jaulin et al., Applied Interval Analysis (Springer, London, 2001). 5. E. T. Jaynes and G. L. Bretthorst, Probability Theory: The Logic of Science (Cambridge University Press, Cambridge, UK, 2003). 6. V. Kreinovich, Interval computations and interval-related statistical techniques, In: F. Pavese and A. B. Forbes (eds.), Data Modeling for Metrology and Testing in Measurement Science (Birkhauser-Springer, Boston, 2009), pp. 117–145. 7. V. Kreinovich and S. Ferson, A new Cauchy-Based black-box technique for uncertainty in risk analysis, Reliability Engineering and Systems Safety 85(1– 3), 267–279 (2004). 8. V. Kreinovich et al., Computational Complexity and Feasibility of Data Processing and Interval Computations (Kluwer, Dordrecht, 1997). 9. R. E. Moore, R. B. Kreinovich, and M. J. Cloud, Introduction to Interval Analysis (SIAM Press, Philadelphia, Pennsylvania, 2009). 10. H. T. Nguyen et al., Computing Statistics under Interval and Fuzzy Uncertainty (Springer, Berlin, Heidelberg, 2012). 11. S. G. Rabinovich, Measurement Errors and Uncertainty:Theory and Practice (Springer, Berlin, 2005).
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. 50–53)
CLASSIFICATION, MODELING AND QUANTIFICATION OF HUMAN ERRORS IN CHEMICAL ANALYSIS ILYA KUSELMAN National Physical Laboratory of Israel, Givat Ram, Jerusalem 91904, Israel E-mail: [email protected]
Classification, modeling and quantification of human errors in chemical analysis are described. The classification includes commission errors (mistakes and violations) and omission errors (lapses and slips) by different scenarios at different stages of the analysis. A Swiss cheese model is used for characterization of the error interaction with a laboratory quality system. A new technique for quantification of human errors in chemical analysis, based on expert judgments, i.e. on the expert(s) knowledge and experience, is discussed.
Keywords: Human errors; Classification; Modeling; Quantification; Analytical Chemistry
1. Introduction Human activity is never free from errors: the majority of incidents and accidents are caused by human errors. In chemical analysis, human errors may lead to atypical test results, in particular out-of-specification test results that fall outside established specifications in the pharmaceutical industry, or do not comply with regulatory, legislation or specification limits in other industries and fields, such as environmental and food analysis. Inside the limits or at their absence (e.g., for an environmental object or a new material) errors may also lead to incorrect evaluation of the tested properties. Therefore, study of human errors is necessary in any field of analytical chemistry and required from any laboratory (lab) seeking accreditation. Such a study consists of classification, modeling and quantification of human errors [1].
50
51
2. Classification The classification includes commission errors (knowledge-, rule- and skill-based mistakes and routine, reasoned, reckless and malicious violations) and omission errors (lapses and slips) by different scenarios at different stages of the analysis [1]. There are active errors by a sampling inspector and/or an analyst/operator. Errors due to a lab poor design, a defect of the equipment and a faulty management decision, are latent errors [2]. 3. Modeling A Swiss cheese model is used for characterization of the errors interaction with a lab quality system. This model considers the quality system components j = 1, 2, .., J as protective layers against human errors. For example, the system components are: validation of the measurement/analytical method and formulation of standard operation procedures (SOP); training of analysts and proficiency testing; quality control using statistical charts and/or other means; and supervision. Each such component has weak points, whereby errors are not prevented, similar to holes in slices of the cheese. Coincidence of the holes in all components of the lab quality system on the path of a human error is a defect of the quality system, which does not allow prevention of an atypical result of the analysis [1]. 4. Quantification 4.1. A new technique By this technique [3] kinds of human error k = 1, 2, …, K and steps of the analysis m = 1, 2, …, M in which the error may happen (locations of the error), form event scenarios i = 1, 2, …, I, where I = K × M. An expert may estimate likelihood pi of scenario i by the following scale: likelihood of an unfeasible scenario – as pi = 0, weak likelihood - as pi = 1, medium – as pi = 3, and strong (maximal) likelihood – as pi = 9. The expert estimates/judgments on severity of an error by scenario i as expected loss li of reliability of the analysis, are
performed with the same scale (0, 1, 3, 9). Estimates of the possible reduction rij of the likelihood and severity of human error scenario i as a result of the error being blocked by quality system layer j (degree of interaction) are made by the same expert(s), again using this scale. The interrelationship matrix of rij has I rows and J columns, hence it contains I × J cells of estimate values. Blocking a human error according to scenario i by a quality system component j can be more effective in the presence of another component j' (j' ≠ j) because of their synergy $\Delta_{jj'}^{(i)}$, which equals 0 when the effect is absent and 1 when it is present. Estimates qj of the importance of quality system component j in decreasing losses from human error are

$$q_j = \sum_{i=1}^{I} p_i\, l_i\, r_{ij}\, s_{ij},$$

where the synergy factor is calculated as

$$s_{ij} = 1 + \sum_{j' \neq j} \Delta_{jj'}^{(i)} \big/ (J-1).$$
The technique allows transformation of the semi-intuitive expert judgments on human errors and on the laboratory quality system into the following quantitative scores expressed in %:
a) likelihood score of human error in the analysis
$$P^{*} = \frac{100\%}{9} \sum_{i=1}^{I} p_i \big/ I;$$
b) severity (loss) score of human error for reliability of the analysis results
$$L^{*} = \frac{100\%}{9} \sum_{i=1}^{I} l_i \big/ I;$$
c) importance score of a component of the lab quality system
$$q_j^{*} = 100\%\; q_j \Big/ \sum_{j=1}^{J} q_j;$$
and d) effectiveness score of the quality system, as a whole, against human error
$$Eff^{*} = \frac{100\%}{9} \sum_{j=1}^{J} q_j \Big/ \sum_{j=1}^{J}\sum_{i=1}^{I} p_i\, l_i\, s_{ij}.$$
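As a worked illustration of these formulas, the following short Python sketch (not part of the original paper; the function name and the toy judgment values are hypothetical) computes P*, L*, qj* and Eff* from expert judgments given on the 0/1/3/9 scale.

```python
import numpy as np

def quality_system_scores(p, l, r, delta):
    """Scores from expert judgments on the 0/1/3/9 scale.

    p     : (I,)      likelihoods p_i of the error scenarios
    l     : (I,)      expected losses l_i (severities)
    r     : (I, J)    reduction estimates r_ij (blocking of scenario i by layer j)
    delta : (I, J, J) synergy indicators Delta_{jj'}^{(i)} (0 or 1, zero on the diagonal)
    """
    I, J = r.shape
    # synergy factor s_ij = 1 + sum_{j' != j} Delta_{jj'}^{(i)} / (J - 1)
    s = 1.0 + delta.sum(axis=2) / (J - 1)
    # importance of quality-system component j: q_j = sum_i p_i l_i r_ij s_ij
    q = (p[:, None] * l[:, None] * r * s).sum(axis=0)

    P_star = 100.0 / 9.0 * p.mean()                                   # likelihood score, %
    L_star = 100.0 / 9.0 * l.mean()                                   # severity score, %
    q_star = 100.0 * q / q.sum()                                      # importance scores, %
    Eff = 100.0 / 9.0 * q.sum() / (p[:, None] * l[:, None] * s).sum() # effectiveness score, %
    return P_star, L_star, q_star, Eff

# toy example (hypothetical values): I = 3 scenarios, J = 2 quality-system components
p = np.array([1.0, 3.0, 9.0])
l = np.array([3.0, 3.0, 1.0])
r = np.array([[9.0, 3.0], [3.0, 1.0], [1.0, 9.0]])
delta = np.zeros((3, 2, 2))          # no synergy assumed in this toy example
print(quality_system_scores(p, l, r, delta))
```

Flipping a single pi between adjacent scale values and re-running the function gives directly the sensitivity of P* to one expert judgment, which is the robustness question discussed in Section 4.2.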
4.2. Further developments
Calculation of the score values qj* allows evaluation of the quality system components for all steps of the analysis together. The columns of the interrelationship matrix are used for that: this is the "vertical vision" of the matrix. However, an analyst may want to know which step m is least protected from errors, with the intent to improve it. To obtain this information, the "horizontal vision" of the interrelationship matrix (by rows) is necessary. Scores similar to qj*, but related to the same error location, i.e., the same step m of the analysis, are applicable for that. The variability of the expert judgments and the robustness of the quantification parameters of human errors are also important. Any expert feels a natural doubt when choosing between close values of the proposed scale: 0 or 1? 1 or 3? 3 or 9?
One change of an expert judgment on the likelihood of scenario i from pi = 0 to pi = 1, or vice versa, changes the likelihood score P* by 11.11 % when there is a single scenario and by only 0.21 % when I = 54 scenarios, for example. The same is true for the severity score L*. Evaluation of the robustness of the quality system scores to variation of the expert judgments is more complicated and can be based on Monte Carlo simulations. Examples of human error classification, modeling and quantification using this technique are considered for pH measurements of groundwater [3], multi-residue analysis of pesticides in fruits and vegetables [4], and ICP-MS of geological samples [5].

Acknowledgements
This research was supported in part by the International Union of Pure and Applied Chemistry (IUPAC Project 2012-021-1-500). The author would like to thank the project team members Dr. Francesca Pennecchi (Istituto Nazionale di Ricerca Metrologica, Italy), Dr. Aleš Fajgelj (International Atomic Energy Agency, Austria), Dr. Stephen L.R. Ellison (Laboratory of Government Chemist Ltd, UK) and Prof. Yury Karpov (State Research and Design Institute for Rare Metal Industry, Russia) for useful discussions.

References
1. I. Kuselman, F. Pennecchi, A. Fajgelj, Y. Karpov. Human errors and reliability of test results in analytical chemistry. Accred. Qual. Assur. 18, 3 (2013).
2. ISO/TS 22367. Medical laboratories – Reduction of error through risk management and continual improvement (2008).
3. I. Kuselman, E. Kardash, E. Bashkansky, F. Pennecchi, S. L. R. Ellison, K. Ginsbury, M. Epstein, A. Fajgelj, Y. Karpov. House-of-security approach to measurement in analytical chemistry: quantification of human error using expert judgments. Accred. Qual. Assur. 18, 459 (2013).
4. I. Kuselman, P. Goldshlag, F. Pennecchi. Scenarios of human errors and their quantification in multi-residue analysis of pesticides in fruits and vegetables. Accred. Qual. Assur. 19, online, DOI 10.1007/00769-014-1071-6 (2014).
5. I. Kuselman, F. Pennecchi, M. Epstein, A. Fajgelj, S. L. R. Ellison. Monte Carlo simulation of expert judgments on human errors in chemical analysis – a case study of ICP-MS. Talanta 130C, 462 (2014).
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. 54–65)
APPLICATION OF NONPARAMETRIC GOODNESS-OF-FIT TESTS: PROBLEMS AND SOLUTION* B. YU. LEMESHKO Applied Mathematics Department, Novosibirsk State Technical University, Novosibirsk, Russia E-mail: [email protected] www.ami.nstu.ru/~headrd/ In this paper, the problems of applying nonparametric goodness-of-fit tests in the case of composite hypotheses are considered. The factors influencing the test statistic distributions are discussed. A manual on the application of nonparametric tests has been prepared. The proposed recommendations should reduce errors in statistical inference when the considered tests are used in practice. Keywords: Composite hypotheses of goodness-of-fit; Anderson–Darling test, Cramer–von Mises–Smirnov test, Kolmogorov test, Kuiper test, Watson test, Zhang tests.
1. Introduction
In applications of statistical data analysis, there are many examples of incorrect usage of nonparametric goodness-of-fit tests (the Kolmogorov, Cramer–von Mises–Smirnov, Anderson–Darling, Kuiper, Watson and Zhang tests). The most common errors in testing composite hypotheses are associated with using the classical results obtained for simple hypotheses. There are simple and composite goodness-of-fit hypotheses. A simple hypothesis has the form H0: F(x) = F(x, θ), where F(x, θ) is the distribution function whose goodness-of-fit to the observed sample is tested, and θ is a known value of the parameter (scalar or vector). A composite hypothesis has the form H0: F(x) ∈ {F(x, θ), θ ∈ Θ}, where Θ is the definition domain of the parameter θ. If the estimate θ̂ of the scalar or vector parameter of the tested distribution was not found by using the sample for which the goodness-of-fit hypothesis is tested, then the application of a goodness-of-fit test for the composite hypothesis is similar to the application of the test in the case of a simple hypothesis.
* This work is supported by the Russian Ministry of Education and Science (project 2.541.2014K).
The problems arise in testing a composite hypothesis when the estimate θ̂ of the distribution parameter has been found by using the same sample on which the goodness-of-fit hypothesis is tested.

2. Goodness-of-fit tests for simple hypotheses
In the case of simple hypotheses, nonparametric tests are “free from distribution”, i.e. the limiting distributions of the statistics of the classical nonparametric goodness-of-fit tests do not depend on the tested distribution and its parameters. The Kolmogorov test (which is usually called the Kolmogorov–Smirnov test) is based on the statistic

$$D_n = \sup_{x}\left|F_n(x) - F(x,\theta)\right|.\qquad(1)$$

Unfortunately, the dependence of the statistic distributions of the nonparametric goodness-of-fit tests for composite hypotheses on the values of the shape parameter (or parameters) (see Fig. 3) appears for many parametric distributions used in the most interesting applications, particularly in problems of survival analysis and reliability. This is true for the families of gamma- and beta-distributions of the 1st, 2nd and 3rd kind, the generalized normal, generalized Weibull and inverse Gaussian distributions, and many others.
5. An interactive method to study distributions of statistics
In the cases when the statistic distributions of nonparametric tests depend on specific values of the shape parameter(s) of the tested distribution, the statistic distribution cannot be found in advance (before computing the corresponding estimates). In such situations, it is recommended to find the test statistic distribution in an interactive mode during the statistical analysis, see Ref. [18], and then to use this distribution for testing the composite hypothesis. The dependence of the test statistic distributions on the values of the shape parameter or parameters is the most serious difficulty faced when applying nonparametric goodness-of-fit criteria to test composite hypotheses in different applications. Since the parameter estimates only become known during the analysis, the statistic distribution required to test the hypothesis cannot be obtained in advance. For the criteria with statistics (8)–(10), the problem is harder to solve, as the statistic distributions depend on the sample size. Therefore, the statistic distribution of the applied test should be obtained interactively during the statistical analysis (see Refs. [19, 20]) and then used to draw conclusions about the composite hypothesis under test. The implementation of such an interactive mode requires developed software that allows the simulation process to be parallelized using the available computing resources. The use of parallel computing decreases the time needed to simulate the required test statistic distribution GN(Sn | H0) (with the required accuracy), which is used to calculate the achieved significance level P{Sn ≥ S*}, where S* is the value of the statistic calculated from the original sample. In the software system (see Ref. [4]), the interactive method for studying the statistic distributions is implemented for the following nonparametric goodness-of-fit tests: Kolmogorov, Cramer–von Mises–Smirnov,
Anderson–Darling, Kuiper, Watson and three Zhang tests. Moreover, different methods of parameter estimation can be used there. The following example demonstrates the accuracy of the achieved significance level as a function of the number N of simulated samples of the statistic used to build its empirical distribution interactively (Software system, Ref. [4]).
Example. It is necessary to check a composite hypothesis on goodness-of-fit of the inverse Gaussian distribution with the density function

$$f(x) = \frac{1}{\theta_2}\left[\frac{\theta_0}{2\pi}\left(\frac{\theta_2}{x-\theta_3}\right)^{3}\right]^{1/2}\exp\left\{-\frac{\theta_0}{2\theta_1^{2}}\,\frac{\theta_2}{x-\theta_3}\left(\frac{x-\theta_3}{\theta_2}-\theta_1\right)^{2}\right\}$$
on the basis of the following sample of size n = 100:

0.945 1.040 0.239 0.382 0.398 0.946 1.248 1.437 0.286 0.987
2.009 0.319 0.498 0.694 0.340 1.289 0.316 1.839 0.432 0.705
0.371 0.668 0.421 1.267 0.466 0.311 0.466 0.967 1.031 0.477
0.322 1.656 1.745 0.786 0.253 1.260 0.145 3.032 0.329 0.645
0.374 0.236 2.081 1.198 0.692 0.599 0.811 0.274 1.311 0.534
1.048 1.411 1.052 1.051 4.682 0.111 1.201 0.375 0.373 3.694
0.426 0.675 3.150 0.424 1.422 3.058 1.579 0.436 1.167 0.445
0.463 0.759 1.598 2.270 0.884 0.448 0.858 0.310 0.431 0.919
0.796 0.415 0.143 0.805 0.827 0.161 8.028 0.149 2.396 2.514
1.027 0.775 0.240 2.745 0.885 0.672 0.810 0.144 0.125 1.621

The shift parameter θ3 is assumed to be known and equal to 0. The shape parameters θ0, θ1 and the scale parameter θ2 are estimated using the sample. The maximum likelihood estimates (MLEs) calculated from the sample above are θ̂0 = 0.7481, θ̂1 = 0.7808, θ̂2 = 1.3202. The statistic distributions of the nonparametric goodness-of-fit tests depend on the values of the shape parameters θ0 and θ1 (see Ref. [21]), do not depend on the value of the scale parameter θ2, and therefore have to be calculated using the values θ0 = 0.7481, θ1 = 0.7808. The calculated values of the statistics Si* for the Kuiper, Watson, Zhang, Kolmogorov, Cramer–von Mises–Smirnov and Anderson–Darling tests, and the achieved significance levels P{S ≥ Si* | H0} (p-values) obtained with different accuracy of simulation (different sizes N of the simulated samples of statistics), are given in Table 1.
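The interactive procedure of Section 5 can be sketched in a few lines of Python (an illustration only, not the software system of Ref. [4]); for brevity the sketch uses the classical two-parameter inverse Gaussian distribution and the Kolmogorov statistic Dn, re-estimates the parameters by maximum likelihood on every simulated sample, and returns the achieved significance level P{S ≥ S*}.

```python
import numpy as np
from scipy.stats import norm

def ig_cdf(x, mu, lam):
    # CDF of the inverse Gaussian (Wald) distribution with mean mu and shape lam
    a = np.sqrt(lam / x)
    return norm.cdf(a * (x / mu - 1)) + np.exp(2 * lam / mu) * norm.cdf(-a * (x / mu + 1))

def ig_mle(x):
    # closed-form maximum likelihood estimates of (mu, lam)
    mu = x.mean()
    lam = len(x) / np.sum(1.0 / x - 1.0 / mu)
    return mu, lam

def kolmogorov_stat(x, cdf):
    x = np.sort(x)
    n = len(x)
    f = cdf(x)
    i = np.arange(1, n + 1)
    return max(np.max(i / n - f), np.max(f - (i - 1) / n))

def interactive_p_value(sample, n_sim=10_000, seed=1):
    rng = np.random.default_rng(seed)
    # statistic on the original sample, parameters estimated from the same sample
    mu, lam = ig_mle(sample)
    s_obs = kolmogorov_stat(sample, lambda x: ig_cdf(x, mu, lam))
    # simulate the statistic distribution under H0, re-estimating parameters each time
    s_sim = np.empty(n_sim)
    for k in range(n_sim):
        y = rng.wald(mu, lam, size=len(sample))
        mu_k, lam_k = ig_mle(y)
        s_sim[k] = kolmogorov_stat(y, lambda x: ig_cdf(x, mu_k, lam_k))
    return s_obs, np.mean(s_sim >= s_obs)   # achieved significance level P{S >= S*}
```

The same loop, with the statistic and the estimation step replaced, applies to the other tests listed above; increasing n_sim refines the p-value estimate, as Table 1 illustrates for N from 10^3 to 10^6.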
The similar results for testing goodness-of-fit of the Γ-distribution with the density

$$f(x) = \frac{\theta_1}{\theta_2\,\Gamma(\theta_0)}\left(\frac{x-\theta_3}{\theta_2}\right)^{\theta_1\theta_0-1}\exp\left\{-\left(\frac{x-\theta_3}{\theta_2}\right)^{\theta_1}\right\}$$

on the given sample are presented in Table 2. The MLEs of the parameters are θ̂0 = 2.4933, θ̂1 = 0.6065, θ̂2 = 0.1697, θ̂3 = 0.10308. In this case, the distribution of a test statistic depends on the values of the shape parameters θ0 and θ1.
Table 1. The achieved significance levels for different sizes N when testing goodness-of-fit of the inverse Gaussian distribution

Values of the test statistics    N = 10^3   N = 10^4   N = 10^5   N = 10^6
V_n^mod = 1.1113                 0.479      0.492      0.493      0.492
U_n^2   = 0.05200                0.467      0.479      0.483      0.482
Z_A     = 3.3043                 0.661      0.681      0.679      0.678
Z_C     = 4.7975                 0.751      0.776      0.777      0.776
Z_K     = 1.4164                 0.263      0.278      0.272      0.270
S_K     = 0.5919                 0.643      0.659      0.662      0.662
S_ω     = 0.05387                0.540      0.557      0.560      0.561
S_Ω     = 0.3514                 0.529      0.549      0.548      0.547

Table 2. The achieved significance levels for different sizes N when testing goodness-of-fit of the Γ-distribution

Values of the test statistics    N = 10^3   N = 10^4   N = 10^5   N = 10^6
V_n^mod = 1.14855                0.321      0.321      0.323      0.322
U_n^2   = 0.057777               0.271      0.265      0.267      0.269
Z_A     = 3.30999                0.235      0.245      0.240      0.240
Z_C     = 4.26688                0.512      0.557      0.559      0.559
Z_K     = 1.01942                0.336      0.347      0.345      0.344
S_K     = 0.60265                0.425      0.423      0.423      0.424
S_ω     = 0.05831                0.278      0.272      0.276      0.277
S_Ω     = 0.39234                0.234      0.238      0.238      0.237
Fig. 4 presents the empirical distribution and the two fitted theoretical ones (the IG-distribution and the Γ-distribution) obtained from the sample above while testing the composite hypotheses. The results presented in Table 1 and Table 2 show that the p-value estimates obtained for the IG-distribution are higher than those obtained for the Γ-distribution, i.e. the IG-distribution fits the given sample better than the Γ-distribution. Moreover, it is evident that N = 10^4 simulated samples of the statistic are sufficient to obtain p-value estimates with the accuracy required in practice, and this does not lead to a noticeable increase in the time of the statistical analysis.
Fig. 4. Empirical and theoretical distributions (IG-distribution and Γ-distribution) calculated using the given sample.
6. Conclusion
The prepared manual for the application of nonparametric goodness-of-fit tests (Ref. [17]) and the technique of interactive simulation of test statistic distributions ensure the correctness of statistical inferences when testing both composite and simple hypotheses.
References
1. T. W. Anderson, D. A. Darling. Asymptotic theory of certain “Goodness of fit” criteria based on stochastic processes, J. Amer. Statist. Assoc., 23, 1952, pp. 193–212.
2. T. W. Anderson, D. A. Darling. A test of goodness of fit, J. Amer. Statist. Assoc., 29, 1954, pp. 765–769.
3. L. N. Bolshev, N. V. Smirnov. Tables of Mathematical Statistics. (Moscow: Science, 1983).
4. ISW – Program system of the statistical analysis of one-dimensional random variables. URL: http://ami.nstu.ru/~headrd/ISW.htm (address date 02.09.2014).
5. M. Kac, J. Kiefer, J. Wolfowitz. On tests of normality and other tests of goodness of fit based on distance methods, Ann. Math. Stat., 26, 1955, pp. 189–211.
6. A. N. Kolmogoroff. Sulla determinazione empirica di una legge di distribuzione, G. Ist. Ital. attuar. 4(1), 1933, pp. 83–91.
7. N. H. Kuiper. Tests concerning random points on a circle, Proc. Konikl. Nederl. Akad. Van Wettenschappen, Series A, 63, 1960, pp. 38–47.
8. B. Yu. Lemeshko, A. A. Gorbunova. Application and Power of the Nonparametric Kuiper, Watson, and Zhang Tests of Goodness-of-Fit, Measurement Techniques, 56(5), 2013, pp. 465–475.
9. B. Yu. Lemeshko, S. B. Lemeshko. Distribution models for nonparametric tests for fit in verifying complicated hypotheses and maximum-likelihood estimators. P. I, Measurement Techniques, 52(6), 2009, pp. 555–565.
10. B. Yu. Lemeshko, S. B. Lemeshko. Models for statistical distributions in nonparametric fitting tests on composite hypotheses based on maximum-likelihood estimators. P. II, Measurement Techniques, 52(8), 2009, pp. 799–812.
11. B. Yu. Lemeshko, S. B. Lemeshko, S. N. Postovalov. Statistic Distribution Models for Some Nonparametric Goodness-of-Fit Tests in Testing Composite Hypotheses, Communications in Statistics – Theory and Methods, 39(3), 2010, pp. 460–471.
12. B. Yu. Lemeshko, S. B. Lemeshko, M. S. Nikulin, N. Saaidia. Modeling statistic distributions for nonparametric goodness-of-fit criteria for testing complex hypotheses with respect to the inverse Gaussian law, Automation and Remote Control, 71(7), 2010, pp. 1358–1373.
13. B. Yu. Lemeshko, S. B. Lemeshko. Models of Statistic Distributions of Nonparametric Goodness-of-Fit Tests in Composite Hypotheses Testing for Double Exponential Law Cases, Communications in Statistics – Theory and Methods, 40(16), 2011, pp. 2879–2892.
14. B. Yu. Lemeshko, S. B. Lemeshko. Construction of Statistic Distribution Models for Nonparametric Goodness-of-Fit Tests in Testing Composite Hypotheses: The Computer Approach, Quality Technology & Quantitative Management, 8(4), 2011, pp. 359–373.
15. B. Yu. Lemeshko, A. A. Gorbunova. Application of nonparametric Kuiper and Watson tests of goodness-of-fit for composite hypotheses, Measurement Techniques, 56(9), 2013, pp. 965–973.
16. B. Yu. Lemeshko, A. A. Gorbunova, S. B. Lemeshko, A. P. Rogozhnikov. Solving problems of using some nonparametric goodness-of-fit tests, Optoelectronics, Instrumentation and Data Processing, 50(1), 2014, pp. 21-35. 17. B. Yu. Lemeshko. Nonparametric goodness-of-fit tests. Guide on the application. – M.: INFRA–M, 2014. – 163 p. (in russian) 18. B. Yu. Lemeshko, S. B. Lemeshko, A. P. Rogozhnikov. Interactive investigation of statistical regularities in testing composite hypotheses of goodness of fit, Statistical Models and Methods for Reliability and Survival Analysis : monograph. – Wiley-ISTE, Chapter 5, 2013, pp. 61-76. 19. B. Yu. Lemeshko, S. B. Lemeshko, A. P. Rogozhnikov. Real-Time Studying of Statistic Distributions of Non-Parametric Goodness-of-Fit Tests when Testing Complex Hypotheses, Proceedings of the International Workshop “Applied Methods of Statistical Analysis. Simulations and Statistical Inference” – AMSA’2011, Novosibirsk, Russia, 20-22 September, 2011, pp. 19-27. 20. B. Yu. Lemeshko, A. A. Gorbunova, S. B. Lemeshko, A. P. Rogozhnikov. Application of Nonparametric Goodness-of-fit tests for Composite Hypotheses in Case of Unknown Distributions of Statistics, Proceedings of the International Workshop “Applied Methods of Statistical Analysis. Applications in Survival Analysis, Reliability and Quality Control” – AMSA’2013, Novosibirsk, Russia, 25-27 September, 2013, pp. 8-24. 21. B. Yu. Lemeshko, S. B. Lemeshko, M. S. Nikulin, N. Saaidia. Modeling statistic distributions for nonparametric goodness-of-fit criteria for testing complex hypotheses with respect to the inverse Gaussian law, Automation and Remote Control, 71(7), 2010, pp. 1358-1373. 22. H. W. Lilliefors. On the Kolmogorov-Smirnov test for normality with mean and variance unknown, J. Am. Statist. Assoc., 62, 1967, pp. 399–402. 23. H. W. Lilliefors. On the Kolmogorov-Smirnov test for the exponential distribution with mean unknown, J. Am. Statist. Assoc., 64, 1969, pp. 387– 389. 24. R 50.1.037-2002. Recommendations for Standardization. Applied Statistics. Rules of Check of Experimental and Theoretical Distribution of the Consent. Part II. Nonparametric Goodness-of-Fit Test. Moscow: Publishing House of the Standards, 2002. (in Russian) 25. M. A. Stephens. Use of Kolmogorov–Smirnov, Cramer – von Mises and related statistics – without extensive table, J. R. Stat. Soc., 32, 1970, pp. 115–122. 26. G. S. Watson. Goodness-of-fit tests on a circle. I, Biometrika, 48(1-2), 1961. pp. 109-114. 27. G. S. Watson. Goodness-of-fit tests on a circle. II, Biometrika, 49(1-2), 1962, pp. 57- 63.
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. 66–77)
DYNAMIC MEASUREMENTS BASED ON AUTOMATIC CONTROL THEORY APPROACH A. L. SHESTAKOV South Ural State University (National Research University) Chelyabinsk, Russian Federation E-mail: [email protected] www.susu.ru The paper deals with improving the accuracy of dynamic measurements on the basis of the automatic control theory approach. A review of the dynamic measuring system models developed by the author and his disciples is given. These models are based on the modal control method, the iterative principle of dynamic system synthesis, the observed state vector, the sliding mode control method, parametric adaptation and a neural network approach. Issues of dynamic measurement error evaluation are considered. Keywords: Dynamic Measurement, Dynamic Measuring System, Dynamic Measurement Error Evaluation, Modal Control of Dynamic Behavior, Iterative Signal Recovery Approach, Observed State Vector, Sliding Mode Control, Adaptive Measuring System, Neural Network Approach.
1. Modal control of dynamic behavior method
The dynamic measurement error (DME) is determined by two main factors: the dynamic characteristics of a measuring system (MS) and the parameters of the measured signals. Requirements to improve the accuracy of dynamic measurements initiated the study of two approaches to DME correction: on the basis of the solution of convolution integral equations and its regularization [1–4], and with the use of the inverse Fourier [5] or Laplace [6] transformation. In the present paper a third group of approaches to DME correction, based on automatic control theory methods, is proposed.

1.1. Measuring system with modal control of dynamic behavior
The analysis of MSs can be made in terms of automatic control theory (as well as of the theory of automatic control system sensitivity [7, 8]), but the main structural difference between automatic control systems and MSs is that the latter have a primary measuring transducer (sensor), whose input is accessible neither for direct measurement nor for correction. Therefore, it is
impossible to close a feedback loop around the MS as a whole from the output to the input. This means that the approaches of modal control, or other methods of automatic control theory, cannot be used in MSs directly. However, it is possible to offer special structures of correcting devices for MSs in which the idea of modal control can be implemented. MSs with a sensor model [9–11] are among such structures. Let the transfer function (TF) of a sensor be represented in the general form
$$W_S(p) = \frac{\prod_{i_1=1}^{l}\left(T_{i_1}^{2}p^{2}+2\xi_{i_1}T_{i_1}p+1\right)\prod_{j_1=1}^{q}\left(T_{j_1}p+1\right)}{\prod_{i=1}^{r}\left(T_i^{2}p^{2}+2\xi_i T_i p+1\right)\prod_{j=1}^{s}\left(T_j p+1\right)},\qquad(1)$$
where $T_{i_1}$, $T_i$, $T_{j_1}$, $T_j$ are time constants and $\xi_{i_1}$, $\xi_i$ are damping coefficients. Its differential equation can be represented as follows:
$$p^{n}y + a_{n-1}p^{n-1}y + \ldots + a_0 y = b_m p^{m}u + b_{m-1}p^{m-1}u + \ldots + b_0 u,\qquad(2)$$
where y is the sensor output, u is the sensor input to be measured, $a_0, a_1, \ldots, a_{n-1}, b_0, b_1, \ldots, b_m$ are constant coefficients ($m \le n$) and $p \equiv d/dt$ is the differentiation operator. Similarly, the sensor model, which is implemented as a real unit, is described by the equation
$$p^{n}y_M + a_{n-1}p^{n-1}y_M + \ldots + a_0 y_M = b_m p^{m}u_M + b_{m-1}p^{m-1}u_M + \ldots + b_0 u_M,\qquad(3)$$
where $y_M$ and $u_M$ are the sensor model output and input signals respectively. The differential equations of the sensor and its model are identical. Therefore, if their outputs are close to each other, their inputs will differ little from one another. Hence, the sensor model input, which is accessible for observation, can be used to evaluate the sensor input, which is inaccessible for observation. This is the basic idea of applying the sensor model to DME correction. To implement the idea, the system of the sensor and its model shown in Fig. 1 is formed. To achieve proximity of the signals, feedbacks with coefficients $K_j$ (for $j = 0,\ldots,n-1$) and an m-order filter with numerator coefficients $d_i$ (for $i = 0,\ldots,m$) and denominator coefficients $b_i$ (for $i = 0,\ldots,m$) are introduced. This structure of the MS is recognized as an invention [11]. The proposed MS is described by the following TF:
$$W_{MS}(p) = \frac{\left[(b_m+d_m)p^{m}+(b_{m-1}+d_{m-1})p^{m-1}+\ldots+(b_0+d_0)\right](a_0+K_0)}{\left[p^{n}+(a_{n-1}+K_{n-1})p^{n-1}+\ldots+(a_0+K_0)\right](b_0+d_0)}.\qquad(4)$$
The last equation shows that by changing the adjustable coefficients $d_i$ (for $i = 0,\ldots,m$) and $K_j$ (for $j = 0,\ldots,n-1$) it is possible to obtain any desired TF of the MS. The proposed method of synthesising the MS with modal control of dynamic behavior in accordance with the required DME is as follows. The type and parameters of the model measured signals that are closest to the actual measured signal are evaluated a priori. In accordance with the maximum permissible value of the DME, the zeros and poles of the MS are selected. Then the adjustable coefficients of the MS are calculated; these parameters define the desired location of the zeros and poles.
Fig. 1. Block diagram of the dynamic measuring system.
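For illustration, the following Python sketch (not from Refs. [9–11]; the names and the numerical values are assumptions) shows the pole/zero assignment implied by the TF (4): the feedback gains follow from the desired denominator as K_j = α_j − a_j, and the filter coefficients from the desired numerator as d_i = β_i − b_i.

```python
import numpy as np

def modal_gains(a, b, alpha, beta):
    """Adjustable coefficients of the MS with a sensor model, cf. eq. (4).

    a     : [a_0, ..., a_{n-1}]        sensor denominator coefficients (monic, order n)
    b     : [b_0, ..., b_m]            sensor numerator coefficients
    alpha : [alpha_0, ..., alpha_{n-1}] desired denominator (pole placement)
    beta  : [beta_0, ..., beta_m]       desired numerator (zero placement)
    """
    K = np.asarray(alpha, float) - np.asarray(a, float)   # K_j = alpha_j - a_j
    d = np.asarray(beta, float) - np.asarray(b, float)    # d_i = beta_i - b_i
    return K, d

# hypothetical example: sensor W_S(p) = (b1 p + b0) / (p^2 + a1 p + a0)
a, b = [4.0, 1.2], [4.0, 0.5]
# desired MS denominator p^2 + 2*zeta*w*p + w^2 and numerator beta1*p + beta0
w, zeta = 20.0, 0.8
alpha = [w**2, 2.0 * zeta * w]
beta = [w**2, 0.0]                 # cancel the sensor zero, speed up the response
K, d = modal_gains(a, b, alpha, beta)
print("K =", K, "d =", d)
# resulting TF of eq. (4), coefficients in ascending powers of p
num = (np.array(b) + d) * (a[0] + K[0]) / (b[0] + d[0])
den = np.append(np.array(a) + K, 1.0)
print("W_MS numerator:", num, " denominator:", den)
```

With the values above the resulting TF is W_MS(p) = 400/(p^2 + 32p + 400); its static gain is unity, as the structure of (4) guarantees by construction.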
1.2. Dynamic error evaluator based on sensor model
A properly designed MS performs its function of recovering the sensor input with a smaller DME than the sensor output. The presence of the sensor model input and output makes it possible to evaluate the DME of the sensor. Therefore, having some additional signals of the measuring transducer available, it is possible to evaluate the DME of the entire MS. The proposed method is the basis of the DME evaluator, which is recognized as an invention [12]. The input of the MS (see Fig. 1) is

$$u_{MS}(p) = W_{MS}(p)\,u(p),\qquad(5)$$

the output of the sensor is

$$y(p) = W_S(p)\,u(p),\qquad(6)$$

and the output of the sensor model is

$$y_M(p) = W_S(p)\,u_M(p).\qquad(7)$$
The DME signal is formed as follows:

$$\varepsilon_0(p) = \left[y(p)-y_M(p)\right]\frac{a_0+K_0}{b_0+d_0}.\qquad(8)$$
Taking into account equations (6) and (7), the following equation is obtained from the last one:

$$\varepsilon_0(p) = W_S(p)\left(u(p)-u_M(p)\frac{a_0+K_0}{b_0+d_0}\right) = W_S(p)\left(u(p)-u_{MS}(p)\right) = W_S(p)\,\varepsilon_{MS}(p),\qquad(9)$$
where $\varepsilon_{MS}(p) = u(p) - u_{MS}(p)$ is the MS error. Thus, forming the difference of the signals according to (9) gives an evaluation of the MS error, which differs from the true error in the same way as the sensor output differs from its input. This makes it possible to correct the evaluation in the same way as the sensor signal [10, 13]. The block diagram of the DME evaluator is shown in Fig. 2.
Fig. 2. Block diagram of the dynamic measurement error evaluator.
The DME evaluator is described by the following TF:

$$W_{\varepsilon}(p) = \frac{u_{\varepsilon}^{*}(p)}{\varepsilon_{MS}(p)} = \frac{\left[(b_m+d_{m\varepsilon})p^{m}+\ldots+(b_0+d_{0\varepsilon})\right](a_0+K_{0\varepsilon})}{\left[p^{n}+(a_{n-1}+K_{n-1,\varepsilon})p^{n-1}+\ldots+(a_0+K_{0\varepsilon})\right](b_0+d_{0\varepsilon})}.\qquad(10)$$
The TF of the DME evaluator (10) has the same form as the TF of the MS (4). The adjustable coefficients of the evaluator $d_{i\varepsilon}$ (for $i = 0,\ldots,m$) and $K_{j\varepsilon}$ (for $j = 0,\ldots,n-1$) affect the corresponding coefficients of its TF numerator and denominator in the same way. It should also be noted that the DME evaluator does not require the complete model of the sensor. The block diagrams above reflect all significant links in the implementation of the dynamic MS. They can be considered as structural representations of differential equations, which must be numerically integrated when the MS is implemented as a program processing the sensor output.

2. Iterative dynamic measuring system
The sensor model, which is described by the same differential equation as the sensor, allows the DME to be reduced within the MS structure (see Fig. 1). If the sensor model is considered not as a device distorting the signal but as a device reproducing some input signal, this reproduction can be improved by introducing additional channels with sensor models. The well-known iterative principle of automatic control system synthesis allows systems of high dynamic accuracy to be developed. However, due to implementation difficulties, such systems are not widely used in control. In MSs the idea of iterative channels can be implemented easily, namely in the form of additional data processing channels. The structure of an MS of dynamic parameters differs from that of automatic control systems; the main difference is the impossibility of introducing feedbacks and corrective signals directly at the MS input. However, the iterative signal recovery approach in this case allows the DME to be reduced significantly. The block diagram of the proposed iterative MS is shown in Fig. 3. The idea of the DME correction in such a system is as follows. The sensor output y reproduces the sensor input u with a certain dynamic error. At the output of the sensor there is the sensor model, which reproduces the signal y with a dynamic distortion relative to the measured signal u. If the difference of the signals $y - y_M$ is fed to the second model input, a reproduction of this difference is obtained at the model output. The sum of the signals $y_{M1} + y_{M2} = y_2$ reproduces the signal y more accurately. Hence, the sum of their inputs $u_{M1} + u_{M2} = u_2$ reflects the signal u more accurately, because the TFs of the model and the sensor are identical. Then the difference of the signals $y - y_2$ is fed to the third model input, so that the error of reproducing the signal y in the first two models is processed by the third model. The sum of the first three model outputs $y_3$ reproduces the sensor output y more accurately. Therefore, the sum of the first three model inputs $u_{M1} + u_{M2} + u_{M3} = u_3$ reproduces the sensor input u more accurately.
Fig. 3. Block diagram of the iterative dynamic measuring system.
The iterative MS for an arbitrary number N of models is described by the following TF [10, 14]:

$$W_N(p) = 1 - \left(1 - W_S(p)\right)^{N}.\qquad(11)$$
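A quick numerical check of (11) can be made as follows (illustrative only; the sensor time constant and damping below are assumed values). Since $1 - W_N(p) = (1 - W_S(p))^N$, every additional channel multiplies the residual (dynamic-error) transfer by $1 - W_S(p)$.

```python
import numpy as np

def ws(jw, T=0.05, xi=0.7):
    # example second-order sensor W_S(p) = 1 / (T^2 p^2 + 2 xi T p + 1), p = j*omega
    return 1.0 / (T**2 * jw**2 + 2 * xi * T * jw + 1.0)

def error_transfer(omega, N):
    # per eq. (11): |1 - W_N(j*omega)| = |1 - W_S(j*omega)|**N
    jw = 1j * omega
    return np.abs((1.0 - ws(jw)) ** N)

omega = np.array([0.1, 1.0, 5.0, 10.0])      # rad/s, below the sensor natural frequency 1/T
for N in (1, 2, 3):
    print(N, np.round(error_transfer(omega, N), 5))
```

For frequencies well below the sensor natural frequency 1/T the printed residual error transfer shrinks roughly geometrically with N, which is the effect exploited by the additional processing channels.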
It should be noted that iterative MSs have high noise immunity.

3. Dynamic measuring system with observed state vector
A dynamic model of a linear system with constant parameters is considered. Let $u(t)$ be the input r-vector, $y(t)$ the output l-vector ($r, l \le n$) and $x(t)$ the state n-vector (here n is the dimension of the state space). Then its state space model is described by the following system of linear differential equations in vector-matrix form:

$$\dot{x}(t) = A\,x(t) + B\,u(t),\qquad y(t) = C\,x(t) + D\,u(t),\qquad(12)$$
where A, B, C, D are constant matrices of sizes $n \times n$, $n \times r$, $l \times n$, $l \times r$ respectively: A is the system matrix, B is the control matrix, C is the output matrix and D is the feed-forward matrix. The block diagram of the primary measuring transducer with observed state vector [10, 15] is shown in Fig. 4. The coefficients $c_i$ indicate the possibility of measuring the state vector coordinates (if $c_i = 1$, the measurement of coordinate i is possible; if $c_i = 0$, the measurement is impossible). The outputs of the sensor with the observed state vector are
$$\begin{aligned} y_1(t) &= b_0 x_1(t) + b_1 x_2(t) + \ldots + b_m x_{m+1}(t),\\ y_2(t) &= c_1 b_1 x_2(t),\\ &\;\;\vdots\\ y_{m+1}(t) &= c_m b_m x_{m+1}(t). \end{aligned}\qquad(13)$$
If $m = n$ and $c_i = 1$ for i from 1 to m, then the outputs $y_1(t), y_2(t), \ldots, y_{m+1}(t)$ in (13) form the complete state vector of the sensor. Otherwise, some coordinates of the state vector should be measured.
Fig. 4. Block diagram of the primary measuring transducer with observed state vector.
The possibility of measuring the state vector coordinates allows various block diagrams of the MS to be designed, with a flexible choice of its form according to the actual measurement situation. On the basis of the primary measuring transducer with observed state vector, various block diagrams of the MS were examined [10, 15]. An algorithm for optimal adjustment of the MS parameters was proposed [10, 15].
4. Dynamic measuring system with sliding mode control
To ensure the proximity of the sensor model output to the sensor output, feedbacks are introduced in the MS with modal control of dynamic behavior. It is also possible to achieve the proximity of these signals in the MS by implementing sliding mode control. The block diagram of the MS with sliding mode control [10, 16] is shown in Fig. 5. In this diagram a nonlinear unit (relay) is introduced to launch the sliding mode. The gain factor K, which affects both the amplitude of the relay output signal and the switching frequency of the relay, is introduced after the nonlinear unit. Oscillations in the closed-loop nonlinear MS with sliding mode control were examined. An MS with the sensor model in the form of serial dynamic units was also proposed to ensure sliding mode stability.
Fig. 5. Block diagram of the dynamic measuring system with sliding mode control.
5. Adaptive measuring system
On the basis of the MS with modal control of dynamic behavior it is possible to design an MS adaptive to the minimum of the DME [10, 17]. The MS with modal control of dynamic behavior and with adaptation of its TF coefficients by
direct search was investigated. A DME evaluation method in the presence of a priori information about the characteristics of the measured signal and the sensor noise was proposed. The dynamic model of the MS with adaptation of its adjustable coefficients to the minimum of the DME in real time (see Fig. 6) was examined. This adaptation was implemented on the basis of the DME evaluator output (see Fig. 7). The coefficients $k_i$ in the diagrams below are obtained as solutions of certain differential equations [17].
Fig. 6. Block diagram of the adaptive measuring system.
Fig. 7. Block diagram of the dynamic measurement error evaluator.
6. Neural network dynamic measuring system
The application of neural networks is one of the approaches to the development of intelligent MSs [10, 18]. A neural network dynamic model of the sensor and a training algorithm for determining its dynamic parameters were considered. A neural network dynamic model of the MS with the inverse sensor model (see Fig. 8) and the algorithm for its training by the criterion of minimum mean-squared DME were examined. On this basis, MSs in the form of serial sections of the first and second order, as well as in the form of a correcting filter with a special structure and identical serial first-order sections to ensure MS stability, were proposed. Neural network dynamic models of the MS in the presence of noise at the sensor output were investigated.
Fig. 8. Block diagram of the neural network inverse sensor model.
Acknowledgments
The author is grateful to his disciples for participation in the research and development of the proposed approaches: D. Yu. Iosifov (section 3), O. L. Ibryaeva (section 3), M. N. Bizyaev (section 4), E. V. Yurasova (section 5) and A. S. Volosnikov (section 6).

References
1.
G. N. Solopchenko, “Nekorrektnye zadachi izmeritel'noy tekhniki [Illconditioned Problems of Measuring Engineering]”, Izmeritel'naya tekhnika [Measuring Engineering], no. 1 (1974): 51-54. 2. A. I. Tikhonov, V. Ya. Arsenin, Metody resheniya nekorrektnykh zadach [Methods of Solution to Ill-conditioned Problems] (Moscow: Nauka, 1979). 3. V. A. Granovskiy, Dinamicheskie izmereniya: Osnovy metrologicheskogo obespecheniya [Dynamic Measurements: Fundamentals of Metrological Support] (Leningrad: Energoatomizdat, 1984). 4. E. Layer, W. Gawedzki, “Theoretical Principles for Dynamic Errors Measurement”, Measurement 8, no. 1 (1990): 45-48. 5. G. N. Vasilenko, Teoriya vosstanovleniya signalov: O reduktsii k ideal'nomu priboru v fizike i tekhnike [The theory of Signals Recovery: about Reduction to Ideal Instrument in Physics and Engineering] (Moscow: Sovetskoe Radio, 1979). 6. S. Dyer, “Inverse Laplace Transformation of Rational Functions. Part I”, IEEE. Instrumentation and Measurement Magazine 5, no. 4 (2006): 13-15. 7. E. N. Rosenwasser, R. M. Yusupov, Sensitivity of Automatic Control Systems (CRC Press, 2000). 8. M. Eslami, Theory of Sensitivity in Dynamic Systems: An Introduction (Springer-Verlag, 1994). 9. A. L. Shestakov, “Dynamic Error Correction Method”, IEEE Transactions on Instrumentation and Measurement 45, no. 1 (1996): 250-255. 10. A. L. Shestakov, Metody teorii avtomaticheskogo upravleniya v dinamicheskikh izmereniyakh: monografiya [Theory Approach of Automatic Control in Dynamic Measurements: Monograph] (Chelyabinsk: Izd-vo Yuzhno-Ural'skogo gosudarstvennogo universiteta, 2013). 11. A. L. Shestakov, “A. s. 1571514 SSSR. Izmeritel'nyy preobrazovatel' dinamicheskikh parametrov [Ⱥ. ɋ. 1571514 USSR. Measuring Transducer of Dynamic Parameters]”, Otkrytiya, izobreteniya [Discoveries and inventions], no. 22 (1990): 192.
12. V. A. Gamiy, V. A. Koshcheev and A. L. Shestakov, “A. s. 1673990 SSSR. Izmeritel'nyy preobrazovatel' dinamicheskikh parametrov [Ⱥ. ɋ. 1673990 USSR. Measuring Transducer of Dynamic Parameters]”, Otkrytiya, izobreteniya [Discoveries and inventions], no. 12 (1991): 191. 13. A. L. Shestakov, “Modal'nyy sintez izmeritel'nogo preobrazovatelya [Modal Synthesis of Measuring Transducer]”, Izv. RAN. Teoriya i sistemy upravleniya [Proceedings of the RAS. Theory and Control Systems], no. 4 (1995): 67-75. 14. A. L. Shestakov, “Izmeritel'nyy preobrazovatel' dinamicheskikh parametrov s iteratsionnym printsipom vosstanovleniya signala [Measuring Transducer of Dynamic Parameters with Iterative Approach to Signal Recovery]”, Pribory i sistemy upravleniya [Instruments and Control Systems], no. 10 (1992): 23-24. 15. A. L. Shestakov, O. L. Ibryaeva and D. Yu. Iosifov, “Reshenie obratnoy zadachi dinamiki izmereniy s ispol'zovaniem vektora sostoyaniya pervichnogo izmeritel'nogo preobrazovatelya [Solution to the Inverse Dynamic Measurement Problem by Using of Measuring Transducer State Vector]”, Avtometriya [Autometering] 48, no. 5 (2012): 74-81. 16. A. L. Shestakov and M. N. Bizyaev, “Vosstanovlenie dinamicheski iskazhennykh signalov ispytatel'no-izmeritel'nykh sistem metodom skol'zyashchikh rezhimov [Dynamically Distorted Signals Recovery of Testing Measuring Systems by Sliding Mode Control Approach]”, Izv. RAN. Energetika [Proceedings of RAS. Energetics], no. 6 (2004): 119-130. 17. A. L. Shestakov and E. A. Soldatkina, “Algoritm adaptatsii parametra izmeritel'noy sistemy po kriteriyu minimuma dinamicheskoy pogreshnosti [Adaptation Algorithm of Measuring System Parameters by Criterion of Dynamic Error Minimum]”, Vestnik Yuzhno-Ural'skogo gosudarstvennogo universiteta. Seriya “Komp'yuternye tekhnologii, upravlenie, radioelektronika” [Bulletin of the South Ural State University. Series “Computer Technologies, Automatic Control & Radioelectronics”], iss. 1, no. 9 (2001): 33-40. 18. A. S. Volosnikov and A. L. Shestakov, “Neyrosetevaya dinamicheskaya model' izmeritel'noy sistemy s fil'tratsiey vosstanavlivaemogo signala [Neural Network Dynamic Model of Measuring System with Recovered Signal Filtration]”, Vestnik Yuzhno-Ural'skogo gosudarstvennogo universiteta. Seriya “Komp'yuternye tekhnologii, upravlenie, radioelektronika” [Bulletin of the South Ural State University. Series “Computer Technologies, Automatic Control & Radioelectronics”], iss. 4, no. 14 (2006): 16-20.
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. –)
MODELS FOR THE TREATMENT OF APPARENTLY INCONSISTENT DATA R. WILLINK Wellington, New Zealand E-mail: [email protected] Frequently the results of measurements of a single quantity are found to be mutually inconsistent under the usual model of the data-generating process. Unless this model is adjusted, it becomes impossible to obtain a defensible estimate of the quantity without discarding some of the data. However, taking that step seems arbitrary and can appear unfair when each datum is supplied by a different laboratory. Therefore, we consider various models that do not involve discarding any data. Consider a set of measurement results from n independent measurements with stated standard uncertainties. The usual model takes the standard uncertainties to be the standard deviations of the distributions from which the measurement results are drawn. One simple alternative involves supposing there is an unknown extra variance common to each laboratory. A more complicated model has the extra variance differing for each laboratory. A further complication is to allow the extra variance to be present with an unknown probability different for each laboratory. Maximum-likelihood estimates of the measured quantity can be obtained with all these models, even though the last two models have more unknown parameters than there are data. Simulation results support the use of the model with the single unknown variance. Keywords: Combination of data; Random effects; Inconsistent data.
1. Introduction
Frequently data put forward as the results of measurements of a single quantity appear mutually inconsistent. Unless the model of the data-generating process is adjusted, it becomes impossible to obtain a defensible estimate of that quantity without discarding some of the data. However, discarding data can appear unfair and arbitrary, especially when each datum is supplied by a different laboratory. Therefore, in this paper we consider alternative models for the generation of the data. Let θ denote the fixed unknown quantity measured. The information at hand is a set of measurement results x1, . . . , xn and standard uncertainties u1, . . . , un from n independent measurements of θ. The usual model for the generation of the data takes xi to be drawn from the normal distribution with mean θ and variance $u_i^2$. This model can be written as

$$x_i \leftarrow \mathrm{N}(\theta, u_i^2),\quad i = 1, \ldots, n.\qquad(1)$$
Sometimes it will be apparent that this model cannot properly describe the spread in the data. If we are to continue to use the data without downweighting or removing any of them then another model must be proposed. Any model should be a realistic representation of the system studied, and the relevant system here is the data-generating process. So if we are to propose an alternative model then it should be realistic as a description of how the xi data arise. Also, it should be amenable to analysis, lead to a meaningful estimate of θ and, arguably, should contain (1) as a special case. One useful possibility is the model xi ← N(θ, u2i + σ 2 ),
i = 1, . . . , n,
(2)
where σ 2 is an unknown nuisance parameter.1 This model involves the ideas that (i) the measurement procedure in each laboratory incurred an additional error not accounted for in the uncertainty calculations and (ii) the sizes of the additional errors in the n measurements can be regarded as a random sample from the normal distribution with mean 0 and unknown variance σ 2 . (It is a special case of a standard ‘random effects’ model as discussed by Vangel and Rukhin,2 where the u2i variances are not known but are estimated from data and where concepts of ‘degrees of freedom’ apply.) The merit of model (2) is its simplicity and its additive nature, which is realistic. It might be criticised for the implication that every laboratory has failed to properly assess some source of error, (even if the estimates of the extra errors turn out to be small). One subject of this paper is a generalization of (2) constructed to address this criticism. Section 2 gives more details of models (1) and (2) and the ways in which they are fitted to the data by the principle of maximum-likelihood. Section 3 describes an extension to (2) and its solution by maximum-likelihood, and Section 4 describes a more complicated model that turns out to have the same solution. Sections 5 and 6 present examples of the results obtained with the models, and Section 7 uses simulation to examine the abilities of the models to give accurate estimates of quantities measured. A different alternative to (1), which has been proposed in Bayesian analyses, is xi ← N(θ, κu2i ) for i = 1, . . . , n.3,4 If this model is interpreted as describing the data-generating process then its implication is that every laboratory has erred by a common factor κ in assessing the overall error
variance. Given that the assessments of variance are made for individual components of error, added together, and assessed independently at different laboratories, such a model seems highly unrealistic. Also, it can be inferred that, unless some values for κ are favoured over others a priori, multiplying every submitted standard uncertainty ui by a constant would produce no change in the estimate of θ or in the standard uncertainty of this estimate.5 This does not seem reasonable.

2. The standard models
The total error in a measurement result xi can be seen as the sum of a component whose scale is accurately ‘counted’ in the uncertainty budget, ec,i, and a component whose scale is not properly assessed, enc,i. The laboratory will see the standard uncertainty u(xi) as the parent standard deviation of ec,i, i.e. the standard deviation of the distribution from which ec,i arose. Also, the laboratory claims that enc,i does not exist, which is functionally equivalent to claiming that enc,i = 0. If we accept this claim then we are led to adopt (1), which we shall call Model I.

2.1. Model I
Suppose that, for each i, we accept the claim that enc,i is zero. For laboratory i we obtain the model $x_i \leftarrow (\theta, u_i^2)$, which indicates that xi was drawn from some distribution with mean θ and variance $u_i^2$. The weighted-least-squares estimate of θ and the minimum-variance unbiased linear estimate of θ under this model are both given by

$$\hat\theta = \frac{\sum_{i=1}^{n} x_i/u_i^2}{\sum_{i=1}^{n} 1/u_i^2}.\qquad(3)$$
The distribution from which θ̂ is drawn has mean θ and variance $\left(\sum_{i=1}^{n} 1/u_i^2\right)^{-1}$, so the corresponding standard uncertainty is

$$u(\hat\theta) = \sqrt{\frac{1}{\sum_{i=1}^{n} 1/u_i^2}}.\qquad(4)$$
If the parent distributions of the xi values are treated as being normal then the complete model is (1), in which case θ̂ in (3) is also the maximum-likelihood estimate, as can be found by maximising the likelihood function

$$L(\theta) = \prod_{i=1}^{n}\frac{1}{\sqrt{2\pi}\,u_i}\exp\left\{\frac{-(x_i-\theta)^2}{2u_i^2}\right\}.$$
Henceforth, we assume that the distributions are sufficiently close to normal for this step to be taken. We will also use the principle of maximum-likelihood exclusively in fitting a model for the estimation of θ.

2.2. Model II
If, by some principle, Model I is deemed to be inconsistent with the data then we must conclude that either (i) the xi values were not drawn independently, (ii) the distributions are not well modelled as being normal or (iii) one or more of the enc,i errors are non-zero. One unprejudiced modification of the model based on the third of these possibilities involves the idea that each enc,i was drawn from a normal distribution with mean zero and unknown variance σ². The model then becomes (2). This assumption of a single distribution for the extra errors does not mean that each laboratory incurs an extra error of the same size. Rather it means that there will be extra errors of different sizes for different laboratories, as would be expected in practice, and that the underlying effects can be modelled as being normally distributed across the hypothetical population of laboratories. The spread of values of these extra errors can reflect the spread of resources and expertise in the laboratories. Indeed, the implied values of enc,i for a large subset of laboratories whose results are consistent under model (1) will be negligible under model (2). In an inter-laboratory comparison involving the circulation of an artefact for measurement among many laboratories, the extra variance σ² could be seen as an effect of artefact instability. Let us refer to (2) as Model II. The corresponding likelihood function is

$$L(\theta,\sigma^2) = \prod_{i=1}^{n}\frac{1}{\sqrt{2\pi(u_i^2+\sigma^2)}}\exp\left\{\frac{-(x_i-\theta)^2}{2(u_i^2+\sigma^2)}\right\}.$$
Fitting the model by the principle of maximum-likelihood means finding the values of the unknown parameters θ and σ² that maximise this function. This means maximising the logarithm of the likelihood, which is equivalent to minimising the quantity

$$\sum_{i=1}^{n}\left[\log(u_i^2+\sigma^2) + \frac{(x_i-\theta)^2}{u_i^2+\sigma^2}\right].\qquad(5)$$
The fitted parameter values θ̂ and σ̂² are those for which the partial derivatives of (5) are zero. Differentiating with respect to θ and setting the result to zero gives

$$\sum_{i=1}^{n}\frac{x_i-\theta}{u_i^2+\sigma^2} = 0,$$

which implies that at the point of maximum-likelihood

$$\theta = \frac{\sum_{i=1}^{n} x_i/(u_i^2+\sigma^2)}{\sum_{i=1}^{n} 1/(u_i^2+\sigma^2)}.$$
This expression for θ is substituted into (5), and we find that σ̂² is the value minimising

$$Q(\sigma^2) = \sum_{i=1}^{n}\left[\log(u_i^2+\sigma^2) + \frac{\left(x_i - \dfrac{\sum_{j=1}^{n} x_j/(u_j^2+\sigma^2)}{\sum_{j=1}^{n} 1/(u_j^2+\sigma^2)}\right)^{2}}{u_i^2+\sigma^2}\right].$$
This is found by searching between zero and some upper bound, say (xmax − xmin)² where xmax and xmin are the largest and smallest values of x1, . . . , xn. Finally the estimate θ̂ is given by

$$\hat\theta = \frac{\sum_{i=1}^{n} x_i/(u_i^2+\hat\sigma^2)}{\sum_{i=1}^{n} 1/(u_i^2+\hat\sigma^2)}.\qquad(6)$$
It is clear from symmetry that θ̂ is an unbiased estimate of θ under this model. So a suitable estimate of the parent standard deviation of θ̂ can act as the standard uncertainty of θ̂. One possibility is

$$u(\hat\theta) = \sqrt{\frac{1}{\sum_{i=1}^{n} 1/(u_i^2+\hat\sigma^2)}}.\qquad(7)$$
If σ² were equal to σ̂² then this figure describes the smallest possible standard deviation of any unbiased linear estimator of θ. So, in practice, u(θ̂) in (7) might tend to be smaller than the parent standard deviation of θ̂. Another possibility is the standard deviation of the ML estimator of θ that would apply if θ and σ² were equal to θ̂ and σ̂², which is a figure that can be found by simulation. We generate a set of simulated measurement results according to the model

$$\tilde{x}_i \leftarrow \mathrm{N}(\hat\theta, u_i^2 + \hat\sigma^2),\quad i = 1, \ldots, n,$$

and then apply the estimation procedure to the $\tilde{x}_i$ values and the $u_i^2$ values to obtain a simulated estimate $\tilde{\hat\theta}$. (The ˜ indicates a simulated value.) This is repeated m times to form a set of simulated estimates $\tilde{\hat\theta}_1, \ldots, \tilde{\hat\theta}_m$. Then the standard uncertainty to associate with θ̂ is

$$u^{*}(\hat\theta) = \sqrt{\frac{1}{m}\sum_{j=1}^{m}\left(\tilde{\hat\theta}_j - \hat\theta\right)^{2}}.\qquad(8)$$
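A minimal numerical sketch of the Model II fit just described (an illustration only, not the author's implementation; the grid sizes are arbitrary choices) is:

```python
import numpy as np

def _profile_Q(s2, x, u2):
    # objective (5) with theta replaced by its weighted mean, i.e. Q(sigma^2)
    w = 1.0 / (u2 + s2)
    th = np.sum(w * x) / np.sum(w)
    return np.sum(np.log(u2 + s2) + w * (x - th) ** 2)

def _ml_fit(x, u2, grid):
    s2 = grid[np.argmin([_profile_Q(s, x, u2) for s in grid])]
    w = 1.0 / (u2 + s2)
    return np.sum(w * x) / np.sum(w), s2

def fit_model_ii(x, u, n_grid=400, m=1000, seed=0):
    """Maximum-likelihood fit of Model II: x_i <- N(theta, u_i^2 + sigma^2)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float)
    u2 = np.asarray(u, float) ** 2
    grid = np.linspace(0.0, (x.max() - x.min()) ** 2, n_grid)   # search range for sigma^2
    theta_hat, s2_hat = _ml_fit(x, u2, grid)                    # estimates as in (6)
    u_simple = np.sqrt(1.0 / np.sum(1.0 / (u2 + s2_hat)))       # uncertainty (7)
    # parametric bootstrap as in (8): simulate, re-fit, take the spread of the estimates
    boot = np.empty(m)
    for j in range(m):
        xt = rng.normal(theta_hat, np.sqrt(u2 + s2_hat))
        boot[j], _ = _ml_fit(xt, u2, grid)
    u_star = np.sqrt(np.mean((boot - theta_hat) ** 2))          # uncertainty (8)
    return theta_hat, s2_hat, u_simple, u_star
```

Applied to the data of Tables 1 or 2 below, such a function should reproduce estimates close to those quoted in Sections 5 and 6; the grid search can of course be replaced by any one-dimensional optimiser.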
Model II is a straightforward extension of Model I designed to accommodate situations where an unprejudiced assessment of a set of data is
required. Although it supposes the existence of a shared extra variance, it does permit the extra error enc,i to be negligible for almost all of the laboratories. 3. Model III A natural modification to Model II involves allowing the extra errors enc,1 , . . . , enc,n to be drawn from distributions with different unknown variances σ12 , . . . , σn2 . The model becomes xi ← N(θ, u2i + σi2 ),
i = 1, . . . , n,
with θ and each σi² being unknown. We call this Model III. There are now n + 1 unknown parameters, but we only wish to estimate θ. The likelihood function under this model is

$$L(\theta,\sigma_1^2,\ldots,\sigma_n^2) = \prod_{i=1}^{n}\frac{1}{\sqrt{2\pi(u_i^2+\sigma_i^2)}}\exp\left\{\frac{-(x_i-\theta)^2}{2(u_i^2+\sigma_i^2)}\right\}.\qquad(9)$$
Let θ̂ indicate the estimate that we shall obtain of θ. Even though θ̂ is as yet unknown, (9) implies that the corresponding estimates of σ1², . . . , σn² are the values minimising the sum $\sum_{i=1}^{n} H_i(\sigma_i^2)$ where

$$H_i(\sigma_i^2) = \log(u_i^2+\sigma_i^2) + \frac{(x_i-\hat\theta)^2}{u_i^2+\sigma_i^2}.\qquad(10)$$

This means minimising each $H_i(\sigma_i^2)$ term, because each is unrelated. From

$$\frac{\partial H_i(\sigma_i^2)}{\partial \sigma_i^2} = \frac{1}{u_i^2+\sigma_i^2} - \frac{(x_i-\hat\theta)^2}{(u_i^2+\sigma_i^2)^2}$$

we infer that $H_i(\sigma_i^2)$ has a minimum at $\sigma_i^2 = (x_i-\hat\theta)^2 - u_i^2$ and that this is the only minimum. So, because $\sigma_i^2 \ge 0$, the fitted value of $\sigma_i^2$ is

$$\hat\sigma_i^2 = \max\{(x_i-\hat\theta)^2 - u_i^2,\, 0\}.\qquad(11)$$
If t indicates a possible value for θ̂ then the corresponding fitted value of $u_i^2+\sigma_i^2$ is $\max\{(x_i-t)^2, u_i^2\}$. So, from (10), we set θ̂ to be the value of t that minimises

$$Q^{*}(t) = \sum_{i=1}^{n}\left[\log \max\{(x_i-t)^2, u_i^2\} + \frac{(x_i-t)^2}{\max\{(x_i-t)^2, u_i^2\}}\right].$$

That is, we set

$$\hat\theta = \mathrm{argmin}_t\, Q^{*}(t).\qquad(12)$$

This estimate can be found by searching over t between the lowest and highest values of xi. The corresponding maximum-likelihood choice for $\sigma_i^2$ is then given by (11), and - like (7) - one simple figure of standard uncertainty is

$$u(\hat\theta) = \sqrt{\frac{1}{\sum_{i=1}^{n} 1/(u_i^2+\hat\sigma_i^2)}}.\qquad(13)$$
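The corresponding Model III computation is a one-dimensional search of Q*(t) over the data range; a possible sketch (again illustrative only, with an arbitrary grid size) is:

```python
import numpy as np

def fit_model_iii(x, u, n_grid=20_000):
    """Maximum-likelihood fit of Model III by the search (12) over t."""
    x = np.asarray(x, float)
    u2 = np.asarray(u, float) ** 2
    ts = np.linspace(x.min(), x.max(), n_grid)
    # Q*(t) of the text, evaluated on the grid of candidate values t
    v = np.maximum((x[None, :] - ts[:, None]) ** 2, u2[None, :])
    Q = np.sum(np.log(v) + (x[None, :] - ts[:, None]) ** 2 / v, axis=1)
    theta_hat = ts[np.argmin(Q)]                              # estimate (12)
    s2_hat = np.maximum((x - theta_hat) ** 2 - u2, 0.0)       # fitted variances (11)
    u_theta = np.sqrt(1.0 / np.sum(1.0 / (u2 + s2_hat)))      # uncertainty (13)
    return theta_hat, s2_hat, u_theta
```

Because Q*(t) is not smooth, a fine grid between the smallest and largest xi is a simple and adequate strategy, matching the search described above.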
From symmetry, it is clear that θ̂ is an unbiased estimate of θ. So, as in Model II, we could instead take the standard uncertainty of θ̂ to be the parent standard deviation of θ̂ under the condition that the parameters θ, σ1², . . . , σn² are equal to the fitted values θ̂, σ̂1², . . . , σ̂n². Again, we can evaluate this standard deviation by simulating the measurement process many times. Thus for i = 1, . . . , n we draw a value $\tilde{x}_i$ from the distribution $\mathrm{N}(\hat\theta, u_i^2 + \hat\sigma_i^2)$, and then we apply the estimation procedure to the $\tilde{x}_i$ values and the $u_i^2$ values to obtain a simulated estimate $\tilde{\hat\theta}$. This is repeated m times to form a set of simulated estimates $\tilde{\hat\theta}_1, \ldots, \tilde{\hat\theta}_m$. Then, as in (8), u*(θ̂) is given by

$$u^{*}(\hat\theta) = \sqrt{\frac{1}{m}\sum_{j=1}^{m}\left(\tilde{\hat\theta}_j - \hat\theta\right)^{2}}.\qquad(14)$$

4. Model IV
Let us now consider a model that allows many of the enc,i errors to be exactly zero. We suppose that laboratory i had probability λi of incurring a non-zero enc,i error and that this error would be drawn from the normal distribution with mean 0 and unknown variance σi². The model is $x_i \leftarrow \mathrm{N}(\theta, u_i^2 + k_i\sigma_i^2)$ with $k_i \leftarrow \mathrm{Bernoulli}(\lambda_i)$. (A Bernoulli variable with parameter λi takes the value 1 with probability λi and takes the value 0 otherwise.) There are now 2n + 1 unknown parameters, but our primary attention is on estimating θ. The parent probability distribution of xi is now a mixture of the distributions $\mathrm{N}(\theta, u_i^2)$ and $\mathrm{N}(\theta, u_i^2 + \sigma_i^2)$ in the ratio (1 − λi) : λi. This mixture distribution has probability density function

$$f_i(x) = \frac{1-\lambda_i}{\sqrt{2\pi u_i^2}}\exp\left\{\frac{-(x-\theta)^2}{2u_i^2}\right\} + \frac{\lambda_i}{\sqrt{2\pi(u_i^2+\sigma_i^2)}}\exp\left\{\frac{-(x-\theta)^2}{2(u_i^2+\sigma_i^2)}\right\}.$$
The likelihood function is $\prod_{i=1}^{n} f_i(x_i)$. Setting each λi to zero gives Model I while setting each λi to one gives Model III. Again, let θ̂ denote the MLE of θ. Even though θ̂ is as yet unknown, the corresponding fitted values of λ1, . . . , λn, σ1², . . . , σn² are the values maximising $\prod_{i=1}^{n} g(\lambda_i, \sigma_i^2)$ where

$$g(\lambda_i,\sigma_i^2) = \frac{1-\lambda_i}{u_i}\exp\left(\frac{-(x_i-\hat\theta)^2}{2u_i^2}\right) + \frac{\lambda_i}{\sqrt{u_i^2+\sigma_i^2}}\exp\left(\frac{-(x_i-\hat\theta)^2}{2(u_i^2+\sigma_i^2)}\right),$$

subject to 0 ≤ λi ≤ 1 and σi² > 0. This means maximising each of the individual $g(\lambda_i, \sigma_i^2)$ factors separately. Setting $\partial g(\lambda_i,\sigma_i^2)/\partial\lambda_i = 0$ implies that

$$\frac{1}{u_i}\exp\left(\frac{-(x_i-\hat\theta)^2}{2u_i^2}\right) = \frac{1}{\sqrt{u_i^2+\sigma_i^2}}\exp\left(\frac{-(x_i-\hat\theta)^2}{2(u_i^2+\sigma_i^2)}\right).\qquad(15)$$
So if λ̂i ≠ 0, 1 then (15) holds with σ̂i² replacing σi². Also, setting $\partial g(\lambda_i,\sigma_i^2)/\partial\sigma_i^2 = 0$ implies that either λi = 0 or

$$-\frac{1}{u_i^2+\sigma_i^2} + \frac{(x_i-\hat\theta)^2}{(u_i^2+\sigma_i^2)^2} = 0,$$

in which case $(x_i-\hat\theta)^2 = u_i^2 + \sigma_i^2$. So if λ̂i ≠ 0, 1 then, using this result and (15), we find that σ̂i² satisfies

$$\frac{u_i^2+\hat\sigma_i^2}{u_i^2}\exp\left(-\frac{u_i^2+\hat\sigma_i^2}{u_i^2}\right) = \exp(-1),$$

which implies that σ̂i² = 0. Thus, if λ̂i ≠ 0, 1 then σ̂i² = 0, in which case the value of λi does not matter, and we recover the solution under Model I. Also, if λ̂i = 0 then we again recover the solution under Model I. However, if λ̂i = 1 then we recover the solution under Model III. Model III encompasses Model I as a special case, so the value of the likelihood function at the solution under Model III must be at least as large as the value of the likelihood function at the solution under Model I. From this we can infer that – when fitting is carried out by the method of maximum likelihood – the model described in this section leads to the same result as Model III. Therefore, this model is not considered further as a means of solution.
5. Example: the gravitational constant G
Consider the formation of a combined estimate of Newton’s gravitational constant from the 10 ordered measurement results given in Table 1.3

Table 1: Measurement results for G (10−11 m3 kg−1 s−2)
 i   xi         ui         |   i   xi         ui
 1   6.6709     0.0007     |   6   6.67407    0.00022
 2   6.67259    0.00043    |   7   6.67422    0.00098
 3   6.6729     0.0005     |   8   6.674255   0.000092
 4   6.67387    0.00027    |   9   6.67559    0.00027
 5   6.6740     0.0007     |  10   6.6873     0.0094
Analysis is carried out in the units of 10−11 m3 kg−1 s−2. With Model I we obtain, from (3) and (4), θ̂ = 6.674186 and u(θ̂) = 0.000074. With Model II we obtain, from (6) and (7), θ̂ = 6.673689 and u(θ̂) = 0.000401, with σ̂² = 1.21 × 10−6. With Model III we obtain, from (12) and (13), θ̂ = 6.674195 and u(θ̂) = 0.000081, with {σ̂i²} = {1.04 × 10−5, 2.39 × 10−6, 1.43 × 10−6, 3.27 × 10−8, 0, 0, 0, 0, 1.87 × 10−6, 8.34 × 10−5}. Using Model III instead of Model II brings the estimate back towards the value obtained with Model I. This was a pattern observed in other examples also. Figure 1 shows the corresponding intervals θ̂ ± u(θ̂) and the intervals $x_i \pm u_i$ for Model I, $x_i \pm \sqrt{u_i^2+\hat\sigma^2}$ for Model II and $x_i \pm \sqrt{u_i^2+\hat\sigma_i^2}$ for Model III. In accordance with (11), every laboratory with σ̂i² > 0 has its interval $x_i \pm \sqrt{u_i^2+\hat\sigma_i^2}$ with one end-point at the estimate θ̂.
Fig. 1. Gravitational constant G: estimate ± standard uncertainty (10−11 m3 kg−1 s−2), shown for Models I, II and III.
6. Example: Planck’s constant
Similarly, consider the formation of a combined estimate of Planck’s constant from the 20 ordered measurement results given in Table 2.4 Analysis is carried out in the units of 10−34 J s. Model I gives, from (3) and (4), θ̂ = 6.62606993 and u(θ̂) = 0.00000010. Model II gives, from (6) and (7), θ̂ = 6.62606986 and u(θ̂) = 0.00000020, with σ̂² = 2.17 × 10−13. Model III gives, from (12) and (13), θ̂ = 6.62607004 and u(θ̂) = 0.00000011 with maxi{σ̂i²} = 1.23 × 10−11. Figure 2 presents the results graphically.

Table 2: Measurement results for Planck’s constant (10−34 J s)
 i   xi          ui          |   i   xi          ui
 1   6.6260657   0.0000088   |  11   6.62607000  0.00000022
 2   6.6260670   0.0000042   |  12   6.62607003  0.00000020
 3   6.6260682   0.0000013   |  13   6.62607009  0.00000020
 4   6.6260684   0.0000036   |  14   6.62607063  0.00000043
 5   6.6260686   0.0000044   |  15   6.626071    0.000011
 6   6.6260686   0.0000034   |  16   6.6260712   0.0000013
 7   6.62606887  0.00000052  |  17   6.62607122  0.00000073
 8   6.62606891  0.00000058  |  18   6.6260715   0.0000012
 9   6.62606901  0.00000034  |  19   6.6260729   0.0000067
10   6.6260691   0.0000020   |  20   6.6260764   0.0000053
Fig. 2. Planck's constant: estimate ± standard uncertainty (10⁻³⁴ J s) for Models I, II and III.
7. Performance assessment

We envisage each error distribution being symmetric, so each method is unbiased and its performance can be judged by its standard error. Model I will perform best when the ui uncertainties are correctly assessed, so we also consider applying Model I unless the data fail the consistency test undertaken by comparing the statistic Σᵢ₌₁ⁿ (xi − θ̂)²/ui² with the 95th percentile of the chi-square distribution with n − 1 degrees of freedom. The five methods of analysis studied were therefore:

I - apply Model I
II - apply Model II
III - apply Model III
I+II - apply Model II if Model I fails the chi-square test
I+III - apply Model III if Model I fails the chi-square test.

Datasets were generated by mechanisms obeying Models I to III. There were four settings for the main parameters and five settings for the extra variances (which are nuisance parameters), as follows (a sketch of one such simulated experiment is given after the lists):

θ = 0, n = 8 and {ui} = {1, 1, 1, 1, 1, 1, 1, 1}
θ = 0, n = 8 and {ui} = {1, 1, 1, 1, 9, 9, 9, 9}
θ = 0, n = 8 and {ui} = {1, 1, 4, 4, 4, 4, 9, 9}
θ = 0, n = 8 and {ui} = {1, 4, 4, 4, 4, 4, 4, 4}

{extra variances} = {0, 0, 0, 0, 0, 0, 0, 0} (as per Model I)
{extra variances} = {1, 1, 1, 1, 1, 1, 1, 1} (as per Model II)
{extra variances} = {9, 9, 9, 9, 9, 9, 9, 9} (as per Model II)
{extra variances} = {15, 0, 0, 0, 0, 0, 0, 0} (as per Model III)
{extra variances} = {15, 15, 0, 0, 0, 0, 0, 0} (as per Model III).
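The sketch below illustrates one simulated experiment under one of these settings. It is an illustration only: the Model I estimator is again taken to be the inverse-variance weighted mean, and the maximum likelihood fits of Models II and III (equations (6)–(7) and (12)–(13), given earlier in the paper) are not reproduced here.

```python
# One simulated experiment for {u_i} = {1,1,1,1,9,9,9,9} with extra
# variances {15,15,0,...,0} (a Model III generating mechanism).
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
theta = 0.0
u = np.array([1., 1., 1., 1., 9., 9., 9., 9.])
extra = np.array([15., 15., 0., 0., 0., 0., 0., 0.])   # extra variances

x = rng.normal(theta, np.sqrt(u**2 + extra))           # simulated results

w = 1.0 / u**2
theta_I = np.sum(w * x) / np.sum(w)                    # Model I estimate

# Chi-square consistency test used by methods I+II and I+III.
fails = np.sum((x - theta_I)**2 / u**2) > chi2.ppf(0.95, len(x) - 1)
# If 'fails' is True, method I+II would refit with Model II and method
# I+III with Model III; otherwise the Model I estimate is retained.
print(theta_I, fails)
```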
For each of the 20 corresponding combinations, there were 10,000 simulated experiments. The appropriate row in Table 3 shows the standard errors for the five methods normalized to the value of the best performing method, which is given the value 1. (So entries in Table 3 may be compared across rows but not down columns.) To indicate relatively poor performance, entries of 1.2 or more have been italicised. Bold type has been used to indicate where the best performing model was not the model under which the data were actually generated. The results support the use of Model II with or without a chi-square test. They indicate that Model II performed relatively well with data generated under Model I. Also they show it to be the best performing model
in several of the scenarios involving data generated under Model III, which was the most general of the three models. This phenomenon will be associated with the simpler model having a smaller number of free parameters.

Table 3: Relative standard errors of estimators of θ

{ui} (known) | extra variances | I | II | III | I+II | I+III
{1,1,1,1,1,1,1,1} | {0,0,0,0,0,0,0,0} | 1.00 | 1.00 | 1.17 | 1.00 | 1.05
{1,1,1,1,1,1,1,1} | {1,1,1,1,1,1,1,1} | 1.00 | 1.00 | 1.33 | 1.00 | 1.26
{1,1,1,1,1,1,1,1} | {9,9,9,9,9,9,9,9} | 1.00 | 1.00 | 1.54 | 1.00 | 1.54
{1,1,1,1,1,1,1,1} | {15,0,0,0,0,0,0,0} | 1.33 | 1.33 | 1.01 | 1.33 | 1.00
{1,1,1,1,1,1,1,1} | {15,15,0,0,0,0,0,0} | 1.53 | 1.53 | 1.00 | 1.53 | 1.00
{1,1,1,1,9,9,9,9} | {0,0,0,0,0,0,0,0} | 1.00 | 1.00 | 1.17 | 1.00 | 1.04
{1,1,1,1,9,9,9,9} | {1,1,1,1,1,1,1,1} | 1.00 | 1.00 | 1.29 | 1.00 | 1.18
{1,1,1,1,9,9,9,9} | {9,9,9,9,9,9,9,9} | 1.02 | 1.00 | 1.43 | 1.00 | 1.42
{1,1,1,1,9,9,9,9} | {15,0,0,0,0,0,0,0} | 1.51 | 1.45 | 1.00 | 1.45 | 1.01
{1,1,1,1,9,9,9,9} | {15,15,0,0,0,0,0,0} | 1.23 | 1.17 | 1.00 | 1.17 | 1.00
{1,1,4,4,4,4,9,9} | {0,0,0,0,0,0,0,0} | 1.00 | 1.06 | 1.16 | 1.04 | 1.04
{1,1,4,4,4,4,9,9} | {1,1,1,1,1,1,1,1} | 1.00 | 1.03 | 1.20 | 1.04 | 1.11
{1,1,4,4,4,4,9,9} | {9,9,9,9,9,9,9,9} | 1.15 | 1.00 | 1.35 | 1.01 | 1.34
{1,1,4,4,4,4,9,9} | {15,0,0,0,0,0,0,0} | 1.32 | 1.00 | 1.02 | 1.02 | 1.03
{1,1,4,4,4,4,9,9} | {15,15,0,0,0,0,0,0} | 1.32 | 1.00 | 1.33 | 1.03 | 1.34
{1,4,4,4,4,4,4,4} | {0,0,0,0,0,0,0,0} | 1.00 | 1.11 | 1.17 | 1.05 | 1.05
{1,4,4,4,4,4,4,4} | {1,1,1,1,1,1,1,1} | 1.00 | 1.02 | 1.15 | 1.02 | 1.05
{1,4,4,4,4,4,4,4} | {9,9,9,9,9,9,9,9} | 1.34 | 1.00 | 1.37 | 1.07 | 1.34
{1,4,4,4,4,4,4,4} | {15,0,0,0,0,0,0,0} | 1.63 | 1.00 | 1.25 | 1.15 | 1.29
{1,4,4,4,4,4,4,4} | {15,15,0,0,0,0,0,0} | 1.65 | 1.00 | 1.28 | 1.14 | 1.31
References

1. R. Willink, Statistical determination of a comparison reference value using hidden errors, Metrologia 39, 343 (2002).
2. M. G. Vangel and A. L. Rukhin, Maximum Likelihood Analysis for Heteroscedastic One-Way Random Effects ANOVA in Interlaboratory Studies, Biometrics 55, 129 (1999).
3. V. Dose, Bayesian estimate of the Newtonian constant of gravitation, Meas. Sci. Technol. 18, 176 (2007).
4. G. Mana, E. Massa and M. Predescu, Model selection in the average of inconsistent data: an analysis of the measured Planck-constant values, Metrologia 49, 492 (2012).
5. R. Willink, Comments on 'Bayesian estimate of the Newtonian constant of gravitation' with an alternative analysis, Meas. Sci. Technol. 18, 2275 (2007).
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. 90–97)
MODEL FOR EMOTION MEASUREMENTS IN ACOUSTIC SIGNALS AND ITS ANALYSIS Y. BAKSHEEVA Radioelectronic System Department, St. Petersburg State University of Aerospace Instrumentation St. Petersburg, 190000, Russian Federation † E-mail: [email protected] www.guap.ru K. SAPOZHNIKOVA, R. TAYMANOV† Computerized Sensors and Measuring Systems Laboratory, D. I. Mendeleyev Institute for Metrology St. Petersburg, 190005, Russian Federation E-mail: [email protected], †[email protected] www.vniim.ru In the paper a hypothesis concerning the mechanism of emotion formation as a result of perception of acoustic impacts is justified. A model for measuring emotions and some methods of emotion information processing are described, which enable signals-stimuli and their ensembles causing the emotions to be revealed. Keywords: Measurement Model, Acoustic Signals, Emotion Measurement
1. Introduction As civilization develops, the priorities of the tasks that society puts before metrology change. In recent decades the emphasis of scientific research has shifted more and more towards the study of humans: their abilities, the special features of their communication and perception of external impacts, their interaction with the environment, etc. Interest in measuring quantities characterizing properties that until recently were considered immeasurable is increasing [1]. They were regarded as nominal properties that, according to [2], have "no magnitude". For the most part, these quantities are of a multiparametric (multidimensional) character. The approach used in processing the results of such measurements, as well as the reliability of the results, depends to a significant extent on the measurement model. In
fact, the measurement model reflects the designers' conception of the "mechanism" by which the corresponding quantities are formed. Its design is associated with a step-by-step development of this conception. In developing such a model it is necessary to use knowledge from fields far from metrology and to put forward hypotheses based on it.

2. Stages of development and justification of a measurement model

The experience gained in developing a model for emotion measurement in musical fragments, communication (biolinguistic) signals of animals, as well as other acoustic signals with emotional colour, is representative. The statement of the task with regard to the possibility of measuring emotions in acoustic signals is based on the hypothesis that the considered signals contain certain "signals-stimuli" in the infrasound and low part of the sound range (hereinafter referred to as the IFR), approximately up to 30 Hz. These signals-stimuli initiate the emotions [3-5]. At the first stage of the model development it was required:

• to put forward and justify a hypothesis that the selection of signals-stimuli from complicated acoustic signals of various types (for example, chords) is carried out by nonlinear conversion;
• to determine possible parameters that can describe these signals-stimuli;
• to reveal the correlation of some signals-stimuli with certain emotions and to build a simplified measurement scale (nominal scale);
• to evaluate the ranges of variation of the signals-stimuli parameters;
• to prove that at a certain stage of evolution, nonlinear conversion of acoustic communication signals was included in the "mechanism" of emotion formation.

Within the frame of this proof it was demonstrated that the evolution of biolinguistic signals proceeded along the path of increasing the number and frequency of IFR signals and, later on, of forming ensembles of such signals. Shrimps Alpheidae have only one communication signal (a danger signal). Crabs Uca annulipes emit two signals whose emotional colour differs. Fishes use two or three types of signals. When amphibia and, later, reptiles left water and settled on dry land, where the density of the medium was significantly lower, they needed to keep a "vocabulary" of vitally important signals. This resulted in the use of modulation in biolinguistic signals as well as their demodulation for perception (with the help of nonlinear conversion).
Highly developed animals (birds, mammals) have in their arsenal many more signals, whose meaning and emotional colour differ, but they have preserved the ancient signals-stimuli. On the whole, the work performed at the first stage made it possible to develop the simplest measurement model of the "mechanism" providing formation of emotions. The results of the corresponding investigations are published in [6] and other papers. The model contains a nonlinear converter, a selector of the frequency zone of the energy maximum, a selector of signals-stimuli, as well as a comparison and recognition unit. In the comparison and recognition unit, the frequencies of the selected signals-stimuli are compared with frequency intervals on a scale of elementary emotions. The functionality of the simplest model was tested by "decoding" the emotional content of fragments of drum ethnic music and bell rings [5, 7]. At the second stage of the model development its limitations were analyzed. Special measures were taken for the step-by-step removal of the above-mentioned limitations. This required:

• to suggest and substantiate a hypothesis about the way in which signals-stimuli are singled out when listening to the simplest melodies;
• to evaluate the parameters characterizing this process;
• to show the role of an associative memory in the "mechanism" of emotion formation considered at the first stage.

Investigations [5-7] demonstrated the necessity of improving the simplest measurement model. It was supplemented with:

• a preselector that restricts the frequency and amplitude range of perceived acoustic signals X;
• a time delay unit that can delay signals in order to form elementary emotions Y while listening to sounds with a changing frequency;
• a memory unit assigned for memorizing the ensembles consisting of 3-4 signals-stimuli;
• an associative memory unit assigned for memorizing emotional images corresponding to certain signals-stimuli (it carries out the function of a multidimensional scale of emotional images);
• a comparison and recognition unit 2, which forms emotional images Z.

The improved model linking emotions and the acoustic signals that caused them is shown in Fig. 1. The emotional content of some animal biolinguistic signals in various situations was "decoded" in order to study the capabilities of this model [6-8]. It should be emphasized that, as further study of the "mechanism" providing
formation of human emotions proceeds, the structure of the improved measurement model may be corrected somewhat. The third stage, being performed at present, provides for optimization of the measurement model parameters. The results of the work at this stage should become a basis for designing a special measurement instrument capable of measuring the expected emotional reaction of listeners to various acoustic impacts.
Fig. 1. Measurement model.
3. Optimization of the conversion function In the experiments carried out at the previous stages of the measurement model development, a nonstationary acoustic signal converted nonlinearly was presented as a Fourier spectrum in the IFR. The duration of the acoustic signal fragments under investigation was from fractions of a second up to a few seconds. Within such a time interval the acoustic signal can contain a number of signals-stimuli. The corresponding Fourier spectrum in the IFR included a large number of spectral components. This was caused by several reasons. Firstly, the signals-stimuli can have a nonsinusoidal form, i.e. they can contain a number of harmonics "masking" the remaining signals-stimuli. Secondly, the short duration of the analyzed fragments results in a supplementary distortion of the spectrum. In addition, the spectrum of emotionally coloured acoustic signals is to some extent "blurred" owing to special features of the "instrument" emitting them with some modulation.
Optimization of the conversion function is aimed at finding a form of the nonlinearly converted signal such that the number of components considered to be efficient signals-stimuli is minimal. This requirement is caused by the fact that the scale of elementary emotions has a comparatively small number of gradations. As a first result of this search, taking into account special features of the Fourier transform [9, 10], a modified algorithm of signal presentation was proposed. In this algorithm the Fourier spectrum under consideration is subjected to a supplementary transform. It was proposed to select the most probable "basic" frequencies of signals-stimuli in the original spectrum and then, on the basis of the corresponding oscillations, to form synthesized signals-stimuli, using the harmonics of oscillations with the "basic" frequency and the components with frequencies close to them. A resulting signal-stimulus contains only one spectral line, at the "basic" frequency, in the modified spectrum. For the components remaining in the original spectrum, the procedure is repeated until a set of signals-stimuli containing the greater part of the IFR spectrum energy has been synthesized (a schematic sketch of this iterative selection is given below). Fig. 2 and Fig. 3 illustrate the efficiency of the modified algorithm. Fig. 2 shows the growling of a cheetah as the analyzed signal [11].
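The paper describes the modified algorithm only verbally; the sketch below is purely illustrative of the iterative "basic"-frequency selection. The function name, the grouping tolerance and the stopping rule based on a fraction of the IFR energy are assumptions, not the authors' choices.

```python
# Illustrative sketch of the iterative "basic"-frequency selection on the
# IFR part (<= 30 Hz) of a Fourier spectrum; thresholds are assumed values.
import numpy as np

def synthesize_stimuli(signal, fs, f_max=30.0, tol=1.0, energy_frac=0.9):
    """Pick the most energetic ('basic') line, pool it with components close
    to its harmonics into one synthesized signal-stimulus, and repeat on the
    remainder until most of the IFR energy has been collected."""
    spec = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band = (freqs > 0) & (freqs <= f_max)
    spec, freqs = spec[band].copy(), freqs[band]

    total, collected, stimuli = spec.sum(), 0.0, []
    while collected < energy_frac * total and spec.max() > 0:
        k = np.argmax(spec)                       # most probable "basic" line
        f0 = freqs[k]
        members = np.zeros_like(spec, dtype=bool)
        for h in range(1, int(f_max // f0) + 1):  # pool nearby harmonics
            members |= np.abs(freqs - h * f0) <= tol
        stimuli.append((f0, spec[members].sum()))
        collected += spec[members].sum()
        spec[members] = 0.0                       # repeat on the remainder
    return stimuli
```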
Fig. 2. Spectra of the IFR signals after nonlinear conversion. Growling of cheetah (axis of abscissa is the frequency, Hz; ordinate axis is the level of spectrum components, relative units); a) Fourier spectrum, b) spectrum modified.
A Fourier spectrum of a nonlinearly converted signal in the IFR is shown in Fig. 2a). Fig. 2b) shows the spectrum of the same nonlinearly converted signal in the IFR processed with the help of the modified algorithm. In Fig. 3, Fourier spectra and modified spectra of signals emitted by various animals (a dhole, a white-naped crane, and a red-breasted goose) in the process of their coupling [11] are given. The sounds emitted by these animals are perceived by ear quite differently, but their spectra in the IFR after nonlinear conversion are similar. This fact suggests the ancient origin of the corresponding emotion and the same "mechanism" of its origination in different animals. Of course, for analogous situations, the modified spectra of biolinguistic signals (after nonlinear conversion) can also differ between various animals. These differences are influenced by the "age" and complexity of an emotion, imperfect interpretation of the animal's behavior, technical distortions due to signal recording, etc. However, the efficiency of "basic" signals-stimuli selection using the modified algorithm indicates that the search in the chosen direction is promising.
Fig. 3. Spectra of the IFR signals after nonlinear conversion. Signals emitted by various animals in the process of their coupling (axis of abscissa is the frequency, Hz; ordinate axis is the level of spectrum components, relative units); a), c), and e) Fourier spectra; b), d), and f) spectra modified.
This approach can be applied not only to the analysis of biolinguistic signals, but also to the decoding of any acoustic signals that can form an emotional response of
listeners. A future plan is to continue the search using the wavelet transform and other methods for the study of nonstationary signals.

4. Conclusion

Papers in which measurements of multidimensional quantities are considered can be found in scientific journals more and more often. However, the experience gained in designing measurement models for such quantities has not yet received due generalization and methodological support in metrology. It should be noted that work in the field of multidimensional measurements has a wide spectrum of practical applications. In particular, the measurement model intended for investigating the relationship between acoustic signals and the emotions of listeners opens opportunities for applied work in the fields of musicology, medicine, mathematical linguistics, etc.

References

1. K. Sapozhnikova, A. Chunovkina, and R. Taymanov, "Measurement" and related concepts. Their interpretation in the VIM, Measurement 50(1), 390 (2014).
2. International Vocabulary of Metrology – Basic and General Concepts and Associated Terms, 3rd edn., 2008 version with minor corrections (BIPM, JCGM 200, 2012).
3. K. Sapozhnikova and R. Taymanov, About a measuring model of emotional perception of music, in Proc. XVII IMEKO World Congress (Dubrovnik, Croatia, 2003).
4. K. Sapozhnikova and R. Taymanov, Measurement of the emotions in music fragments, in Proc. 12th IMEKO TC1 & TC7 Joint Symposium on Man Science & Measurement (Annecy, France, 2008).
5. R. Taymanov and K. Sapozhnikova, Improvement of traceability of widely defined measurements in the field of humanities, Meas. Sci. Rev. 3 (10), 78 (2010).
6. K. Sapozhnikova and R. Taymanov, Role of measuring model in biological and musical acoustics, in Proc. 10th Int. Symposium on Measurement Technology and Intelligent Instruments (ISMTII-2011) (Daejeon, Korea, 2011).
7. R. Taymanov and K. Sapozhnikova, Measurements enable some riddles of sounds to be revealed, Key Eng. Mat. 613, 482 (2014).
8. R. Taymanov and K. Sapozhnikova, Measurement of multiparametric quantities at perception of sensory information by living creatures, EPJ Web of Conferences 77, 00016 (2014), http://epjwoc.epj.org/articles/epjconf/abs/2014/14/epjconf_icm2014_00016/epjconf_icm2014_00016.html
9. L. R. Rabiner and B. Gold, Theory and Application of Digital Signal Processing (Prentice-Hall Inc., 1975).
10. L. Yaroslavsky, Fast Transform Methods in Digital Signal Processing, v. 2 (Bentham E-book Series "Digital Signal Processing in Experimental Research", 2011).
11. Volodins Bioacoustic Group Homepage, Animal sound gallery, http://www.bioacoustica.org/gallery/gallery_eng.html
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. 98–104)
UNCERTAINTY CALCULATION IN GRAVIMETRIC MICROFLOW MEASUREMENTS E. BATISTA*, N. ALMEIDA, I. GODINHO AND E. FILIPE Instituto Português da Qualidade Caparica, 2828-513, Portugal * E-mail: [email protected] www.ipq.pt The primary realization of microflow measurements is often done by the gravimetric method. This new measurement field arises from the need of industry and laboratories to have their instruments traceable to reliable standards. In the frame of the EMRP (European Metrology Research Programme) a new project on metrology for drug delivery started in 2012 with the purpose of developing science and technology in the field of health. One of the main goals of this project is to develop primary microflow standards and, in doing so, also to develop the appropriate uncertainty calculation. To validate the results obtained through that model by the Volume Laboratory of the Portuguese Institute for Quality (IPQ), it was decided to apply both the GUM and MCM methodologies. Keywords: Microflow, uncertainty, drug delivery devices, calibration.
1. Introduction With the development of science and the widespread use of nanotechnology, the measurement of fluid flow has reached the order of microlitres per minute or even nanolitres per minute. In order to meet the needs of industry and laboratories in such fields as health, biotechnology, engineering and physics, the need to develop a primary standard for microflow measurement, to give traceability to these measurements, was identified not only nationally but also at the international level [1]. Therefore, in 2011, Metrology for Drug Delivery – MeDD [2] was funded by the EMRP. This joint research project (JRP) aims to develop the required metrology tools, and one of the chosen JRP subjects was Metrology for Health. The choice of this subject had the purpose of developing science and technology in the field of health, specifically to assure the traceability of clinical data, allowing the comparability of diagnostic and treatment information.
2. Microflow measurements The scientific laws used for the study of fluids at the macro scale are not always applicable to micro flows. This happens because some physical phenomena, like capillarity, thermal influences and evaporation, have a bigger influence on micro flow measurements than on larger flows. Based on recent studies [1], several parameters have to be taken into account, such as: thermal influence, dead volume, the system for delivering and collecting the fluid, continuity of the flow, pulsation of the flow generator, evaporation effects, surface tension/drop and capillarity effects, balance effects (buoyancy and impacts), contamination and air bubbles, variation of pressure, and time measurement. A reference for microflow is often the gravimetric setup, which requires a scale, a measuring beaker or collecting vessel, a flow generator and a water reservoir. This is the type of setup used by IPQ, one of the MeDD project partners, to calibrate drug delivery devices normally used in hospitals, microflow meters and other microflow generators. IPQ has two different setups that cover a range from 1 µL/h up to 600 mL/h with corresponding uncertainties from 4 % down to 0.15 %. Two types of scales are used according to the flow: a 20 g balance (AX26) with 0.001 mg resolution, Fig. 1 a), and a 200 g balance (XP205) with 0.01 mg resolution, Fig. 1 b). A data acquisition system was developed using the LabView graphical environment. Different modules were implemented for acquisition, validation, online data visualization, statistical processing and uncertainty calculation. The data acquisition is done directly from the balance every 250 ms and the measurement of time is done simultaneously.
Figure 1. a) IPQ microflow setup AX26
Figure 1. b) IPQ microflow setup XP205
3. Uncertainty calculation The measurement uncertainty of the gravimetric method used for microflow determination is estimated following the Guide to the Expression of Uncertainty in Measurement (GUM) [3]. The measurement model is presented along with the standard uncertainty components, the sensitivity coefficient values, the combined standard uncertainty and the expanded uncertainty. It was also decided to perform a validation by a robust method, using for that purpose the Monte Carlo Method (MCM) simulation, as described in GUM Supplement 1 [4], using the MATLAB programming software. The computational simulation was carried out using the same input information as used in the Law of Propagation of Uncertainties (LPU) evaluation, namely the same mathematical model (equation 1), estimates and assigned probability density functions (PDF) characterizing each input quantity. The number of Monte Carlo trials was 1.0×10⁶. 3.1. Measurement model The gravimetric dynamic measurement method is, by definition, the measurement of the mass of fluid collected during a specific time period. For volume flow rates (Q) the density of the liquid (ρW) is included in equation 1, along with the following components: final time (tf), initial time (ti), final mass (IL), initial mass (IE), air density (ρA), mass pieces density (ρB), expansion coefficient (γ), water temperature (T) and evaporation rate (δQevap):
−
−
−
×
−
×
−
×
! − "# $% + '()*+,
−
(1)
If the buoyancy correction (δmbuoy) of the dispensing needle is determined by:

δmbuoy = −(IL − IE) × (D²tube / D²tank)     (2)
where Dtube is the immersed tube diameter and Dtank is the measuring beaker diameter, then:

Q = [(IL − IE) × (1 − D²tube/D²tank) / (tf − ti)] × [1 / (ρW − ρA)] × (1 − ρA/ρB) × [1 − γ(T − 20)] + δQevap     (3)
The evaporation rate was determined by leaving the collecting vessel, full of
water, in the balance for 24 h, at the same conditions as the measurements are normally done.
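The measurement model of equations (1)–(3), as reconstructed above, can be sketched as follows. The numerical inputs in the example call are illustrative values of the same order of magnitude as those in Table 2, not the laboratory's raw data.

```python
# Minimal sketch of the gravimetric flow model (eqs. (1)-(3)); inputs are
# illustrative only.
def flow_rate(IL, IE, tf, ti, rho_W, rho_A, rho_B, gamma, T,
              dQ_evap, D_tube=0.0, D_tank=1.0):
    """Volume flow rate Q (mL/s) from the collected mass difference, with
    air-buoyancy, thermal-expansion, needle-buoyancy and evaporation terms."""
    mass = (IL - IE) * (1.0 - (D_tube / D_tank) ** 2)   # eq. (2) correction
    return (mass / (tf - ti)) * (1.0 / (rho_W - rho_A)) \
        * (1.0 - rho_A / rho_B) * (1.0 - gamma * (T - 20.0)) + dQ_evap

Q = flow_rate(IL=5.112, IE=5.06, tf=1910.0, ti=0.0,
              rho_W=0.9980639, rho_A=0.001202, rho_B=7.96,
              gamma=1e-5, T=20.68, dQ_evap=1.09e-7)
print(Q)   # of the order of 2.7e-5 mL/s for these illustrative inputs
```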
3.2. Uncertainty evaluation The main standard uncertainties considered are: mass measurements (m), density of the mass pieces (ρB), density of the water (ρW), density of the air (ρA), evaporation rate (δQevap), water temperature (T), time (t), expansion coefficient (γ), standard deviation of the measurements (δQrep) and buoyancy on the immersed dispensing needle (δQmbuoy). Detailed information regarding the uncertainty components is given in Table 1.

Table 1. Uncertainty components in the microflow measurements.

Uncertainty component | Standard uncertainty | Evaluation process | Evaluation type | Distribution
Final mass | u(IL) | Calibration certificate | B | Normal
Initial mass | u(IE) | Calibration certificate | B | Normal
Density of the water | u(ρW) | Literature | B | Rectangular
Density of the air | u(ρA) | Literature | B | Rectangular
Density of the mass pieces | u(ρB) | Calibration certificate | B | Rectangular
Temperature | u(T) | Calibration certificate | B | Normal
Expansion coefficient | u(γ) | Literature | B | Rectangular
Evaporation | u(δQevap) | Standard deviation of the measurements | A | Normal
Final time | u(tf) | Estimation (1 µs) | B | Rectangular
Initial time | u(ti) | Estimation (1 µs) | B | Rectangular
Buoyancy | u(δQmbuoy) | Calibration certificate | B | Normal
Repeatability | u(δQrep) | Standard deviation of the measurements | A | Normal
The combined uncertainty for the gravimetric method is given by the following equation:

u²(Q) = (∂Q/∂IL)²·u²(IL) + (∂Q/∂IE)²·u²(IE) + (∂Q/∂ρW)²·u²(ρW) + (∂Q/∂ρA)²·u²(ρA) + (∂Q/∂ρB)²·u²(ρB) + (∂Q/∂T)²·u²(T) + (∂Q/∂γ)²·u²(γ) + (∂Q/∂tf)²·u²(tf) + (∂Q/∂ti)²·u²(ti) + u²(δQmbuoy) + u²(δQevap) + u²(δQrep)     (4)
From the determined value of the coverage factor k and the combined standard uncertainty of the measurand, the expanded uncertainty is deduced by:

U = k × u(Q)     (5)

4. Results
A Nexus 3000 pump (microflow generator) was calibrated using the gravimetric method, with the AX26 balance, for a programmed flow rate of 0.1 mL/h. The measurement results were collected using a LabView application and the average of approximately 60 values was used to determine the measured flow, Fig. 2, with a mean value equal to 2.73×10⁻⁵ mL/s.

Figure 2. Flow measurement results (flow in mL/s versus time).
The uncertainty results using the GUM approach are presented in Table 2. A comparison between the GUM and MCM approaches (considering coverage intervals of 68 %, 95 % and 99 %) is presented in Table 3, where the estimated values of the output quantity with the associated standard uncertainties and the limits determined for each coverage interval are indicated. A difference of 2.8×10⁻¹¹ mL/s was obtained for the output quantity, which is a negligible value compared to the experimental system accuracy (5.7×10⁻⁸ mL/s).
Table 2. Uncertainty components in the calibration of a Nexus 3000 pump - GUM

Uncertainty component | Estimation | u(xi) | ci | (ci×u(xi))²
Final mass (g) | 5.12 | 4.8×10⁻⁶ | 5.25×10⁻⁴ | 6.46112×10⁻¹⁸
Density of water (g/mL) | 0.9980639 | 9.00×10⁻⁷ | -2.72×10⁻⁵ | 6.00614×10⁻²²
Density of air (g/mL) | 0.001202 | 2.89×10⁻⁷ | 2.38×10⁻⁵ | 4.72819×10⁻²³
Density of weights (g/mL) | 7.96 | 3.46×10⁻² | 5.15×10⁻¹⁰ | 3.1831×10⁻²²
Temperature (ºC) | 20.68 | 5.00×10⁻³ | -2.71×10⁻¹⁰ | 1.84216×10⁻²⁴
Expansion coefficient (/ºC) | 1×10⁻⁵ | 2.89×10⁻⁷ | -1.85×10⁻⁵ | 2.83938×10⁻²³
Initial mass (g) | 5.06 | 4.8×10⁻⁶ | -5.25×10⁻⁴ | 6.42455×10⁻¹⁸
Evaporation (mL/s) | 1.09×10⁻⁷ | 1.12×10⁻⁸ | 1 | 1.2544×10⁻¹⁶
Initial time (s) | 0.249 | 5.77×10⁻⁵ | 1.42×10⁻⁸ | 6.73403×10⁻²⁵
Final time (s) | 191 | 5.77×10⁻⁵ | -1.42×10⁻⁸ | 6.73403×10⁻²⁵
Buoyancy (g) | 0.0007 | 9.01×10⁻⁶ | 5.25×10⁻⁴ | 2.23655×10⁻¹⁷
Repeatability (mL/s) | 0 | 5.55×10⁻⁸ | 1 | 3.08025×10⁻¹⁵
Flow (mL/s) | 2.7254×10⁻⁵ | | |
ucomb (mL/s) | 5.7×10⁻⁸ | | |
Uexp (mL/s) | 1.1×10⁻⁷ | | |
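As a check of equation (4), the combined standard uncertainty can be recomputed directly from the sensitivity coefficients and standard uncertainties listed in Table 2. This is a minimal sketch; the coverage factor k = 2 is inferred from the ratio of Uexp to ucomb quoted in the table, not stated explicitly in the text.

```python
# Recompute u_comb and U_exp from the c_i and u(x_i) values of Table 2.
import numpy as np

u_xi = np.array([4.8e-6, 9.00e-7, 2.89e-7, 3.46e-2, 5.00e-3, 2.89e-7,
                 4.8e-6, 1.12e-8, 5.77e-5, 5.77e-5, 9.01e-6, 5.55e-8])
c_i  = np.array([5.25e-4, -2.72e-5, 2.38e-5, 5.15e-10, -2.71e-10, -1.85e-5,
                 -5.25e-4, 1.0, 1.42e-8, -1.42e-8, 5.25e-4, 1.0])

u_comb = np.sqrt(np.sum((c_i * u_xi) ** 2))   # combined standard uncertainty
U_exp = 2.0 * u_comb                          # expanded uncertainty, assumed k = 2
print(u_comb, U_exp)   # about 5.7e-8 mL/s and 1.1e-7 mL/s, as in Table 2
```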
Table 3. Comparison between GUM and MCM approaches

MCM: M (Monte Carlo trials) = 1.0×10⁶, y = 2.7254×10⁻⁵ mL/s, u = 5.6×10⁻⁸ mL/s
GUM: y = 2.7254×10⁻⁵ mL/s, u = 5.7×10⁻⁸ mL/s

Coverage probability | MCM u (mL/s) | MCM limits (mL/s) | GUM u (mL/s) | GUM limits (mL/s)
68 % ⇔ (y ± u) | 5.6×10⁻⁸ | 2.7197×10⁻⁵ – 2.7310×10⁻⁵ | 5.7×10⁻⁸ | 2.7197×10⁻⁵ – 2.7311×10⁻⁵
95 % ⇔ (y ± 1.96 × u) | 1.1×10⁻⁷ | 2.714×10⁻⁵ – 2.736×10⁻⁵ | 1.1×10⁻⁷ | 2.714×10⁻⁵ – 2.737×10⁻⁵
99 % ⇔ (y ± 2.68 × u) | 1.5×10⁻⁷ | 2.710×10⁻⁵ – 2.740×10⁻⁵ | 1.5×10⁻⁷ | 2.710×10⁻⁵ – 2.741×10⁻⁵
From the Monte Carlo simulation a normal probability density function of the output quantity was obtained, as presented in Fig. 3.
Figure 3. Probability density function of output quantity Q using MCM
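A GUM-S1 style Monte Carlo propagation of the kind described in Section 3 (which the authors implemented in MATLAB) can be sketched as follows in Python. The input estimates, in particular the collection time, are illustrative, and the repeatability and needle-buoyancy contributions are omitted for brevity, so the numbers will not reproduce Table 3 exactly.

```python
# Minimal GUM-S1 style Monte Carlo sketch for the gravimetric flow model;
# distributions assigned as in Table 1 (normal or rectangular).
import numpy as np

rng = np.random.default_rng(0)
M = 1_000_000                                    # Monte Carlo trials

def rect(c, u, size):
    """Rectangular PDF with mean c and standard uncertainty u."""
    half = u * np.sqrt(3.0)
    return rng.uniform(c - half, c + half, size)

IL   = rng.normal(5.112, 4.8e-6, M)              # illustrative estimates
IE   = rng.normal(5.06, 4.8e-6, M)
rhoW = rect(0.9980639, 9.00e-7, M)
rhoA = rect(0.001202, 2.89e-7, M)
rhoB = rect(7.96, 3.46e-2, M)
T    = rng.normal(20.68, 5.00e-3, M)
gam  = rect(1e-5, 2.89e-7, M)
tf   = rect(1910.0, 5.77e-5, M)                  # illustrative duration
ti   = rect(0.0, 5.77e-5, M)
dQev = rng.normal(1.09e-7, 1.12e-8, M)

Q = (IL - IE) / (tf - ti) / (rhoW - rhoA) * (1 - rhoA / rhoB) \
    * (1 - gam * (T - 20.0)) + dQev

print(Q.mean(), Q.std(ddof=1))   # MCM estimate and standard uncertainty
```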
5. Conclusions In the gravimetric determination of microflow there are several influence factors that make a major contribution to the uncertainty, owing to the very small amount of liquid used, namely the evaporation of the fluid and capillary effects like the buoyancy correction. The standard deviation of the measurements (repeatability) is also one of the major uncertainty sources. Comparing the results and the corresponding uncertainties obtained by the two approaches, it can be concluded that the estimated output quantity values obtained with the GUM and MCM approaches show excellent agreement (of the order of 10⁻¹¹ mL/s, negligible compared to the experimental system accuracy of 5.7×10⁻⁸ mL/s), which allows the validation of the methodology used for microflow measurements.

Acknowledgments This work is part of a project under the European Metrology Research Programme (EMRP), MeDD – Metrology for Drug Delivery.

References

1. C. Melvad, U. Kruhne and J. Frederiksen, IOP Publishing, Measurement Science and Technology, nº 21, 2010.
2. P. Lucas, E. Batista, H. Bissig et al., Metrology for drug delivery, research project 2012–2015, www.drugmetrology.com
3. JCGM 100:2008, Evaluation of measurement data – Guide to the expression of uncertainty in measurement, 1st ed., 2008.
4. BIPM, IEC, IFCC, ILAC, ISO, IUPAP and OIML, Evaluation of measurement data – Supplement 1 to the Guide to the Expression of Uncertainty in Measurement – Propagation of distributions using a Monte Carlo method, Joint Committee for Guides in Metrology, JCGM 101, 2008.
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. –)
UNCERTAINTIES PROPAGATION FROM PUBLISHED EXPERIMENTAL DATA TO UNCERTAINTIES OF MODEL PARAMETERS ADJUSTED BY THE LEAST SQUARES V.I. BELOUSOV, V.V. EZHELA∗ , Y.V. KUYANOV, S.B. LUGOVSKY, K.S. LUGOVSKY, N.P. TKACHENKO COMPAS group, IHEP, Science square 1 Protvino, Moscow region, 142280, Russia ∗ E-mail: [email protected] www.ihep.ru This report presents results of the indirect multivariate “measurements” of model parameters in a task of experimental data description by analytical models with “moderate” nonlinearity. In our “measurements” we follow the recommendations of the GUM-S1 and GUM-S2 documents in places where they are appropriate. Keywords: GUM; JCGM; Propagation of distributions; Indirect measurements; Uncertainty evaluation; Criteria for the measurement result; Numerical peer review.
1. Introduction Our task is as follows: for an available experimental data sample of Nd data points (x̂i, ui) we need to find an algebraic model ti(yj), dependent upon Np parameters yj, and a vector yj^ref such that for almost all i we will have |x̂i − ti(yj^ref)| ≤ ui, where ui is an estimate of the uncertainty of the random variable xi with mean (or estimated) value x̂i. This is an optimisation task that one can solve by the method of least squares (MLS) using the estimator function χ²(x̂, yj) defined as:

χ²(x̂, yj) = Σ_{i=1}^{Nd} [(x̂i − ti(yj)) / ui]²

The solution(s) is contained in the set of roots of the equation:

min_{yj} χ²(x̂i, yj) = χ²(x̂i, yj^ref)
106
If selected “best” vector yjref points to the local minimum of the sufficiently smooth estimator function then, due to necessity equations for extremums, vector yj is determined as a function upon random variables xi in the vicinity of values x ˆi yj = Fj (xi )
⇒
yjref = Fj (ˆ xi )
We see that in such tasks almost all requirements for the applicability of the GUM-S1 and GUM-S2 recommendations are met. In our case study we have conducted a few computer experiments on simultaneously indirect “measuring” of all components of the vector of adjustable parameters Yj using algebraic formulae for physical observables Σab (sab ; Yj ) and ℜab (sab ; Yj ) (also indirect measurands, but named here as direct for clarity) describing possible measurable √ outcomes after collisions of two particles a, b at total collision energy sab in the center of mass of colliding particles. Formulae comes from theory and phenomenology to model direct experimental data on σ ab (sab ) and ρab (sab ) and connect them with adjustable model parameters (indirect measurands). Best estimates of parameters (reference estimates) yjref are obtained by tuning parameters by MLS to obtain the “best” currently possible quantifiable consistency between theory and experiment. 2. Experimental data input Experimental data samples used are from recent compilations [1], [2] of the measurement results on the hadronic production total cross sections σ ab (sab ) and another measurands ρab (sab ) in various two particle collisions √ at center of mass energies sab above 5 GeV. Compilation were collected from published scientific reports (1960-2013). It contains 1047 data points ab ab ab ab of (σ ab (sab l ), u(σl )) or (ρ (sk ), u(ρk )) where u(...) stands for total experimental uncertainties at each energy point (marked as l or k). These data ref ab ab ref should be compared with model tables Σab (sab l ; yj ) and ℜ (sk ; yj ) calculated using our algebraic formulae with reference values yiref of adjusted parameters inserted. 3. Phenomenological models In this section we show results of data description by different variants of the model used in our mini-review on the current situation of the subject in
107
RPP 2013/2014 [1]. Each variant where adjusted on the same data sample by simultaneous fit to the data on collisions: (p, p) (p, n, d); Σ− p; π ∓ (p, n, d); K ∓ (p, n, d); γ p; γ γ; γ d. To trace the variation of the range of applicability of simultaneous √ fit results, several fits were produced with lower energy cutoffs: sab ≥ 5, ≥ 6, ≥ 7 GeV until the “uniformity” of the fit quality (FQ) across different collision will became acceptable with good value of overall fit quality (F Q = χ2 /ndf , F Q ≦ 1). 3.1. Model HPR1R2
Σa
∓
b
2 s H log sab M +P ab ab η1 = ab sM +R 1 s η 2 ±Rab sab M 2
ℜ
a∓ b
=
1 Σa ∓ b
s
Heisenberg term Pomeranchuk term C+ Reggeon term C− Reggeon term
s πH log sab M sab η1 ab M −R1 tan η12π s η 2 ab sM η2 π ab ±R2 cot 2 s
Heisenberg term C+ Reggeon term C− Reggeon term
where upper signs are for particles and lower signs for anti-particles. The adjustable parameters are as follows: 2 H = π (~c) M 2 in mb, where notation H is after Heisenberg(1952,1975); P ab in mb, are Pomeranchuk’s(1958) constant terms; Riab in mb are the intensities of the effective secondary Regge pole contributions named after Regge-Gribov(1961); 2 2 s, sab M = (ma + mb + M ) are in GeV ; ma , mb , (mγ ∗ = mρ(770) ), M all in GeV are the masses of initial state particles and the mass parameter defining the rate of universal rise of the total cross sections. Parameters M , η1 and η2 are universal for all collisions considered. For collisions with deuteron target Hd = λH where dimensionless parameter λ is introduced to test the universality of the Heisenberg rise for particle–nuclear and nuclear–nuclear collisions. Exact factorization hypothesis was used for both H log2 ( ssab ) and P ab M to extend the universal rise of the total hadronic cross sections to the γ(p, d) → hadrons and γγ → hadrons collisions.
108
This results in one additional adjustable parameter δ with substitutions: ! " ! # s s 2 2 γ(p,d) p(p,d) +P ⇒ δ (1, λ)H log +P , H log γ(p,d) γ(p,d) sM sM s s 2 2 γγ 2 pp +P ⇒ δ H log +P H log sγγ sγγ M M In this variant we have 35 adjustable parameters and 1047 observational equations to “indirect measuring” (estimate) the best “reference values” of parameters and their scattering region (SR) in 35-dimensional parameter space. In cases with “moderate” nonlinearity one can construct SR by two methods: the Hessian method recommended in GUM [3] and by the adaptive Monte Carlo method (MCM) advocated in GUM-S1 [4] and GUM-S2 [5] documents. In the cases under study we construct and compare both SR: • the SRhess constructed by the standard NonlinearModelFit procedure in Mathematica 8; • the SRprop constructed by propagation of assumed normal distribution of experimental uncertainties to the “empirical” distribution of the parameter uncertainties. In fact, Hessian method gives the parameter covariance matrix as inverse Hessian matrix calculated at minimum point corresponding to ~y ref −1 1 ∂ 2 χ2 (~y ) \ ref ref (yi − yi )(yj − yj ) = . · 2 ∂yi ∂yj ~yref Inserting it into equation ∆χ2\ (~y | ~y ref )
1 ∂ 2 χ2 (~y ) \ = · · (yi − yiref )(yj − yjref ) + . . . 2 ∂yi ∂yj ~yref
we obtain ∆χ2\ (~y | ~y ref ) = Np and SRhess is deemed as region in the parameter space defined by inequality χ2 (~y ) − χ2 (~y ref ) ≦ Np = 35 Input data and plots with their model description are accessed by URLs from references [1], [2].
109
3.1.1. Parameter uncertainties estimation We have 1047 independent random input quantities xi ∈ N (ˆ xi , ui ) and 35 dependent quantities yj (xi ) which are estimated by MLS. First of all we should decide what is the result of an indirect measurement in this case. From the GUM-S2(2011) (clause 7.6) we have general recommendation yˆj = where
1 Nstop
Nstop
X
yjr ,
Uij =
r=1
Nstop
1 Nstop − 1
X r=1
(yir − yˆi )(yjr − yˆj )
= yj (xri ) is the reference vector obtained from of independent ˆ used to indicate expectation drawn xri by measuring procedure; (.)
yjr
random value (or estimated output value) which constitute a part of the measurement result; Uij is the output covariance matrix of the obtained MC-sample of yjr ; Nstop is the cardinality of the MC-sample. This recommendation works well in case of linear measuring model only, but in general case we propose a more natural estimates to be the result of indirect measurements, namely: yˆj ⇒ yjref ,
ref Uij =
1 Nstop
Nstop
X r=1
(yir − yiref )(yjr − yjref ).
In case of the MLS measurement method estimates yˆj should be replaced by the best fit parameter values yjref . For the nonlinear measuring model Yj = Fj (X1 , X2 , ..., XN ), yˆj = Fˆj (xr1 , xr2 , ..., xrN ) 6= Fj (ˆ x1 , x ˆ2 , ..., x ˆN ) and ref again the estimates yˆj should be replaced by yj = Fj (ˆ x1 , x ˆ2 , ..., x ˆN ) as it: (i) is independent of Nstop ; (ii) always belongs to the manifold where the probability is defined; (iii) it tends to the indirect measuring result recommended by the GUM-S2 when input data become more and more precise (u(xi ) ց 0). ref Covariance matrix Uij also should be replaced Uij ⇒ Uij as well. ref Our measuring method is to estimate the yj as the best fit parameters yj by minimization of the quadratic form: X σ ab − Σab (sab ; yj ) 2 X ρab − ℜab (sab ; yj ) 2 2 k k l l + χ (yj ) = ab ) ab ) u(σ u(ρ l k a,b,k a,b,l
over parameters yj , i.e.
min χ2 (yj ) = χ2 (yjref ). yj
110
We have goodness of fit indicator F Q = 0.963 that corresponds to fit confidence level CLref (1047 − 35) ≈ 0.9993 at ndf = 1012. In our case to perform propagation of assumed normal distribution of experimental scores for observable experimental values at each energy point we construct new quadratic form χ2r (yj ) !2 !2 ab ab ab X σl,r X ρab − Σab (sab l ; yj ) k,r − ℜ (sk ; yj ) 2 χr (yj ) = + u(σlab ) u(ρab k ) a,b,l a,b,k with the same set of adjustable parameters yj , where experimental value of observable at each energy point replaced by the random value drawn from corresponding assumed distribution independently, but simultaneously for all experimental data points. Index r marks the consecutive simultaneous replacements. Minimizations of the χ2r (yj ) in the parameter space with fit starting vector yjref for all r will give us a sample of propagated reference vectors i.e. min χ2r (yj ) = χ2r (yjprop ) yj
that form the empirical distribution of random reference vectors in the propagated scattering region SRprop . The obtained “empirical” PDF for the sample of values ∆χ2 (yjprop ) = χ2 (yjprop ) − χ2 (yjref ) is visually well fitted by χ2 (ν) distribution with ν ≈ 35 (see Fig. 1). This is a signal for the “moderate” nonlinearity and ∆χ2 (yjprop ) quantiles can be used for sampling of the whole MC propagated vector sample to construct the scatter regions with different coverage probabilities. At this stage we have a lucky situation (in our task). Indeed, we have Nstop = 1.7 × 106 vectors belonging to SRprop and a repeatable reliable procedure to extract samples with predefined coverage probability. Indeed, we have the scatter region like a “jet” of vectors that is mapped by PDF histogram of our ∆χ2 (yjprop ) quadratic form values to the one dimensional distribution. This distribution is well fitted by χ2 (34.91577...) distribution with confidence level of distribution fit test value: CLDF T (1.7 × 106 ) = 0.858. This value is acceptable as we have: max ∆χ2 (~y prop ) = 93.7814
~ y prop
and Quantile[χ2(34.91577...), p = 0.99999972579] = 93.7814.
111
Fig. 1. Distribution of ∆χ2 (yjprop ) (gray histogram) and curve of fitted χ2 (ν = 34.92) (gray part of the curve corresponds to coverage probability ≦ 0.95).
Thus, we may take Nstop = 1.7 × 106 to be the “stopping rule” for our MC-sampling as we have obtained SRprop sample with practically 100% coverage probability in terms of the approximate analytic distribution we have chosen by a reasonable fit of our 35-dimensional scatter region mapped to one-dimensional statistic χ2(34.91577...) with help of ∆χ2 (yjprop ) construction. 3.1.2. Summary on HPR1R2 model parameters measurement Now we can formulate the first level result of our measurement. Our task was to get reliable estimates of the HP R1R2 model parameters and to construct reasonable parameter scattering region with traceable calculation of its coverage probability. We propose a simple quantitative probabilistic reliability indicator (RLEV ) of parameters measurement: RLEV (Nstop ) = CLref (ndf ) × CLDF T (Nstop ) × SCP , where: CLDF T (Nstop ) denotes CL of “DistributionFitTest” at Nstop , SCP stand for the stipulated coverage probability. In our case we have: RLEV (Nstop ) = 0.999 × 0.858 × 0.999 = 0.857. Thus, our measurement is reliable at level of 86%.
112
This RLEV could be used in classifying results of measurements as “reliable” or “inconsistent” and in risk assessments in implementing measurement results in applications: It is strongly recommended by JCGM documents that the summarization of the measurement results should be as complete as possible and expressed in computer usable form as well. The minimal structure that will give any interested person to check and reproduce our statements is as follows: • measured experimental data sample treated as independent variables in measurement model (file with data or URL and procedure to extract needed sample); • parametric model and procedure to construct our best estimate of the model parameter vector value based on available experimental knowledge; • procedure that maps the scatter region of experimental data in 1047dimensional space onto scatter region in 35-dimensional space of the model parameters that are treated as dependent variables in measurement model; • file with Nstop 36-component binary vectors (χ2 (yjr ) − χ2 (yjref ), yjr ) with yjr ∈ SRprop (in our case it is ≈ 490 Mb). At this state we can say nothing about the geometric form of the scattering region except that we have a jet of Nstop 35-dimensional vectors randomly populated around the best fit vector yjref . CAUTION! Nevertheless, it should be noted, that there is no way to model parameter distribution as 35-dimensional normal distribution with covariance matrix constructed on the whole SRprop sample. Let ref Q2(~y ) = Uα,β (yα − yαref )(yβ − yβref )
be the quadratic form dependent on ~y , centered at ~y ref , and with covariance matrix constructed as the second moment of the whole SRprop vectors with respect to ~y ref . In this case, if ~y gen ∈ N [~y ref , U ref ], then Q2(~y gen ) ∈ χ2 [35]. Let ~y gen ∈ SRgen - the scatter region with cardinality Nstop drawn from N [~y ref , U ref ]. We have calculated two statistics: χ2 (~y prop ),
χ2 (~y gen )
and plotted corresponding histograms on Fig. 2 for comparison.
113
Fig. 2. Distributions of ∆χ2 (yjprop ) (gray histogram) and χ2 (~ y 100%gen ) (hard gray histogram). Curve is the fitted χ2 (ν = 34.92...) (gray part of the curve corresponds to coverage probability ≦ 0.95).
It is seen that the χ2 (~y 100%gen ) histogram has a large part of its right hand decline out of the χ2 (~y prop ) histogram area. This means that in the SRgen there are vectors that could not be obtained by our MCM procedure. This is an indication that obtained scatter region is non convex or distribution is asymmetric in the parameter space. Definitely to use 100% quantile area of the SRprop to calculate second moment of the empirical distribution will give ellipsoid possibly containing vectors out of fits stability region. We tried the 50% quantile area and obtained results presented on Fig. 3 where we have much better situation. The χ2 (~y 50%gen ) histogram now has much larger overlap with χ2 (~y prop ) histogram and its maximum is close to the left edge of the SRprop with the main body inside it. We have no good enough idea how to imbed an 35-dimensional ellipsoid of maximal possible volume (to get the as large coverage probability as possible) into such SRprop . Nevertheless we have played with quantile values and obtained more hopeful situation plotted on Fig. 4.
114
Fig. 3. Distributions of ∆χ2 (yjprop ) (gray histogram), ∆χ2 (yj50%gen ) (hard gray histogram). Curve is the fitted χ2 (ν = 34.92...) (gray part of the curve corresponds to coverage probability ≦ 0.95).
Fig. 4. Distributions of ∆χ2 (yjprop ) (gray histogram), ∆χ2 (yj68.5%gen ) (hard gray histogram). Curve is the fitted χ2 (ν = 34.2...) (gray part corresponds to coverage probability ≦ 0.95).
115
In the last case we can claim more compact result: instead of file with ref whole propagated vectors we propose to present a covariance matrix U68.5% constructed on the 0.685 quantile of the whole propagated sample and as the parameters multivariate probability distribution function the normal ref distribution N (~y ref , U68.5% ). Now reliability indicator is not so good because we use poorer statistics of SRgen gen RLEV (Nstop ) = 0.999 × 0.685 × 0.765 = 0.52
Last factor in the RLEV (0.765) is forced coverage probability to keep all generated vectors inside the SRprop for sure. Acknowledgements This work is supported in part by the Russian Foundation for Basic Research (RFBR) grants 14-07-00362 and 14-07-00950. References [1] J. Beringer et al. [Particle Data Group], Review of Particle Physics, Phys. Rev. D86 (2012) 010001, http://pdg.lbl.gov/2013/hadronic-xsections/hadron.html
[2] K.A. Olive et al. [Particle Data Group], Review of Particle Physics, Chin. Phys. C38 (2014) 090001 , http://pdg.lbl.gov/2014/hadronic-xsections/hadron.html. [3] JCGM 100:2008, Evaluation of measurement data - “Guide to the expression of uncertainty in measurement”, http://www.bipm.org/utils/common/documents/jcgm/JCGM_100_2008_E.pdf. [4] JCGM 101:2008, Evaluation of measurement data - Supplement 1 to the “Guide to the expression of uncertainty in measurement” - Propagation of distributions using a Monte Carlo method , http://www.bipm.org/utils/common/documents/jcgm/JCGM_101_2008_E.pdf
[5] JCGM 102:2011, Evaluation of measurement data - Supplement 2 to the “Guide to the expression of uncertainty in measurement” - Extension to any number of output quantities , http://www.bipm.org/utils/common/documents/jcgm/JCGM_102_2011_E.pdf
Advanced Mathematical and Computational Tools in Metrology and Testing X Edited by F. Pavese, W. Bremser, A. Chunovkina, N. Fischer and A. B. Forbes © 2015 World Scientific Publishing Company (pp. 116–123)
A NEW APPROACH FOR THE MATHEMATICAL ALIGNMENT MACHINE TOOL-PATHS ON A FIVE-AXIS MACHINE AND ITS EFFECT ON SURFACE ROUGHNESS SALIM BOUKEBBAB1,*, JULIEN CHAVES-JACOB2 Laboratoire Ingénierie des Transports et Environnent, Faculté des Sciences de la Technologie Université Constantine 1, Campus Universitaire Zarzara, 25000 Constantine, Algérie Tel: +213 (0)31 81 90 66 * E-mail : [email protected] 1
JEAN-MARC LINARES2,†, NOUREDDINE AZZAM1,2 2 Aix Marseille Université CNRS - UMR 7287 13288 Marseille Cedex 9, France Tel: + 33 (0) 4 42 93 90 96 † E-mail : [email protected] This paper proposes a procedure to adapt the geometry of the toolpath to remove a constant thickness on a five-axis machine. The aim of this work is to contribute to the automation of prosthesis machining, mainly, in the preparation of polishing surface. The proposed method can deform and adapt a toolpath to respect the geometry of the manufactured surface. This method is based on three steps: alignment, deformation and smoothing toolpath. In the alignment step, a mapping is carried out between the measured surface of prostheses and the nominal toolpath using the Iterative Closest Point (ICP) algorithm. The aligned toolpath is deformed in two steps. The first step is the projection of aligned points on the measured surface (defined by STL file). In the second step, these points are offset by a value (ap) to obtain the required geometry. During the deformation step a meshed surface is used, reducing the smoothness of the deformed toolpath. Experimental tests on industrial prostheses are conducted to validate the effectiveness of this method. During these tests the effects of the smoothing methods on the surface quality of machined parts are presented.
1. Introduction The surface quality of surgical implants is one of the most important properties to be controlled in their design and manufacture. The polishing operation represents the final action in the production cycle to improve the quality of implants surfaces. Generally, knee prosthesis is constituted of three parts. Two
116
117
metal parts are fixed respectively on the femur and one on the tibia. The third part is intercalated between the two metallic’s and it is made up of a very strong plastic resistant called the polyethylene, which improves the knee slip [1]. To reduce the removed bone volume the knee prostheses thickness is reduced. Thus, this small thickness is caused by deformations due to the foundry process [2]. The geometry has a small influence on the lifespan of the prosthesis, because the intercalated parts in polyethylene will be deformed to compensate geometry errors of the femoral part which is commonly made in cobalt-chromium alloy. On the other hand, the surface discontinuities and the surface quality (roughness and waviness) have a major influence on the lifespan of the prosthesis; this implies that we must have a very accuracy surface quality and to ensure the thickness of the prosthesis to avoid the prosthesis failure. When CNC machines are used to polish these functional surfaces, the polishing force is not controlled because usual CNC machines drive the position and not the applied force. This effect requires a geometrical adaptation of the machining toolpath at each rough work piece [3]. In manual polishing, the operator uses his eyes to adapt his toolpath. In the proposed method, a three-dimensional measurement is needed to obtain the rough part geometry made by foundry process. An STL model is generated after this measurement process, it should be noted here that the STL format is obtained by a triangulation of real work piece after acquisition step. The initial tool trajectory is calculated by a CAM (Computer Aided Manufacturing). It is defined on the nominal model given by CAD (Computer-aided design) software, with the respect of toolpath synchronization (figure 1). It makes it possible to avoid the traces on manufactured surface and thus avoid the build of CAD model of each deformed part and to remake a special CNC program [4-5].
Figure 1: Tool-path generate using CAD model of the knee prosthesis (femoral condyle).
The main objective of this research work is to modify a trajectory of machining calculated on a nominal model to remove a constant thickness over a rough
118
surface of part coming from the foundry. In this paper, the case of femoral component of knee prostheses (femoral implant: condyles) is studied. The CNC toolpaths are made only on the upper part of the knee condyle. 2. Description of the developed procedure This study proposes a method to adapt the geometry of the toolpath with the aim to remove a constant thickness. As presented in introduction, this case is present in the machining process of the femoral component of knee prostheses. The figure 2 illustrates the stages of this method. The proposed toolpath deformation method is composed of three stages: the measured surface alignment, toolpath deformation and toolpath smoothing. Each of these three items is studied in relation to the bibliography.
Figure 2: The stage of the method
The deformation of the toolpath is performed in three steps: a) Aligning the tool path (computed on the nominal model) and the STL model of the rough surface using the ICP algorithm, b) Deformation of the tool path, c) Smoothing of the deformed toolpath.
119
3. The alignment process using ICP algorithm The alignment process using the ICP algorithm begins by the measurement of rough surface which must be aligned with the nominal toolpath. The ICP algorithm is a well-known method for registering a 3D set of points to a 3D model [6]. It will be noted that the successive coordinates of the drive point expressed in the coordinate system of the workpiece give the nominal toolpath. Some CAM software options allow expression of the toolpath of the cutter contact point [3]. Subsequently, these coordinates are noted PCC(xi,yi,zi) and the tool axis direction, u. On the other hand, an STL file defines the measured surface [7]. It is composed of vertices, edges, and triangular facets. Each facet has a normal vector, n. It should be noted here that P’CC(xi,yi,zi) is the vertical projection of PCC(xi,yi,zi) on a triangular facet. A rigid transformation [Tt] consists in the rotation matrix [R] and the translation vector {T} giving the iterative transformation Eq. 1. P’CC(xi,yi,zi) = [R]× PCC(xi,yi,zi) + {T}
(1)
The transformation is calculated in the aim to displace the nominal toolpath on the measured surface. The algorithm minimizes the sum of squared residual errors between the set of points and the model, and finds a registration that is locally the best fit using the least-squares method Eq. 2. f (>R @, ^T `)
1 Ns
Ns
¦P
' CC i
>Tt @ u PCC i
2
(2)
i 1
4. Deformed toolpath and offset step After alignment phase, the toolpath is deformed in two steps (figure 3). In the first one the projection of the aligned points on the measured surface (STL model) is realised. In the second step, an offset of these points by a value (ap) is necessary to obtain the required geometry. These steps are detailed below.
Figure 3: Deformation of the nominal toolpath
120
4.1. Projection aligned points Firstly, all the points of the trajectory P’CC(xi,yi,zi) are projected on all facets of the STL model. A test is carried out to verify if the projection is inside the triangle or not. The distance between P’CC(xi,yi,zi) and a triangular element of STL model (figure 3) is determinate using the Eq. 3. The triangle vertices are denoted N1, N2 and N3. Eq. 4 is used to calculate the point PCC_def(xi,yi,zi). Ei = PCCiN1 . n OPCC_def(xi,yi,zi) = OP’CC(xi,yi,zi) + Ei . n
(3) ) (4)
Where n is the unit vector of the triangular element and Ei is the distance between P’CC(xi,yi,zi) and PCC_def(xi,yi,zi). 4.2. Offsetting the toolpath after projection The projected toolpath is offset with a quantity ap: depth of cut inside material (figure 3). The equation Eq. 5 is used to carry out to determinate the points PCCi_def_dec(xi,yi,zi). OPCCi_def_dec (xi,yi,zi) = OP’CC(xi,yi,zi) + (Ei – ap). n
(5)
It will be noted that, on a meshed surface (plane element); the local normal is submitted at discontinuous variations along a toolpath. This last will induce discontinuities on the deformed toolpath [3]. This deformation induces oscillations, principally, in the axis of the machine and this is observed in the manufacturing surface, because the initial trajectory is far from the target surface (figure 4). To resolve this impediment, section 5 proposes a method to smooth the toolpath within a pre-assigned tolerance.
Figure 4: Discontinuities observed on the deformed tool path.
The generation of toolpath starting from model STL generates disturbances of deformed trajectory then decelerations of the machine and defects on the part. These phenomena are harmful with respect to production and the surface quality.
121
Toolpath smoothing is carried out to improve surface quality after the deformation step. A technique of smoothing methods is developed in literature. Some authors propose the B-Spline curve interpolation to smooth the nominal toolpath points [8-9]. 5. Smoothing toolpath and experimental validations The proposed smoothing method is based on smoothing axis by axis with a 3dimensional admissible tolerance IT. This method may be applied to the 3 axes of the toolpath or only to one. On each axis a low degree polynomial ( Fˆi+1 ;
129
(4) Recalculate values Fˆi , ..., Fˆi+m , which satisfy the inequalities Fˆi > Fˆi+1 ≥ ... ≥ Fˆi+m , in the following way: Pi+m j=i rj ˆ ˆ Fi = ... = Fi+m = Pi+m ; j=i nj
(5) Repeat Steps (3) and (4) until conditions F1 ≤ F2 ≤ ... ≤ Fk are satisfied.
Then, the NPMLE of the unknown distribution function F (t) from current status data can be expressed as follows: 0, t < t1 , Fˆ , t 6 t < t , 1 1 2 Fˆ (t) = ... ˆ Fk , tk 6 t.
Let us now consider the following statistics for testing the goodness-offit hypothesis H0 : F (t) ∈ {F0 (t; θ), θ ∈ Θ}. The chi-square type statistic can be written as k X (eni − ei )2 , (5) Xn2 = ei i=1
where eni = ni Fˆ (ti ) is the empirical number of failures at the i-th inspection time, and ei = ni F0 (ti ; θˆn ) is the expected number of failures at i-th inspection time, with θˆn being the MLE of the unknown parameter. The Kolmogorov type statistic can be defined as (6) Dn = sup Fˆ (t) − F0 (t; θˆn ) , 0