Recent Developments in Fuzzy Logic and Fuzzy Sets (Studies in Fuzziness and Soft Computing, 391) 3030388921, 9783030388928

This book provides a timely and comprehensive overview of current theories and methods in fuzzy logic, as well as releva


English Pages 220 [213] Year 2020

Table of contents :
Contents
Fuzziness in Information Extracted from Tweets’ Hashtags and Keywords
1 Introduction
2 Hashtags and Clustering
2.1 Tweet and Hashtags
2.2 Fuzzy Sets
2.3 Fuzzy Clustering
2.4 Cluster Quality and Visualization
3 Collected Data
3.1 Hashtag Data
3.2 Presidential Election 2012 Data
4 Hashtag Data: Analysis of Popularity and Changes
4.1 Popularity Analysis
4.2 Analysis of Changes
5 Analysis of Election Data
5.1 Hashtags and Keywords of Tweets
5.2 Analysis of Tweets
5.3 Party Signatures
5.4 Similarities Between Party Signatures
6 Conclusion
References
Why Triangular and Trapezoid Membership Functions: A Simple Explanation
1 Ubiquity of Triangular and Trapezoid Membership Functions
2 Main Idea
3 First Formalization and the First Set of Results
4 Second Formalization and the Second Set of Results
5 Proofs
References
Statistical Approach to Fuzzy Cognitive Maps
1 Introduction
2 Fuzzy Cognitive Maps
3 Statistical Approach to Fuzzy Cognitive Maps
3.1 Stage 1, the Linear Nontransformed FCM Model
3.2 Stage 2, the Transformed FCM Model
3.3 The Liquid Tank Model
4 Conclusions
References
Probabilistic and More General Uncertainty-Based (e.g., Fuzzy) Approaches to Crisp Clustering Explain the Empirical Success of the K-Sets Algorithm
1 Clustering by Similarity: Formulation of the Practical Problem
2 The Main Idea Behind Clustering-by-Similarity
3 Probabilistic Approach: Towards the Precise Formulation of the Problem
4 Probabilistic Approach: Precise Formulation of the Problem, Resulting Clustering Algorithm, and Their Relation to K-Sets Algorithm and Its Foundations
5 Towards a More General Uncertainty-Based Approach
6 What if We Have Disproportionate Clusters?
7 Disproportionate Clusters: Probabilistic Case
8 Disproportionate Clusters: Case of General Uncertainty
9 Conclusion
References
Semi-supervised Learning to Rank with Nonlinear Preference Model
1 Introduction
2 Problem Statement
3 Semi-supervized Learning to Rank
3.1 Linear Pointwise Semi-supervised Ranking Learning
3.2 Nonlinear Kernel-Based Pointwise Semi-supervised Ranking Learning
3.3 Pairwise Ranking Learning
4 Learning to Rank Using Aggregated Data
4.1 Ranking Learning on Clusters Problem
4.2 Semi-supervised Clustering in Feature Space
4.3 Recurrent Ranking Learning on Clusters Using Stream Dataset
5 Conclusion
References
The Concept of Linguistic Variable Revisited
1 Introduction
2 Preliminaries
3 Semantics of Linguistic Expressions
3.1 Possible Worlds, Intension and Extension
3.2 Formalization of the Meaning of Linguistic Expressions
4 Linguistic Variable
4.1 Linguistic Framework
4.2 Evaluative Linguistic Expressions
4.3 Modified Definition of Linguistic Variable
4.4 Fuzzy Numbers
5 Conclusion
References
On the Z-Numbers
1 Introduction
2 Computation with Z-Numbers
2.1 Sum of Z-Numbers ( Ax ,Rx ) and ( Ay ,Ry ) [1]
2.2 Subtraction of Z-Numbers ( Ax ,Rx ) and ( Ay ,Ry ) [1]
2.3 Multiplication of Z-Numbers ( Ax ,Rx ) and ( Ay ,Ry ) [1]
2.4 Division of Z-Numbers ( Ax ,Rx ) and ( Ay ,Ry ) [1]
3 Ranking of Z-Numbers
3.1 Z-Numbers Ranking Algorithm by Sigmoid Function
3.2 Z-Numbers Ranking Algorithm by Sigmoid Function Based on the Combination of Convex
4 Z-Number Linear Regression (ZLR) Estimates Using Neural Network [5]
4.1 First State
4.2 Second State
5 The Definition of a Derivative-Based Z-Number [11]
5.1 Z-Number Initial Value Problem (ZIVP)
5.2 Numerical Examples (Economics Example)
6 Conclusions
References
Fuzzy Normed Linear Spaces
1 Introduction
2 Preliminaries
3 Katsaras's Type Fuzzy Norm
4 Felbin's Type Fuzzy Norm
5 Bag-Samanta's Type Fuzzy Norm
6 Convergence in Fuzzy Normed Linear Spaces
7 Fuzzy Continuous Linear Operators in Fuzzy Normed Linear Spaces
8 Conclusions
References
A Calculation Model of Hierarchical Multiplex Structure with Fuzzy Inference for VRSD Problem
1 Introduction
2 VRSDP/SD Problem and Its Formulation
2.1 Description of the VRSDP/SD Problem
2.2 Constants and Variables in VRSD Problem
2.3 Constraints for VRSD Problem
2.4 Evaluation for VRSDP/SD Problem
2.5 Objectives Solving VRSDP/SD Problem
3 HIMS Calculation Model
3.1 Strategy in Atomic Level of HIMS
3.2 Strategy in Molecular Level
3.3 Fuzzy Inference in Individual Level
3.4 Evaluation and Selection
3.5 The Implementation of HIMS Model
4 Experiments and Discussion
4.1 Experiment (1): Working with All Vehicles
4.2 Experiment (2): Working with just Number
4.3 Analysis of Experimental Results
5 Conclusion
References
Designing the Researchers’ Management Decision Support System Based on Fuzzy Logic
1 Introduction
2 Modeling of Researchers’ Activity Assessment
3 The Technique for the Assessment of Researchers’ Activity Based on the Fuzzy Relation Model
4 Functional Blocks of Researchers’ Performance Assessment System
5 Conclusion
References


Studies in Fuzziness and Soft Computing

Shahnaz N. Shahbazova Michio Sugeno Janusz Kacprzyk Editors

Recent Developments in Fuzzy Logic and Fuzzy Sets Dedicated to Lotfi A. Zadeh

Studies in Fuzziness and Soft Computing Volume 391

Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

The series “Studies in Fuzziness and Soft Computing” contains publications on various topics in the area of soft computing, which include fuzzy sets, rough sets, neural networks, evolutionary computation, probabilistic and evidential reasoning, multi-valued logic, and related fields. The publications within “Studies in Fuzziness and Soft Computing” are primarily monographs and edited volumes. They cover significant recent developments in the field, both of a foundational and applicable character. An important feature of the series is its short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. Indexed by ISI, DBLP and Ulrichs, SCOPUS, Zentralblatt Math, GeoRef, Current Mathematical Publications, IngentaConnect, MetaPress and Springerlink. The books of the series are submitted for indexing to Web of Science.

More information about this series at http://www.springer.com/series/2941


Editors Shahnaz N. Shahbazova Department of Information Technology and Programming Azerbaijan Technical University Baku, Azerbaijan

Michio Sugeno Tokyo Institute of Technology Kamakura, Japan

Janusz Kacprzyk Systems Research Institute Polish Academy of Sciences Warsaw, Poland

ISSN 1434-9922, ISSN 1860-0808 (electronic)
Studies in Fuzziness and Soft Computing
ISBN 978-3-030-38892-8, ISBN 978-3-030-38893-5 (eBook)
https://doi.org/10.1007/978-3-030-38893-5

© Springer Nature Switzerland AG 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Contents

Fuzziness in Information Extracted from Tweets’ Hashtags and Keywords
Shahnaz N. Shahbazova

Why Triangular and Trapezoid Membership Functions: A Simple Explanation
Vladik Kreinovich, Olga Kosheleva and Shahnaz N. Shahbazova

Statistical Approach to Fuzzy Cognitive Maps
Vesa A. Niskanen

Probabilistic and More General Uncertainty-Based (e.g., Fuzzy) Approaches to Crisp Clustering Explain the Empirical Success of the K-Sets Algorithm
Vladik Kreinovich, Olga Kosheleva, Shahnaz N. Shahbazova and Songsak Sriboonchitta

Semi-supervised Learning to Rank with Nonlinear Preference Model
L. Lyubchyk, A. Galuza and G. Grinberg

The Concept of Linguistic Variable Revisited
Vilém Novák

On the Z-Numbers
Tofigh Allahviranloo and Somayeh Ezadi

Fuzzy Normed Linear Spaces
Sorin Nădăban, Simona Dzitac and Ioan Dzitac

A Calculation Model of Hierarchical Multiplex Structure with Fuzzy Inference for VRSD Problem
Kewei Chen, Fangyan Dong and Kaoru Hirota

Designing the Researchers’ Management Decision Support System Based on Fuzzy Logic
Masuma Mammadova and Zarifa Jabrayilova

Fuzziness in Information Extracted from Tweets’ Hashtags and Keywords Shahnaz N. Shahbazova

Abstract Social media has become a part of our lives. People use different forms of it to express their opinions on a variety of ideas, events and facts. Twitter, as an example of such media, is commonly used to post short messages (tweets) related to a variety of subjects. This paper proposes the application of fuzzy-based methodologies to process tweets and to interpret the information extracted from them. We state that the obtained knowledge is more fully explored and better comprehended when fuzziness is used. In particular, we analyze hashtags and keywords to extract useful knowledge. We look at the popularity of hashtags and at the changes of their popularity over time. Further, we process hashtags and keywords to build fuzzy signatures representing concepts associated with tweets.

Keywords Hashtags · Tweets · Fuzziness · Fuzzy sets · Fuzzy clustering · Fuzzy signatures · Hashtag popularity · Cluster quality

1 Introduction

The unquestionable and overwhelming engagement of individuals in social-network-related activities is perceived as a source of valuable information. Any type of data left by users on the web is recognized as providing crucial and otherwise difficult-to-obtain information. More and more often we see examples of that. Analysis of the users’ posts leads to identification of the users’ opinions and attitudes regarding a variety of things and events. The analysis of the users’ data is driven by a high expectation of finding unique and important insight into issues and events important for the users, and into the users’ opinions about them.

S. N. Shahbazova (B) Department of Information Technology and Programming, Azerbaijan Technical University, 25 H.Cavid Ave., Baku AZ1073, Azerbaijan e-mail: [email protected] © Springer Nature Switzerland AG 2020 S. N. Shahbazova et al. (eds.), Recent Developments in Fuzzy Logic and Fuzzy Sets, Studies in Fuzziness and Soft Computing 391, https://doi.org/10.1007/978-3-030-38893-5_1


Among the variety of social media networks, Twitter [1] is recognized as one of the most influential social forums. The nature of the messages that can be posted on it, called tweets, makes them an important source of information. These tweets represent the users’ opinions and thoughts expressed in short and simple messages.

The concept of fuzzy sets, introduced in 1965 by L. A. Zadeh [2], is an important contribution to the analysis of data that represent human-generated information. On one hand, imprecision and vagueness are inherently present in any type of data generated by users. On the other hand, the results of data analysis should be able to find and handle imprecision, and convey these findings to the users. We state that the extracted information is richer and better represents underlying features when fuzziness is applied. In particular, we extract information embedded in hashtags: we look at the popularity of tags and the changes of their popularity over time; we process hashtags and build fuzzy signatures representing concepts associated via collected tweets.

In this paper, we describe a simple methodology for analyzing a set of Twitter hashtags. The main focus of the method is the investigation of temporal aspects of these data. We are interested in the analysis of hashtags from the point of view of their dynamics. We identify groups of hashtags that exhibit similar temporal patterns, look at their linguistic descriptions, and recognize the hashtags that are the most representative of these groups, as well as the hashtags that do not fit the groups very well. The method is based on a fuzzy clustering process. Once the clusters are created, we examine them in detail and draw multiple conclusions regarding variations of hashtags over time. Further, we construct fuzzy signatures of political parties based on the analysis of hashtags and noun-phrases extracted from a set of tweets associated with the US elections of 2012. We use the obtained signatures to analyze similarities between issues and opinions important for each party.

The paper is divided into the following sections. Section 2 gives a brief introduction to the concepts of tweets, fuzzy sets, and fuzzy clustering. Section 3 provides a brief description of the data used: hashtags collected from Hashtagify.me, and tweets associated with the US elections of 2012. Section 4 focuses on the analysis of hashtags: we provide some examples of hashtag popularity, describe the data pre-processing that leads to a representation of popularity changes, present the results of the clustering process, discuss these results, and illustrate different behaviors of hashtag popularity. Section 5 targets the analysis of presidential election tweets: the creation of party signatures and the evaluation of the similarity between them. Section 6 contains discussion and conclusion.


2 Hashtags and Clustering

2.1 Tweet and Hashtags

Twitter, one of the most popular online message systems, allows its users to post short messages called tweets. According to dictionary.com [3], the definition of a tweet is:

… 2. (Digital Technology) a very short message posted on the Twitter website: the message may include text, keywords, mentions of specific users, links to websites, and links to images or videos on a website.

The users posting these messages include special words in the text. These words, hashtags, are easily recognizable and play the role of “connectors” between messages. An informal definition of hashtags, obtained from Wikipedia [4], is as follows:

A hashtag is a type of label or metadata tag used on social network and microblogging services which makes it easier for users to find messages with a specific theme or content. Users create and use hashtags by placing the hash character (or number sign) # in front of a word or unspaced phrase, either in the main text of a message or at the end. Searching for that hashtag will then present each message that has been tagged with it.

As can be inferred, hashtags carry considerable weight in marking and identifying the topics that users want to talk about or draw attention to. The spontaneous way in which hashtags are created (there are no restrictions regarding what a hashtag can be) is their crucial feature. This allows for forming a true image of the users’ interests, things important for them, and things that draw their attention. As a result, any type of analysis of hashtag data could lead to a better understanding of the users’ attitudes, as well as to the detection of events, incidents, and calamities.

2.2 Fuzzy Sets

Fuzzy set theory [2] aims at handling imprecise and uncertain information in various domains. Let $D$ represent a universe of discourse. A fuzzy set $F$ with respect to $D$ is defined by a membership function $\mu_F : D \to [0, 1]$, assigning a membership degree $\mu_F(d)$ to each $d \in D$. This membership degree represents the level of belonging of $d$ to $F$. The fuzzy set can be represented as a collection of pairs:

\[
F = \left\{ \frac{\mu_F(d_1)}{d_1}, \frac{\mu_F(d_2)}{d_2}, \ldots \right\}
\]

For more information on fuzzy sets and systems, please consult [5, 6].
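As an illustration of the pair notation, a discrete fuzzy set can be held as a mapping from elements to membership degrees. The following is only a minimal sketch: the function names and the membership degrees are invented for this example and do not come from the chapter.

```python
# A discrete fuzzy set stored as {element: membership degree}.
# All names and degrees here are hypothetical, chosen for illustration only.

def make_fuzzy_set(memberships):
    """Check that every degree lies in [0, 1] and return a copy of the mapping."""
    for element, degree in memberships.items():
        if not 0.0 <= degree <= 1.0:
            raise ValueError(f"membership of {element!r} must be in [0, 1], got {degree}")
    return dict(memberships)


def membership(fuzzy_set, element):
    """Return the membership degree of element; unlisted elements get 0."""
    return fuzzy_set.get(element, 0.0)


def as_pairs(fuzzy_set):
    """Render the set in the mu(d)/d pair notation used in the text."""
    return "{" + ", ".join(f"{mu}/{d}" for d, mu in fuzzy_set.items()) + "}"


# Invented example: a fuzzy set of "popular" hashtags.
popular = make_fuzzy_set({"#iphone": 0.8, "#Nepal": 0.6, "#callme": 0.0})
print(as_pairs(popular))
```

Treating unlisted elements as having membership 0 is a design choice of this sketch; it keeps the mapping small when the universe of discourse is large.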


2.3 Fuzzy Clustering

One of the most popular methods of data analysis focuses on identifying clusters of data-points that exhibit substantial levels of similarity. There are multiple clustering methods that differ in their ability to find data clusters and in their complexity [7–10]. Among the many clustering algorithms there are ones that utilize fuzzy methodology [11–13]. In such a case, clusters of data-points do not have sharp borders. In general, data-points belong to clusters to a degree. In fuzzy terminology, we talk about a degree of belonging (membership) of a data-point to a given cluster. As a result, there are points that fully belong to a given cluster (membership value of 1), as well as points that belong to a cluster to a degree (membership values between 0 and 1). Such an approach provides a more realistic segregation of data: very rarely do we deal with a situation in which everything is clear and the data can be divided into “clean” sets, i.e., sets of data-points that simply belong or do not belong to clusters.

The method used here is based on a fuzzy clustering method called FANNY [14]. The optimization is performed by minimizing the following objective function:

\[
\sum_{v=1}^{k} \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} \mu_{iv}^{r}\, \mu_{jv}^{r}\, d(i, j)}{2 \sum_{j=1}^{n} \mu_{jv}^{r}}
\]

where $n$ is the number of data-points, $k$ is the number of clusters, $\mu_{iv}$ is the membership value of data-point $i$ in cluster $v$, $r$ is the membership exponent, and $d(i, j)$ is a distance or difference between points $i$ and $j$. The selection of this approach has been dictated by the fact that we do not want to create fictitious centers of clusters, as happens in the widely popular fuzzy clustering method FCM [12]. Additionally, a recent implementation of FANNY in the R programming language [15] is used here.
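To make the objective function concrete, the sketch below evaluates it for a given membership matrix and dissimilarity matrix. It only evaluates the objective and does not perform the minimization that FANNY carries out. The function name, the default exponent r = 2, and the four-point data set are assumptions made for this illustration.

```python
def fanny_objective(memberships, dist, r=2.0):
    """Evaluate the FANNY objective.

    memberships[i][v] is the membership of point i in cluster v,
    dist[i][j] is the dissimilarity between points i and j,
    r is the membership exponent (r = 2 is an assumed default here).
    """
    n = len(memberships)
    k = len(memberships[0])
    total = 0.0
    for v in range(k):
        weights = [memberships[i][v] ** r for i in range(n)]
        numerator = sum(
            weights[i] * weights[j] * dist[i][j]
            for i in range(n)
            for j in range(n)
        )
        denominator = 2.0 * sum(weights)
        if denominator > 0.0:  # skip clusters with no membership mass
            total += numerator / denominator
    return total


# Invented data: four points on a line forming two obvious groups.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]

crisp = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # matches the two groups
uniform = [[0.5, 0.5]] * 4                                 # no structure at all

print(fanny_objective(crisp, dist), fanny_objective(uniform, dist))
```

In this toy case the partition that matches the two groups yields a smaller objective value than the uninformative uniform memberships, which is the behavior the minimization relies on.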

2.4 Cluster Quality and Visualization

Clusters contain multiple data-points that are distributed in the space embraced by the clusters’ boundaries. Some of these points lie well inside a cluster and have high membership values, while others are close to the boundaries: they have small membership values and, at the same time, comparable membership values in other clusters. An interesting measure indicating the quality of a cluster, i.e., demonstrating that the data-points that belong to this cluster are well fitted into it, is called silhouette width [16]. This measure is represented by the following ratio for a given element $i$ from a cluster $k$:

\[
s(i, k) = \frac{OUT(i) - IN(i, k)}{\max\bigl(OUT(i), IN(i, k)\bigr)}
\]

with

\[
OUT(i) = \min_{j \neq k} \frac{\sum_{m=1}^{N_j} d(i, m)}{N_j}, \qquad
IN(i, k) = \frac{\sum_{m=1}^{N_k} d(i, m)}{N_k}
\]

where $d(i, m)$ is a distance (or a difference) between data-points $i$ and $m$, $N_k$ is the size of cluster $k$, and $N_j$ is the size of another cluster $j \neq k$; each sum runs over the data-points $m$ of the respective cluster. The value of $s(i, k)$ allows us to identify the closest cluster to a point $i$ outside the cluster $k$. Positive values of silhouette indicate good separation of clusters.

The process of visualization of multi-dimensional clusters is fairly difficult. A possible solution could be a projection of clusters onto selected dimensions. But then, the issue is which dimensions to choose. In this paper, we use an approach introduced in [17]. The approach, called CLUSPLOT, is based on a reduction of the dimension of data by principal component analysis [18]. Clusters are plotted in coordinates representing the first two principal components, and are graphically represented as ellipses. To be precise, each cluster is drawn as a spanning ellipse, i.e., as the smallest ellipse that covers all its elements.
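The silhouette ratio can be computed directly from a dissimilarity matrix and a crisp assignment of points to clusters. The sketch below follows the formulas exactly as written in the text, so the inside average is taken over all points of the cluster, including the point itself; other presentations of silhouette width may exclude the point itself from that average. The function name and the toy data are assumptions made for illustration.

```python
def silhouette_width(i, k, clusters, dist):
    """Silhouette width s(i, k) of point i with respect to its cluster k.

    clusters maps a cluster label to the list of point indices it contains;
    dist[i][m] is the dissimilarity between points i and m.
    """
    def average_distance(members):
        return sum(dist[i][m] for m in members) / len(members)

    inside = average_distance(clusters[k])
    # Smallest average distance to any other cluster: the closest neighbour cluster.
    outside = min(
        average_distance(members)
        for label, members in clusters.items()
        if label != k
    )
    denominator = max(outside, inside)
    return 0.0 if denominator == 0 else (outside - inside) / denominator


# Invented data: four points on a line forming two obvious groups.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]

good = {"A": [0, 1], "B": [2, 3]}   # point 0 sits with its natural neighbour
bad = {"A": [1], "B": [0, 2, 3]}    # point 0 is placed in the distant group

print(silhouette_width(0, "A", good, dist))
print(silhouette_width(0, "B", bad, dist))
```

With the natural grouping the value for point 0 is close to 1, while placing the same point in the distant group gives a negative value, in line with the statement that positive values indicate good separation.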

3 Collected Data

3.1 Hashtag Data

The process of data analysis is performed on real data representing the popularity of hashtags. The data are obtained from the website Hashtagify.me, and contain information about 40 different hashtags. The popularity ratings have been obtained for the period of nine weeks. A sample of the data for a few selected hashtags is shown in Table 1. The values presented in Table 1 show the popularity (as % relative to other hashtags) for every week. For example, #KCA was the most popular hashtag for the first five weeks. However, after the fifth week its popularity started decreasing. The hashtag #callmebaby did not even exist for the first few weeks, then rapidly gained popularity, and after two weeks its popularity has been around 85%. Very similar behavior can be observed for #NepalQuake. Its popularity in the last few weeks has been in the range from 53 to 72%. The hashtag #iphone, on the other hand, is characterized by a continuous level of popularity, with some small fluctuations: 80–84%.

In order to analyze behavioral patterns of hashtags, a simple processing of the data has also been performed. Here, we are interested in the changes of popularity of hashtags, expressed in percentage points. The data are presented in Table 2. Here, the calculations have been done using a very simple formula, the difference between two popularity values.

Table 1 Popularity of selected hashtags. Popularity (%) by number of weeks in the past

Hashtag          -nine  -eight  -seven    -six   -five   -four  -three    -two    -one    Zero
#KCA             100.0   100.0   100.0   100.0   100.0    93.0    84.3    85.2    81.1    75.9
#callmebaby        0.0     0.0    30.3    20.9    65.5    98.0    95.1    89.1    85.7    87.3
#SoFantastic      56.9    71.5    72.7    73.5    75.8    97.4    92.3    36.6    37.7    48.8
#Nepal            44.0    44.5    42.3    42.5    42.6    42.5    43.2    73.8    80.8    63.6
#NepalQuake        0.0     0.0     0.0     0.0     0.0     0.0     0.0    59.9    72.5    53.1
#iphone           80.4    80.8    81.9    81.2    79.8    81.9    83.2    84.6    84.1    82.8

Table 2 Changes in popularity of selected hashtags. Popularity in week zero minus popularity N weeks in the past; the last column is the popularity in week zero

Hashtag         Zero versus nine  Zero versus seven  Zero versus five  Zero versus three   Zero
#KCA                       −24.1              −24.1             −24.1               −8.4   75.9
#callmebaby                 87.3               57.0              21.8               −7.8   87.3
#SoFantastic                −8.1              −23.9             −27.0              −43.5   48.8
#Nepal                      19.6               21.3              21.0               20.4   63.6
#NepalQuake                 53.1               53.1              53.1               53.1   53.1
#iphone                      2.4                0.9               3.0               −0.4   82.8

The change is calculated as

\[
\mathrm{change}_{\text{zero versus } N} = \mathrm{popularity}_{\text{week:zero}} - \mathrm{popularity}_{\text{week:}N}
\]

The calculated change represents the difference between the popularity value for the current week, week:zero, and the popularity value for the considered week, week:N. For example, the value $\mathrm{change}_{\text{zero versus nine}} = -24.1$ means that the popularity of #KCA in week zero is 24.1 percentage points lower than its popularity in week -nine.
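The change calculation can be reproduced in a few lines. The sketch below uses the #KCA and #callmebaby popularity values from Table 1; the function and variable names are assumptions made for this illustration, and the rounding to one decimal place is added only to match the precision of the tables.

```python
# Weekly popularity (%) from week -nine (index 0) to week zero (last index),
# taken from Table 1 for two of the hashtags.
popularity = {
    "#KCA": [100.0, 100.0, 100.0, 100.0, 100.0, 93.0, 84.3, 85.2, 81.1, 75.9],
    "#callmebaby": [0.0, 0.0, 30.3, 20.9, 65.5, 98.0, 95.1, 89.1, 85.7, 87.3],
}


def change_zero_versus(series, weeks_back):
    """Popularity in week zero minus popularity weeks_back weeks earlier."""
    current = series[-1]
    earlier = series[-1 - weeks_back]
    return round(current - earlier, 1)


def change_row(series, lags=(9, 7, 5, 3)):
    """Bi-weekly change values followed by the week-zero popularity."""
    return [change_zero_versus(series, lag) for lag in lags] + [series[-1]]


for hashtag, series in popularity.items():
    print(hashtag, change_row(series))
```

Each printed row corresponds to one row of Table 2: the four changes relative to weeks -nine, -seven, -five and -three, followed by the popularity in week zero.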

3.2 Presidential Election 2012 Data

The created data set focuses on elections in the United States. We have selected the elections of 2012. The main reason for this selection is the importance and scope of the 2012 elections. They were a very large event in US history, and consisted of the following elections: (1) the 57th presidential election; (2) Senate elections; and (3) House of Representatives elections.

The first step in collecting the tweets of party members has been the creation of a list of Twitter accounts of members of parliament and of the most important members of the parties. Twitter has a feature called a Twitter list, with which one can create a collection of Twitter accounts for people to follow. Almost all parties share lists of party members, parliament members or party-related accounts. Such lists help to promote the party’s ideas, and make it easy to follow news related to the party. Also, some websites offer such lists for individuals to follow. In order to create our own list for each party, we merge all accounts that appear in those party lists into one list. The list created in this way is used to collect tweets.

The collection process has been done using the Twitter Search API. We have constructed a program, the twitter search collector, that periodically collects and stores tweets using eight different API keys. The program requests only tweets that have an ID higher than that of the last tweet collected during the previous run of the collector. The details regarding the number of tweets associated with each party are presented in Table 3.
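The incremental collection step described above (request only tweets with an ID higher than the last one stored) can be sketched as follows. This is not the authors’ collector: the class, the `search` callable, and the round-robin rotation of keys are hypothetical stand-ins, and an in-memory list replaces the real Twitter Search API so that the example is self-contained.

```python
from itertools import cycle


class TweetCollector:
    """Stores only tweets newer than the highest tweet ID seen so far."""

    def __init__(self, search, api_keys):
        # search is a hypothetical callable: (api_key, since_id) -> list of dicts
        self._search = search
        # Round-robin key rotation is an assumption; the text only says eight keys are used.
        self._keys = cycle(api_keys)
        self.last_id = 0
        self.stored = []

    def run_once(self):
        """Perform one collection pass and return the number of new tweets."""
        key = next(self._keys)
        fresh = [t for t in self._search(key, self.last_id) if t["id"] > self.last_id]
        if fresh:
            self.stored.extend(fresh)
            self.last_id = max(t["id"] for t in fresh)
        return len(fresh)


# In-memory stand-in for the remote service (invented data).
FAKE_TIMELINE = [
    {"id": 1, "text": "first"},
    {"id": 2, "text": "second"},
    {"id": 3, "text": "third"},
]


def fake_search(api_key, since_id):
    """Return the fake tweets whose ID is greater than since_id."""
    return [t for t in FAKE_TIMELINE if t["id"] > since_id]


collector = TweetCollector(fake_search, ["key-1", "key-2"])
print(collector.run_once(), collector.run_once())
```

Tracking the highest stored ID means that repeated passes never store the same tweet twice, which is the property the periodic collector relies on.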

Table 3 Collected tweets statistics

Party                      Number of tweets  Number of accounts
Republican                           95,193                 560
Democratic                           95,731                 361
Libertarian                          13,202                  63
Green                                 8,625                 175
Justice                              62,612                  43
Socialism and liberation              2,128                  16

4 Hashtag Data: Analysis of Popularity and Changes

The analysis of the hashtag popularity data is performed in two stages. Firstly, we look at clustering of popularity. The purpose here is to identify groups of hashtags that are characterized by similar popularity levels on a weekly basis. Secondly, we focus on information related to bi-weekly changes in the popularity levels. The results are illustrated using the CLUSPLOT methodology [17]. We are able to visualize clusters of hashtags despite the multi-dimensional nature of the data.

4.1 Popularity Analysis

The clustering of hashtag data (Table 1) has led to the identification of four clusters, Fig. 1. As mentioned above, the visualization of the clusters is done in a 2-dimensional space of the two most significant components of PCA. We also include a plot representing values of silhouette width, Fig. 2. The membership of hashtags in clusters is presented in Table 4. As we can see in the figures, the hashtags have been split into well-separated groups.

Fig. 1 Popularity of hashtags: visualization of clusters


Fig. 2 Popularity of hashtags: obtained silhouette values

In order to better comprehend each of the clusters, let us identify the most representative element of each of them. For this purpose we define a measure of misfit. Its value for a hashtag $i$ and a cluster $k$ is calculated with the following formula:

\[
\mathrm{misfit}_{i,k} = \frac{\max_{j \neq k}(\mu_{i,j})}{\mu_{i,k}}
\]

where $\mu_{i,k}$ is the membership value of $i$ in $k$, and $\max_{j \neq k}(\mu_{i,j})$ represents the maximum membership value of the hashtag $i$ in any cluster other than $k$. This allows us to identify the elements that are characterized by the minimum degree of “not-fitting” into a given cluster, Table 5. As can be seen, the identified hashtags fit their clusters closely (misfit values from 0.00 to 0.17). Based on their popularity values, we can identify a signature of each cluster. The popularity values of the most characteristic hashtag of each cluster are shown in Table 6. At this stage, we can describe each of these clusters using a linguistic description, Table 7.

Such analysis of hashtags provides us with a good understanding of their overall popularity. Linguistic descriptions give a glimpse of the popularity patterns that exist in the hashtags we investigate. The clusters can be used as indicators of which hashtags exhibit similar popularity ratings, and of the popularity pattern to which they belong.
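The misfit measure and the selection of the most representative hashtag of a cluster can be sketched as below. The membership values and hashtag names are invented for illustration (the chapter does not list the membership matrix), and the helper names are hypothetical.

```python
def misfit(memberships, k):
    """Largest membership in any other cluster divided by the membership in k."""
    best_other = max(mu for label, mu in memberships.items() if label != k)
    if memberships[k] == 0:
        return float("inf")  # the element does not belong to k at all
    return best_other / memberships[k]


def most_representative(table, k):
    """Among hashtags whose highest membership is in k, pick the lowest misfit."""
    members = [h for h, mus in table.items() if max(mus, key=mus.get) == k]
    return min(members, key=lambda h: misfit(table[h], k))


# Invented membership values for three hypothetical hashtags and three clusters.
example = {
    "#tagA": {"I": 0.90, "II": 0.05, "III": 0.05},
    "#tagB": {"I": 0.60, "II": 0.30, "III": 0.10},
    "#tagC": {"I": 0.10, "II": 0.80, "III": 0.10},
}

for tag in ("#tagA", "#tagB"):
    print(tag, round(misfit(example[tag], "I"), 3))
print(most_representative(example, "I"))
```

A misfit close to 0 means the element has almost no membership outside its cluster, while values approaching 1 flag elements that are nearly as strongly attached to another cluster.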

4.2 Analysis of Changes

As another way of analyzing temporal data, we look at the bi-weekly changes in the popularity of hashtags. Once again we perform clustering. After trying multiple ways of representing temporal aspects of the data, we have decided to look at the changes in popularity of hashtags in different weeks with reference to the current week. A few examples of the processed data are in Table 2. These data illustrate a tendency of


Table 4 Popularity-based clusters

Cluster I (marked with circle, Fig. 1): #KCA #gameinsight #FOLLOW #RT #arabicWord-02 #EXO #japaneaseWord-01 #FF #followback #RETWEET #FOLLOWTRICK #ipad #japaneaseWord-02 #android #ipadgames #TeamFollowBack #arabicWord-03 #love #TFB #iphone #video #free

Cluster II (marked with triangle, Fig. 1): #theyretheone #onedirection #Vote1DUK #vote5sos #VoteFifthHarmony #FifthHarmony #SoFantastic #voteonedirection #5SecondsOfSummer

Cluster III (marked with plus, Fig. 1): #arabicWord-01 #callmebaby #callme

Cluster IV (marked with x, Fig. 1): #AlwaysInOurHartsZaynMalik #THF #Nepal #Kathmandu #NepalQuake #japaneaseWord-03

Table 5 Popularity-based clusters: characteristic elements

| Cluster | Hashtag (misfit degree) |
|---|---|
| Cluster I (circle in Fig. 1) | #ipadgames (0.00) |
| Cluster II (triangle in Fig. 1) | #SoFantastic (0.17) |
| Cluster III (plus in Fig. 1) | #callmebaby (0.08) |
| Cluster IV (x in Fig. 1) | #Kathmandu (0.03) |

Table 6 Popularity of the characteristic hashtags of each cluster; columns give popularity by number of weeks in the past

| Hashtag | -nine | -eight | -seven | -six | -five | -four | -three | -two | -one | Zero |
|---|---|---|---|---|---|---|---|---|---|---|
| #callmebaby | 0.0 | 0.0 | 30.3 | 20.9 | 65.5 | 98.0 | 95.1 | 89.1 | 85.7 | 87.3 |
| #Kathmandu | 28.2 | 29.7 | 20.9 | 20.4 | 29.7 | 25.6 | 20.2 | 52.9 | 55.7 | 48.1 |
| #SoFantastic | 56.9 | 71.5 | 72.7 | 73.5 | 75.8 | 97.4 | 92.3 | 36.6 | 37.7 | 48.8 |
| #ipadgames | 85.1 | 85.0 | 85.6 | 85.8 | 83.2 | 84.1 | 85.8 | 87.3 | 86.0 | 84.5 |

Table 7 Linguistic descriptions of popularity-based clusters

| Cluster | Linguistic description |
|---|---|
| Cluster I (circle in Fig. 1) | constantly_high |
| Cluster II (triangle in Fig. 1) | moderate_high_low |
| Cluster III (plus in Fig. 1) | nothing_very-high_high |
| Cluster IV (x in Fig. 1) | low_low_moderate |

changes in popularity relative to the current popularity value. Overall, the purpose is to group hashtags not by their absolute popularity values, but by their bi-weekly changes relative to the current popularity values. The clustering process has resulted in six clusters. They are illustrated in Fig. 3. The details, i.e., the hashtag members of each cluster, are presented in Table 8. The clusters are quite “isolated” from each other, Fig. 4. The most characteristic hashtags for each cluster are included in Table 9. Once we have identified the most characteristic element of each cluster, we look at the values associated with each such hashtag: bi-weekly changes (Table 10) and weekly popularity (Table 11). Both tables provide an interesting insight into the patterns associated with each cluster. For example, for the hashtag #RETWEET we have values close to zero for the changes (Table 10, first row), and quite large values for the popularity (Table 11, first row). This prompts us to label this cluster as “constantly large”, Table 12. This table contains linguistic descriptions for all clusters. In the case of clustering bi-weekly changes of popularity, we also look at the hashtags with the highest values of the misfit measure. The list of elements with the highest misfit values is presented in Table 13. The table also lists the other cluster in which each of these elements has its highest membership value. This analysis allows us to form an opinion about borderline data points. The hashtags listed in Table 13 should be treated as potentially in transition between clusters.
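The misfit values reported in Table 13 appear to equal the ratio of a hashtag's largest membership in any other cluster to its membership in its own cluster (for example, 0.29 / 0.34 ≈ 0.85 for #KCA). The following minimal Python sketch works under that assumption; the function name `misfit` and the dictionary layout are hypothetical, and only the two membership values listed in Table 13 are used for each hashtag.

```python
def misfit(memberships, own_cluster):
    """Misfit of one hashtag, ASSUMED to be the ratio
    max_{j != k} mu_{i,j} / mu_{i,k} (inferred from the values in Table 13).

    memberships: dict mapping cluster label -> membership degree of the hashtag
    own_cluster: label k of the cluster the hashtag is assigned to
    """
    own = memberships[own_cluster]
    if own <= 0:
        raise ValueError("membership in the assigned cluster must be positive")
    # Largest membership of the hashtag in any cluster other than its own.
    best_other = max(v for label, v in memberships.items() if label != own_cluster)
    return best_other / own


# Membership values as listed in Table 13 (only two clusters are reported there).
kca = {"A": 0.34, "E": 0.29}
nepal = {"F": 0.34, "A": 0.28}

print(round(misfit(kca, "A"), 2))
print(round(misfit(nepal, "F"), 2))
```

A value close to 0 indicates a hashtag that sits firmly in its cluster, while a value close to 1 indicates a borderline hashtag whose second-best cluster is almost as good as its own.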

Fig. 3 Changes of hashtags’ popularity: visualization of clusters


Table 8 Popularity-change clusters

| Cluster | Hashtags |
|---|---|
| Cluster A (circle in Fig. 3) | #KCA #RT #RETWEET #TeamFollowBack #love #TFB #video |
| Cluster B (triangle in Fig. 3) | #theyretheone #onedirection #FifthHarmony #SoFantastic #AlwaysInOurHartsZaynMalik #5SecondsOfSummer |
| Cluster C (plus in Fig. 3) | #arabicWord-01 #callmebaby #callme |
| Cluster D (x in Fig. 3) | #Vote1DUK #vote5sos #VoteFifthHarmony #voteonedirection |
| Cluster E (diamond in Fig. 3) | #gameinsight #FOLLOW #arabicWord-02 #EXO #japaneaseWord-01 #FF #followback #FOLLOWTRICK #ipad #japaneaseWord-02 #android #ipadgames #arabicWord-03 #iphone #free |
| Cluster F (inv-triangle in Fig. 3) | #Nepal #Kathmandu #NepalQuake #japaneaseWord-03 |


Fig. 4 Changes of hashtags’ popularity: obtained silhouette values

Table 9 Popularity-change clusters: least misfit hashtags

| Cluster | Hashtag (misfit degree) |
|---|---|
| Cluster A (circle in Fig. 3) | #RETWEET (0.02) |
| Cluster B (triangle in Fig. 3) | #SoFantastic (0.03) |
| Cluster C (plus in Fig. 3) | #callme (0.03) |
| Cluster D (x in Fig. 3) | #Vote1DUK (0.02) |
| Cluster E (diamond in Fig. 3) | #followback (0.01) |
| Cluster F (inv-triangle in Fig. 3) | #japaneaseWord-03 (0.01) |

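Table 10 Characteristic hashtags of popularity-change clusters: changes in popularity (current week versus the given number of weeks in the past) and current popularity

| Hashtag | Zero versus nine | Zero versus seven | Zero versus five | Zero versus three | Zero |
|---|---|---|---|---|---|
| #RETWEET | −0.7 | −2.1 | 0.6 | −1.5 | 84.7 |
| #SoFantastic | −8.1 | −23.9 | −27 | −43.5 | 48.8 |
| #callme | 64.1 | 70 | −7.2 | 1.7 | 90.9 |
| #Vote1DUK | −51 | −53.9 | −56.7 | −27.3 | 43.3 |
| #followback | −0.3 | −4.1 | 1.9 | −5 | 83.9 |
| #japaneasW-3 | 44.1 | 44.1 | 33.3 | 39.1 | 45.1 |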
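The entries of Table 10 are consistent with simple differences between the current popularity (week zero) and the popularity a given number of weeks earlier, as listed in Table 11. The short Python sketch below illustrates this reading for #RETWEET; the function name `changes_vs_current` and the list layout (oldest week first) are assumptions made for illustration only.

```python
def changes_vs_current(popularity, lags=(9, 7, 5, 3)):
    """Return {lag: current popularity minus popularity `lag` weeks ago}.

    popularity: weekly values ordered from the oldest week (-nine) to the
    current week (zero), so popularity[-1] is the current value.
    """
    current = popularity[-1]
    return {lag: round(current - popularity[-1 - lag], 1) for lag in lags}


# Weekly popularity of #RETWEET from week -nine to week zero (Table 11).
retweet = [85.4, 86.0, 86.8, 86.1, 84.1, 84.8, 86.2, 89.9, 87.2, 84.7]

print(changes_vs_current(retweet))
```

For #RETWEET all four differences stay within about two points of zero while the popularity itself stays above 84, which matches the "constantly large" label assigned to its cluster.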

5 Analysis of Election Data

5.1 Hashtags and Keywords of Tweets

The collected tweets represent texts sent by individuals from different parties that support specific ideas, issues and concerns. Multiple different hashtags are used to

Table 11 Characteristic hashtags: their popularity values; columns give popularity by number of weeks in the past

| Hashtag | -nine | -eight | -seven | -six | -five | -four | -three | -two | -one | Zero |
|---|---|---|---|---|---|---|---|---|---|---|
| #RETWEET | 85.4 | 86.0 | 86.8 | 86.1 | 84.1 | 84.8 | 86.2 | 89.9 | 87.2 | 84.7 |
| #SoFantastic | 56.9 | 71.5 | 72.7 | 73.5 | 75.8 | 97.4 | 92.3 | 36.6 | 37.7 | 48.8 |
| #callme | 26.8 | 30.3 | 20.9 | 63.6 | 98.1 | 95.2 | 89.2 | 85.7 | 87.3 | 90.9 |
| #Vote1DUK | 94.3 | 99.6 | 97.2 | 96.0 | 100.0 | 75.1 | 70.6 | 76.1 | 67.0 | 43.3 |
| #followback | 84.2 | 84.2 | 88.0 | 87.2 | 82.0 | 82.8 | 88.9 | 93.6 | 90.5 | 83.9 |
| #japaneasW-3 | 1.0 | 1.0 | 1.0 | 6.0 | 11.8 | 13.9 | 6.0 | 6.1 | 37.5 | 45.1 |

Table 12 Popularity-change clusters: linguistic descriptions

| Cluster | Linguistic description |
|---|---|
| Cluster A (circle in Fig. 3) | Constantly large |
| Cluster B (triangle in Fig. 3) | Relatively high, rapid decline |
| Cluster C (plus in Fig. 3) | Rapid growth, then constant |
| Cluster D (x in Fig. 3) | Substantial recent decline |
| Cluster E (diamond in Fig. 3) | More-or-less constant |
| Cluster F (inv-triangle in Fig. 3) | Substantial recent growth |

Table 13 Popularity-change clusters: the most misfit hashtags

| Cluster | Hashtag | Misfit measure | Memberships |
|---|---|---|---|
| Cluster A (circle in Fig. 3) | #KCA | 0.85 | A: 0.34, E: 0.29 |
| Cluster B (triangle in Fig. 3) | #5SecondsOfSummer | 0.97 | A: 0.31, E: 0.30 |
| Cluster C (plus in Fig. 3) | #callmebaby | 0.09 (negligible) | |
| Cluster D (x in Fig. 3) | #voteonedirection | 0.06 (negligible) | |
| Cluster E (diamond in Fig. 3) | #japaneaseWord-02 | 0.81 | E: 0.52, A: 0.42 |
| Cluster E (diamond in Fig. 3) | #japaneaseWord-01 | 0.57 | E: 0.60, A: 0.34 |
| Cluster E (diamond in Fig. 3) | #EXO | 0.55 | E: 0.40, A: 0.22 |
| Cluster F (inv-triangle in Fig. 3) | #Nepal | 0.82 | F: 0.34, A: 0.28 |

represent those concepts. We process the tweets in order to extract the most essential hashtags. Further, we analyze the texts of the tweets and extract the most representative noun-phrases from them. The ‘significance’ of both hashtags and phrases is determined by a measure called document frequency: the number of tweets that include a given


Table 14 Hashtags and noun-phrases of tweets associated with parties

| Party | Hashtags and keywords |
|---|---|
| Republican | job, 4job, here, bill, obama, tax, hear, vote, american, discuss, presid, senat, us, budget, gop, tcot, pass, debt, work, energy |
| Democratic | obama2012, vote, job, romnei, obama, us, help, support, american, work, tax, women, bill, gop, act, health, family, right, nation, student, mitt, state, people, elect, plan, make |
| Libertarian | libertarian, liberti, parti, johnson, gari, independ, tcot, candid, stori, govgaryjohnson, vote, obama, freedom, teaparti, us, what, hess4governor, govern, ronpaul, romnei, 2012, presidenti, paul, help, redey, elect, free, election2012 |
| Green | green, parti, jillstein2012, greenparti, vote, stein, candid, jill, debat, us, gpu, presidenti, 2012, state, watch, support, regist, live, join |
| Justice | occupy, people, occupywallstreet, protest, polic, just, arrest, vote, teaparti, sai, support, call, obama, make, occupyseattl, street, occupyoakland, help, state, citi, solidar, syria, wiunion, tcot, march |
| Socialism and liberation | votepsl, petalindsai, candid, campaign, lindsai, presidenti, election2012, yari_nobord, live, people, us, social, fight, osorio, elect, polic, mikeprysn, protest |

hashtag/noun-phrase. The resulting sets of hashtags and phrases associated with each party are shown in Table 14. The union of all hashtags/phrases contains 94 different words. Among them, there is no single word that is found in all the hashtag/phrase sets associated with individual parties. However, a number of them are shared between multiple parties. The most common hashtags/phrases for multiple parties are presented in Table 15.
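Document frequency, as described above, counts the tweets that contain a term, so a term repeated within one tweet is counted once. A minimal Python sketch of this count follows; the function name and the toy tweets are hypothetical, and tokenization and stemming (which produce forms such as "romnei") are assumed to have been done beforehand.

```python
from collections import Counter


def document_frequency(tweets):
    """Count, for every term, the number of tweets that contain it.

    tweets: iterable of token lists (hashtags and noun-phrases already extracted).
    """
    df = Counter()
    for tokens in tweets:
        # set() ensures a term repeated inside one tweet is counted once.
        df.update(set(tokens))
    return df


# Toy data for illustration only; not taken from the collected tweets.
toy_tweets = [
    ["vote", "job", "vote"],
    ["job", "tax"],
    ["vote"],
]

print(document_frequency(toy_tweets))
```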

5.2 Analysis of Tweets

Let us take a closer look at the parties using a quantitative analysis of the hashtags/phrases associated with each of them. Let us focus on the main parties: Republican and Democratic. Table 16 contains the union of all their hashtags/phrases together with the document frequency values. The rows have been sorted by document frequency. The three highest values for each party are presented in bold font. Additionally, Fig. 5 graphically compares both parties from the point of view of the hashtags/phrases they use. Besides the hashtags/phrases that are unique to each party, a number of them are used by both. As can be seen in the first section of the plot, a few hashtags/phrases are used similarly (american, work, gop), while for some there is quite a difference (vote, job, bill).


Table 15 Hashtags and keywords: most common

| Used by | Hashtags and keywords |
|---|---|
| 5 parties | us vote |
| 4 parties | obama candid elect help presidenti romnei state support tcot |
| 3 parties | 2012 american bill election2012 gop job live make parti peopl polic protest tax teaparti work |

5.3 Party Signatures

A quantitative analysis seems to promise more interesting results. Here, we use fuzzy set based technologies in two ways to interpret and analyze data about hashtags: building a fuzzy set signature of a given party [19, 20], and comparing the signatures to determine similarities between them (next subsection). If we consider the hashtags/phrases together with their document frequencies for a given party (e.g., Table 16), we can think of the frequencies as indicators of the importance of hashtags. More broadly, we can think of them as expressing the importance of specific ideas/opinions/issues for party members. The idea of building a fuzzy set ‘Party Signature’ therefore seems natural. This is a set with membership values representing the degree of importance of a specific hashtag/noun-phrase (idea/opinion/issue) for party members. The proposed way of building such a fuzzy set is:

$$\text{Signature} = \left\{ \frac{a_1}{\text{hashtag}_1}, \frac{a_2}{\text{hashtag}_2}, \ldots, \frac{a_i}{\text{hashtag}_i}, \ldots, \frac{a_N}{\text{hashtag}_N} \right\} \tag{5.1}$$

where

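Table 16 Hashtags and keywords: Republican and Democratic parties (document frequency; the three highest values for each party are in bold)

| Hashtag/keyword | Republican | Democratic |
|---|---|---|
| vote | 3901 | **5844** |
| job | **8439** | **5512** |
| obama | 4211 | 5422 |
| us | 3446 | 4951 |
| american | 3621 | 3936 |
| work | 2955 | 3823 |
| tax | 4072 | 3176 |
| bill | 5284 | 2861 |
| gop | 3167 | 2763 |
| obama2012 | 0 | **6980** |
| romnei | 0 | 5503 |
| make | 0 | 4141 |
| support | 0 | 4080 |
| women | 0 | 3071 |
| plan | 0 | 2874 |
| act | 0 | 2609 |
| health | 0 | 2421 |
| elect | 0 | 2395 |
| right | 0 | 2379 |
| nation | 0 | 2330 |
| mitt | 0 | 2284 |
| peopl | 0 | 2248 |
| state | 0 | 2229 |
| student | 0 | 2188 |
| 4job | **6057** | 0 |
| here | **5410** | 0 |
| hear | 3963 | 0 |
| discuss | 3571 | 0 |
| presid | 3551 | 0 |
| senat | 3547 | 0 |
| budget | 3365 | 0 |
| tcot | 3163 | 0 |
| pass | 3146 | 0 |
| debt | 2978 | 0 |
| energi | 2735 | 0 |
| famili | 2411 | 0 |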


Fig. 5 Illustration of document frequency of hashtags and noun-phrases used in tweets of members of Republican and Democratic parties

$$a_i = \frac{\text{document frequency of hashtag}_i}{\text{maximum document frequency among all hashtags for the party}} \tag{5.2}$$

The value of the proposed membership degree is determined as the ratio of the document frequency of a given hashtag/phrase to the maximum document frequency among the hashtags/phrases used by the members of that party in the collected tweets. In other words, the value assigned to a single hashtag/phrase expresses the party’s degree of attention dedicated to this hashtag/phrase when compared to the most ‘popular’ hashtag/phrase, i.e., the hashtag/phrase with the highest value of document frequency. Another version of the membership can be obtained when the ratio uses the maximum document frequency of hashtags/phrases among all parties:

$$a_i = \frac{\text{document frequency of hashtag}_i}{\text{maximum document frequency among all hashtags for all parties}} \tag{5.3}$$

The Party Signatures are shown in Fig. 6. The figure presents the signatures of all six parties combined on a per hashtag/noun-phrase basis, i.e., a single row represents the degrees of membership of a given hashtag/noun-phrase for each party. Figure 6a has signatures calculated with reference to the maximum document frequency for a single party (Eq. 5.2), while Fig. 6b has signatures calculated with reference to the maximum document frequency among all parties (Eq. 5.3). In the latter case, we can observe the popularity of hashtags/phrases across all parties. The hashtags/phrases job, vote, obama, us are the most popular ones.
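The two normalizations in Eqs. 5.2 and 5.3 differ only in the denominator, which the following Python sketch makes explicit. The function name `party_signature` and its `reference_max` parameter are hypothetical. The document frequencies are a small excerpt of Table 16, and the "global" maximum is taken over these two parties only, since the frequencies for the other four parties are not listed here.

```python
def party_signature(doc_freq, reference_max=None):
    """Build a fuzzy 'Party Signature' as {term: membership degree}.

    doc_freq: {term: document frequency} for one party.
    reference_max: denominator of the ratio. If None, the party's own
    maximum is used (Eq. 5.2); pass a maximum taken over several parties
    to obtain the variant of Eq. 5.3.
    """
    ref = max(doc_freq.values()) if reference_max is None else reference_max
    return {term: freq / ref for term, freq in doc_freq.items()}


# Excerpt of Table 16 (document frequencies).
republican = {"job": 8439, "4job": 6057, "bill": 5284, "vote": 3901}
democratic = {"obama2012": 6980, "vote": 5844, "job": 5512, "bill": 2861}

# Eq. 5.2: each party is normalized by its own most frequent term.
rep_sig = party_signature(republican)
dem_sig = party_signature(democratic)

# Eq. 5.3, ASSUMING only these two parties: one shared denominator.
shared_max = max(max(republican.values()), max(democratic.values()))
dem_sig_shared = party_signature(democratic, shared_max)

print(rep_sig["vote"], dem_sig["vote"], dem_sig_shared["obama2012"])
```

With the per-party maximum, the most frequent term of every party gets membership 1; with a shared maximum, only the overall most frequent term does, which makes degrees comparable across parties.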


Fig. 6 Comparison of Party Signatures with the degree of membership calculated based on (a) the party maximum of document frequency, and (b) the global maximum of document frequency

5.4 Similarities Between Party Signatures

The main advantage of creating Party Signatures is the ability to apply fuzzy based technologies for their comparison. In this particular case, we look at two similarity measures. The first fuzzy similarity measure we use here is given by the following formula [21, 22]:

$$S_1 = 1 - \frac{\sum_{x \in X} |\mu_A(x) - \mu_B(x)|}{\sum_{x \in X} \left(\mu_A(x) + \mu_B(x)\right)} \tag{5.4}$$

This measure is simple to calculate; however, it does not perform well for values of $x$ that have a membership degree equal to zero. Therefore, we also use another similarity measure [21, 23], derived from intuitionistic fuzzy sets [24]:

$$S_2 = \frac{C(A, B)}{\sqrt{T(A) \cdot T(B)}} \tag{5.5}$$

where

$$T(A) = \sum_{x \in X} \left(\mu_A^2(x) + \nu_A^2(x)\right), \qquad C(A, B) = \sum_{x \in X} \left\{\mu_A(x) \cdot \mu_B(x) + \nu_A(x) \cdot \nu_B(x)\right\} \tag{5.6}$$

$$\nu_A(x) = 1 - \mu_A(x) \tag{5.7}$$

The component T (Eq. 5.6) is called the informational energy of a fuzzy set [25], and C (also in Eq. 5.6) is the correlation of ordinary fuzzy sets [23, 26]. The results of applying the fuzzy similarity measure S1 to party signatures are presented in Table 17. The signatures used here are determined on all 94 hashtags/noun-phrases that form the union of all hashtags/phrases extracted from the tweets of all parties [27–29]. The two highest values are marked in bold. As we can see, the concerns expressed by the same hashtags/noun-phrases lead to a relatively high similarity between the Republican and Democratic parties. The similarities between other parties seem relatively small. We also use another fuzzy similarity, S2, that takes into account zero values of membership degrees [30, 31]. The obtained results are displayed in Table 18. The fact that there is a small number of common hashtags/noun-phrases means that many values of the party signatures are zero. This results in high similarity values. Please note that there are some differences in the pairs of party signatures with top similarity values. A closer analysis of both results leads us to the conclusion that the fuzzy similarity measure S1 seems to be more adequate for comparison of fuzzy signatures that contain a significant number of zeros. In order to better understand similarities between party signatures, we also determine similarity using only the most common 26 hashtags/noun-phrases [32, 33],

Table 17 Similarity based on all hashtags: global normalization, S1

| | Rep. | Dem. | Lib. | Green | Justice | S&L |
|---|---|---|---|---|---|---|
| Rep. | | **0.4831** | 0.1206 | 0.0657 | 0.0908 | 0.0405 |
| Dem. | **0.4831** | | 0.1088 | 0.0926 | 0.1822 | 0.0650 |
| Lib. | 0.1206 | 0.1088 | | **0.2254** | 0.1419 | 0.1145 |
| Green | 0.0657 | 0.0926 | **0.2254** | | 0.0838 | 0.1315 |
| Just | 0.0908 | 0.1822 | 0.1419 | 0.0838 | | 0.0641 |
| S&L | 0.0405 | 0.0650 | 0.1145 | 0.1315 | 0.0641 | |

Table 18 Similarity based on all hashtags: global normalization, S2

| | Rep. | Dem. | Lib. | Green | Justice | S&L |
|---|---|---|---|---|---|---|
| Rep. | | 0.9360 | 0.8980 | 0.8985 | 0.8921 | 0.8726 |
| Dem. | 0.9360 | | 0.8877 | 0.8899 | 0.8947 | 0.8614 |
| Lib. | 0.8980 | 0.8877 | | 0.9424 | 0.9269 | 0.9166 |
| Green | 0.8985 | 0.8899 | 0.9424 | | 0.9296 | 0.9240 |
| Just | 0.8921 | 0.8947 | 0.9269 | 0.9296 | | 0.9075 |
| S&L | 0.8726 | 0.8614 | 0.9166 | 0.9240 | 0.9075 | |


Table 19 Similarity based on selected hashtags: IND. normalization, S1

| | Rep. | Dem. | Lib. | Green | Justice | S&L |
|---|---|---|---|---|---|---|
| Rep. | | 0.6900 | 0.2430 | 0.1271 | 0.1751 | 0.0771 |
| Dem. | 0.6900 | | 0.1703 | 0.1378 | 0.2778 | 0.0981 |
| Lib. | 0.2430 | 0.1703 | | 0.5956 | 0.3545 | 0.2805 |
| Green | 0.1271 | 0.1378 | 0.5956 | | 0.2025 | 0.3112 |
| Just | 0.1751 | 0.2778 | 0.3545 | 0.2025 | | 0.1464 |
| S&L | 0.0771 | 0.0981 | 0.2805 | 0.3112 | 0.1464 | |

Table 19. Here, we observe interesting similarities between parties other than Republican and Democratic.
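The two similarity measures of Eqs. 5.4 to 5.7 can be sketched in a few lines of Python. The function names `s1` and `s2` and the toy signatures are hypothetical; signatures are assumed to be given as equally long lists of membership degrees over the same ordered set of hashtags/phrases.

```python
import math


def s1(a, b):
    """Similarity of Eq. 5.4: 1 - sum|a - b| / sum(a + b)."""
    denom = sum(x + y for x, y in zip(a, b))
    if denom == 0:
        raise ValueError("S1 is undefined when all membership degrees are zero")
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / denom


def s2(a, b):
    """Similarity of Eq. 5.5, with nu = 1 - mu as in Eq. 5.7."""
    na = [1.0 - x for x in a]
    nb = [1.0 - y for y in b]
    corr = sum(x * y + u * v for x, y, u, v in zip(a, b, na, nb))  # C(A, B)
    ta = sum(x * x + u * u for x, u in zip(a, na))                  # T(A)
    tb = sum(y * y + v * v for y, v in zip(b, nb))                  # T(B)
    return corr / math.sqrt(ta * tb)


# Two toy signatures with no hashtag in common.
a = [1.0, 0.0, 0.0, 0.0]
b = [0.0, 1.0, 0.0, 0.0]
print(s1(a, b), s2(a, b))

# The same signatures padded with six more shared zeros.
a_long = a + [0.0] * 6
b_long = b + [0.0] * 6
print(s1(a_long, b_long), s2(a_long, b_long))
```

In this toy case S1 stays at 0 regardless of padding, whereas S2 grows as more shared zeros are added, because every position where both memberships are zero contributes 1 to C(A, B). This is in line with the uniformly high values in Table 18.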

6 Conclusion

The importance of analyzing data generated by the users of social networks is unquestionable. These data are a source of valuable and unique information that reflects the true need for the analysis: a better understanding of events and facts important to the users, and of their opinions about them. The analysis of temporal aspects of hashtags presented here, covering their popularity over time and the changes in this popularity, is an attempt to look at the dynamic nature of user-generated data. The application of fuzzy clustering shown here provides a number of interesting benefits related to the fact that the categorization of hashtags is not crisp. The further investigation of fuzzy-based measures leads to interesting conclusions. The construction of fuzzy signatures based on the frequency of occurrence of hashtags is an interesting approach to expressing the importance of opinions and issues represented via tweets’ hashtags and noun-phrases. A simple process of constructing such signatures is presented here. Once the signatures are obtained, they are used to compare the importance of opinions/issues articulated by the groups of individuals represented by the signatures. These processes have been applied to tweets representing the US elections of 2012.

References

1. https://twitter.com/. Accessed 8th May 2015
2. L.A. Zadeh, Fuzzy sets. Inf. Control 8, 338–353 (1965)
3. http://dictionary.reference.com. Accessed 8th May 2015
4. http://www.wikipedia.org. Accessed 8th May 2015
5. G. Klir, B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications (Prentice Hall, 1995)
6. W. Pedrycz, F. Gomide, Fuzzy Systems Engineering: Toward Human-Centric Computing (Wiley-IEEE Press, 2007)
7. A. Chaturvedi, P. Green, J. Carroll, K-modes clustering. J. Classif. 18(1), 35–55 (2001)
8. A.K. Jain, M.N. Murty, P.J. Flynn, Data clustering: a review. ACM Comput. Surv. 31(3), 264–323 (1999)
9. A.K. Jain, Data clustering: 50 years beyond K-means. Pattern Recogn. Lett. 31(8), 651–666 (2010)
10. K.L. Wu, M.S. Yang, Alternative c-means clustering algorithms. Pattern Recogn. 35, 2267–2278 (2002)
11. A. Baraldi, P. Blonda, A survey of fuzzy clustering algorithms for pattern recognition. IEEE Trans. Syst. Man, Cybern. Part B Cybern. 29(6), 778–785 (1999)
12. J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms (Plenum Press, New York, 1981)
13. Z.X. Huang, M.K. Ng, A fuzzy k-modes algorithm for clustering categorical data. IEEE Trans. Fuzzy Syst. 7(4), 446–452 (1999)
14. L. Kaufman, P.J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis (Wiley, New York, 1990)
15. http://www.r-project.org. Accessed 8th May 2015
16. P.J. Rousseeuw, Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987)
17. G. Pison, A. Struyf, P.J. Rousseeuw, Displaying a clustering with CLUSPLOT. Comput. Stat. Data Anal. 30, 381–392 (1999)
18. I.T. Jolliffe, Principal Component Analysis, 2nd edn. (Springer, Berlin, 2002)
19. M.Z. Reformat, R.R. Yager, Using tagging in social networks to find groups of compatible users, in 2013 IFSA-NAFIPS Joint Congress, Edmonton, Canada, 24–28 June 2013
20. R.R. Yager, M.Z. Reformat, Looking for like-minded individuals in social networks using tagging and fuzzy sets. IEEE Trans. Fuzzy Syst. 21(4), 672–687 (2013)
21. A. Pal, B. Mondal, N. Bhattacharyya, S. Raha, Similarity in fuzzy systems. J. Uncertainty Anal. Appl. 2(1) (2014)
22. C.P. Pappis, N.I. Karacapilidis, A comparative assessment of measures of similarity of fuzzy values. Fuzzy Sets Syst. 56(2), 171–174 (1993)
23. T. Gerstenkorn, J. Manko, Correlation of intuitionistic fuzzy sets. Fuzzy Sets Syst. 44(1), 39–43 (1991)
24. K.T. Atanassov, Intuitionistic fuzzy sets. Fuzzy Sets Syst. 20(1), 87–96 (1986)
25. D. Dumitrescu, A definition of an informational energy in fuzzy sets theory. Stud. Univ. Babes-Bolyai Math. 22(2), 57–59 (1977)
26. D. Dumitrescu, Fuzzy correlation. Stud. Univ. Babes-Bolyai Math. 23, 41–44 (1978)
27. F.Y. Cao, J.Y. Liang, L. Bai et al., A framework for clustering categorical time-evolving data. IEEE Trans. Fuzzy Syst. 18(5), 872–882 (2010)
28. S.N. Shahbazova, Development of the knowledge base learning system for distance education. Int. J. Intell. Syst. 27(4), 343–354 (2012)
29. S.N. Shahbazova, Application of fuzzy sets for control of student knowledge. Appl. Comput. Math. Int. J. 10(1), 195–208 (2011). ISSN 1683–3511. (Special issue on fuzzy set theory and applications)
30. O. Kosheleva, S.N. Shahbazova, “Fuzzy” multiple-choice quizzes and how to grade them. J. Uncertain Syst. 8(3), 216–221 (2014). Online at: www.jus.org.uk
31. A.M. Abbasov, S.N. Shahbazova, Informational modeling of the behavior of a teacher in the learning process based on fuzzy logic. Int. J. Intell. Syst. 31(1), 3–18 (2015)
32. S.N. Shahbazova, Modeling of creation of the complex on intelligent information systems learning and knowledge control (IISLKC). Int. J. Intell. Syst. 29(4), 307–319 (2014)
33. L.A. Zadeh, A.M. Abbasov, S.N. Shahbazova, Fuzzy-based techniques in human-like processing of social network data. Int. J. Uncertainty, Fuzziness Knowl. Based Syst. 23(1), 1–14 (2015)

Why Triangular and Trapezoid Membership Functions: A Simple Explanation

Vladik Kreinovich, Olga Kosheleva and Shahnaz N. Shahbazova

Abstract In principle, in applications of fuzzy techniques, we can have different complex membership functions. In many practical applications, however, it turns out that to get a good quality result – e.g., a good quality control – it is sufficient to consider simple triangular and trapezoid membership functions. There exist explanations for this empirical phenomenon, but the existing explanations are rather mathematically sophisticated and are, thus, not very intuitively clear. In this paper, we provide a simple – and thus, more intuitive – explanation for the ubiquity of triangular and trapezoid membership functions.

Keywords Fuzzy logic · Triangular membership function · Trapezoid membership function

V. Kreinovich (B) · O. Kosheleva, University of Texas at El Paso, 500 W. University, El Paso, TX 79968, USA
S. N. Shahbazova, Azerbaijan Technical University, Baku, Azerbaijan
© Springer Nature Switzerland AG 2020. S. N. Shahbazova et al. (eds.), Recent Developments in Fuzzy Logic and Fuzzy Sets, Studies in Fuzziness and Soft Computing 391, https://doi.org/10.1007/978-3-030-38893-5_2

1 Ubiquity of Triangular and Trapezoid Membership Functions

Why fuzzy sets and membership functions: reminder. In the traditional 2-valued logic, every property is either true or false. Thus, if we want to formally describe a property like “small” in the traditional logic, then every value will be either small or not small. This may sound reasonable until one realizes that, as a result, we have a threshold value t separating small values from non-small ones:

• every value below $t$ is small, while
• every value above $t$ is not small.

This means that, for any small $\varepsilon > 0$ – e.g., for $\varepsilon = 10^{-10}$:

• the value $t - \varepsilon$ is small, while
• a practically indistinguishable value $t + \varepsilon$ is not small.

This does not seem reasonable at all. To make a formalization of properties like “small” more reasonable, Lotfi Zadeh proposed:

• instead of deciding which value is small and which is not,
• to assign, to each possible value $x$ of the corresponding quantity, a degree $\mu(x)$ to which, according to the expert, this value is small (or, more generally, to what extent this value satisfies the corresponding property); see, e.g., [1, 2, 5, 7–9].

This degree can be determined by asking an expert to mark this degree on a scale from 0 to 1. Alternatively, we can ask the expert to mark this degree, e.g., on a scale from 0 to 10, and then divide the resulting marked degree by 10. Thus, we get the values $\mu(x) \in [0, 1]$ corresponding to different values $x$. The function $\mu(x)$ assigning this degree to each possible value is known as a membership function, or, alternatively, a fuzzy set. Intuitively, we expect that if two values $x$ and $x'$ are close, then the expert will assign similar degrees to these two values, i.e., that the degrees $\mu(x)$ and $\mu(x')$ will also be close.

Triangular and trapezoid membership functions. In the beginning, practitioners applying fuzzy techniques dutifully followed the definition of a membership function. Namely, for each imprecise property like “small”, they asked experts, for many different values of $x$, what their degrees $\mu(x)$ are. Surprisingly, it turned out that from the application viewpoint, all this activity was wasted: whether we talk about control or planning or whatever other activity, the quality of the result usually does not change if we replace the elicited membership function with a simple piecewise-linear one that has the shape of a triangle or the shape of a trapezoid.
Specifically, a triangular membership function has the following form, for some parameters $\tilde{x}$ and $\Delta > 0$:

• $\mu(x) = 0$ for $x \le \tilde{x} - \Delta$;
• $\mu(x) = \dfrac{x - (\tilde{x} - \Delta)}{\Delta}$ for $\tilde{x} - \Delta \le x \le \tilde{x}$;
• $\mu(x) = \dfrac{(\tilde{x} + \Delta) - x}{\Delta}$ for $\tilde{x} \le x \le \tilde{x} + \Delta$, and
• $\mu(x) = 0$ for $x \ge \tilde{x} + \Delta$.

Similarly, a trapezoid membership function has the following form, for some parameters $\tilde{x}$, $\delta$, and $\Delta$, for which $0 < \delta < \Delta$:

• $\mu(x) = 0$ for $x \le \tilde{x} - \Delta$;
• $\mu(x) = \dfrac{x - (\tilde{x} - \Delta)}{\Delta - \delta}$ for $\tilde{x} - \Delta \le x \le \tilde{x} - \delta$;
• $\mu(x) = 1$ when $\tilde{x} - \delta \le x \le \tilde{x} + \delta$;
• $\mu(x) = \dfrac{(\tilde{x} + \Delta) - x}{\Delta - \delta}$ for $\tilde{x} + \delta \le x \le \tilde{x} + \Delta$, and
• $\mu(x) = 0$ for $x \ge \tilde{x} + \Delta$.

Why do triangular and trapezoid membership functions work so well? To most people, the surprising empirical success of triangular and trapezoid membership functions was very unexpected. The only person who was not very surprised was Lotfi A. Zadeh himself, since he always had an intuition that in many cases, the simplest – and thus, intuitively clearest – methods work the best:

• when we use simple, intuitively clear methods, we utilize both the formulas and our intuition, while
• when we use complex, difficult-to-intuitively-understand methods, we have to rely only on formulas, we cannot use our intuition – and thus, our results are often worse.

On the qualitative level, this is a reasonable explanation. However, it is desirable to also have a more convincing, quantitative explanation of why triangular and trapezoid membership functions work so well.

There already are explanations for this empirical phenomenon, but they are not very intuitive. In our previous papers [3, 4], we have provided quantitative explanations for the surprising empirical success of triangular and trapezoid membership functions. These explanations are based either on the general ideas of signal processing (and related wavelets) or on the type-2 fuzzy analysis of the problem. From the mathematical viewpoint, both explanations seem to be reasonable. But, honestly, would Lotfi Zadeh – if he were still alive – be fully happy with these explanations? We do not think so. He would complain that these explanations are too complex and thus, not very intuitive. Would it be possible – he would ask (as he asked in many similar situations) – to come up with a simpler, more intuitive explanation, an explanation where we would be able to support the corresponding mathematics by intuitive commonsense understanding?

What we do in this paper. In this paper, we provide such a simple, reasonably intuitive explanation. Is this a final answer? Probably not. Maybe an even simpler and even more intuitive explanation is possible. However, the new explanation – motivated by Zadeh’s quest for simplicity – is already much simpler than the explanations that we had before, so we decided to submit it for publication.
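The piecewise definitions of the triangular and trapezoid membership functions given earlier in this section translate directly into code. The following Python sketch is one possible rendering; the function and parameter names (`center` for the midpoint, `half_width` for Δ, `core` for δ) are chosen here for illustration and are not taken from the text.

```python
def triangular(x, center, half_width):
    """Triangular membership function with peak at `center` and support
    [center - half_width, center + half_width]; requires half_width > 0."""
    if x <= center - half_width or x >= center + half_width:
        return 0.0
    if x <= center:
        return (x - (center - half_width)) / half_width
    return ((center + half_width) - x) / half_width


def trapezoid(x, center, core, half_width):
    """Trapezoid membership function equal to 1 on [center - core, center + core]
    and 0 outside [center - half_width, center + half_width];
    requires 0 < core < half_width."""
    if x <= center - half_width or x >= center + half_width:
        return 0.0
    if x < center - core:
        return (x - (center - half_width)) / (half_width - core)
    if x <= center + core:
        return 1.0
    return ((center + half_width) - x) / (half_width - core)


print(triangular(4.0, center=5.0, half_width=2.0))
print(trapezoid(3.0, center=5.0, core=1.0, half_width=3.0))
```

Both functions are continuous and piecewise linear, so a handful of parameters replaces the many expert-elicited degrees mentioned above.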

2 Main Idea

As we have mentioned earlier, one of the main motivations for fuzzy techniques was the need to make sure that if the values $x$ and $x'$ are close, then the corresponding membership degrees $\mu(x)$ and $\mu(x')$ should also be close. How can we formalize this idea? When $x$ and $x'$ are close, i.e., when $x' = x + \Delta x$ for some small $\Delta x$, then the difference $\mu(x') - \mu(x) = \mu(x + \Delta x) - \mu(x)$ between the corresponding values of the membership function can be – at least for smooth membership functions – represented as $\mu'(x) \cdot \Delta x + o(\Delta x)$. Therefore, the requirement that this difference is small is equivalent to requiring that the absolute value of the derivative $|\mu'(x)|$ is small. There are two ways to formalize this requirement:

• we can require that the worst-case value of this derivative is small, or
• we can require that the average – e.g., mean squared – value of this derivative is small.

The smaller the corresponding characteristic, the more the resulting membership function is in line with the original fuzzy idea. Thus, it is reasonable, for both formalizations, to select a membership function for which the value of the corresponding characteristic is the smallest possible. Let us show that in both cases, this idea leads to triangular and trapezoid membership functions.

3 First Formalization and the First Set of Results

Definition 1 Let $\underline{x} < \overline{x}$ be two real numbers. For each continuous almost everywhere differentiable function $\mu(x)$ defined on the interval $[\underline{x}, \overline{x}]$, let us define its worst-case non-fuzziness degree $D_w(\mu)$ as

$$D_w(\mu) = \max_{x \in [\underline{x}, \overline{x}]} |\mu'(x)|.$$

Comment. The requirement that the membership function is almost everywhere differentiable is needed so that we can define the largest value of the derivative. This requirement is not as restrictive as it may seem, since usually, membership functions are piecewise-monotonic, and it is known that all monotonic functions – and thus, all piecewise-monotonic functions – are almost everywhere differentiable.

Proposition 1 Among all continuous almost everywhere differentiable functions $\mu(x)$ defined on the interval $[\underline{x}, \overline{x}]$ for which $\mu(\underline{x}) = 0$ and $\mu(\overline{x}) = 1$, the following linear function has the smallest worst-case non-fuzziness degree:

$$\mu(x) = \frac{x - \underline{x}}{\overline{x} - \underline{x}}.$$

Comment. For the reader’s convenience, all the proofs are placed in a special Proofs section.

Proposition 2 Among all continuous almost everywhere differentiable functions $\mu(x)$ defined on the interval $[\underline{x}, \overline{x}]$ for which $\mu(\underline{x}) = 1$ and $\mu(\overline{x}) = 0$, the following linear function has the smallest worst-case non-fuzziness degree:

$$\mu(x) = \frac{\overline{x} - x}{\overline{x} - \underline{x}}.$$

Discussion. Thus, if we assume that $\mu(x) = 0$ for all $x \notin [\tilde{x} - \Delta, \tilde{x} + \Delta]$ and $\mu(\tilde{x}) = 1$, then, due to Propositions 1 and 2, the most fuzzy membership function – i.e., the function with the smallest possible worst-case non-fuzziness degree – will be the corresponding triangular function. Similarly, if we assume that $\mu(x) = 0$ for all $x \notin [\tilde{x} - \Delta, \tilde{x} + \Delta]$ and $\mu(x) = 1$ for all $x \in [\tilde{x} - \delta, \tilde{x} + \delta]$, then, due to Propositions 1 and 2, the most fuzzy membership function – i.e., the function with the smallest possible worst-case non-fuzziness degree – will be the corresponding trapezoid function. Thus, we indeed get a reasonably simple explanation for the ubiquity of triangular and trapezoid membership functions.

4 Second Formalization and the Second Set of Results

Definition 2 Let $\underline{x} < \overline{x}$ be two real numbers. For each continuous almost everywhere differentiable function μ(x) defined on the interval $[\underline{x}, \overline{x}]$, let us define its average non-fuzziness degree $D_a(\mu)$ as
$$D_a(\mu) = \frac{1}{\overline{x} - \underline{x}} \cdot \int_{\underline{x}}^{\overline{x}} (\mu'(x))^2 \, dx.$$

Proposition 3 Among all continuous almost everywhere differentiable functions μ(x) defined on the interval $[\underline{x}, \overline{x}]$ for which $\mu(\underline{x}) = 0$ and $\mu(\overline{x}) = 1$, the following linear function has the smallest average non-fuzziness degree:
$$\mu(x) = \frac{x - \underline{x}}{\overline{x} - \underline{x}}.$$

Proposition 4 Among all continuous almost everywhere differentiable functions μ(x) defined on the interval $[\underline{x}, \overline{x}]$ for which $\mu(\underline{x}) = 1$ and $\mu(\overline{x}) = 0$, the following linear function has the smallest average non-fuzziness degree:
$$\mu(x) = \frac{\overline{x} - x}{\overline{x} - \underline{x}}.$$

Discussion. Thus, similarly to the previous section, we can show that the most fuzzy membership functions are triangular and trapezoid ones.
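One way to see why the linear function minimizes the average degree in Proposition 3 is the Cauchy–Schwarz inequality. The following is only a sketch, under the assumption that μ is regular enough for the formula relating integration and differentiation to apply (e.g., piecewise continuously differentiable); it is not necessarily the argument used in the Proofs section.

```latex
\[
1 = \mu(\overline{x}) - \mu(\underline{x})
  = \int_{\underline{x}}^{\overline{x}} \mu'(x)\,dx
  \le \sqrt{\overline{x} - \underline{x}} \cdot
      \sqrt{\int_{\underline{x}}^{\overline{x}} (\mu'(x))^2\,dx},
\]
hence
\[
D_a(\mu) = \frac{1}{\overline{x} - \underline{x}}
           \int_{\underline{x}}^{\overline{x}} (\mu'(x))^2\,dx
  \ge \frac{1}{(\overline{x} - \underline{x})^2}.
\]
```

Equality in the Cauchy–Schwarz step holds exactly when μ′ is constant almost everywhere, and the linear function of Proposition 3 indeed attains the value $1/(\overline{x} - \underline{x})^2$. The case of Proposition 4 is the same with the sign of μ′ reversed.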


V. Kreinovich et al.

5 Proofs

Proof of Proposition 1. For the linear function μ(x), we have
$$\mu'(x) = K \stackrel{\text{def}}{=} \frac{1}{\overline{x} - \underline{x}}$$
for all x and thus, $D_w(\mu) = K$. Let us prove:
• that we cannot have a smaller value of $D_w(\mu)$, and
• that the only function with this value of worst-case degree of non-fuzziness is the linear function.
In other words, let us prove:
• that we cannot have |μ′(x)| < K for all x, and
• moreover, that we cannot have |μ′(x)| ≤ K for all x and μ′(x) < K for some x.
Indeed, due to the known formula relating integration and differentiation, we have
$$\mu(\overline{x}) - \mu(\underline{x}) = 1 - 0 = \int_{\underline{x}}^{\overline{x}} \mu'(x)\,dx.$$
If we had |μ′(x)| ≤ K (hence μ′(x) ≤ K) for all x and μ′(x) < K for some x, then we would have
$$1 = \int_{\underline{x}}^{\overline{x}} \mu'(x)\,dx$$
0.23 is 0.092 at time = t + 1. Cluster analysis is also useful when we examine how the possible target values are obtained according to their drivers. In this context, we may apply both ordinary and fuzzy clustering techniques [16, 17]. Table 7 provides an example of the cluster centers obtained with Matlab’s fuzzy subtractive clustering tool when four clusters are created with our data [5]. Hence, for example, according to the first cluster center, we may reason that if initially C2 ≈ 0.500 and C3 ≈ 0.633 and C4 ≈ 0.527 and C5 ≈ 0.511 and C7 ≈ 0.472 and C8 ≈ 0.376 and C10 ≈ 0.671, then we will obtain C1 ≈ 0.208. In fact, these clusters and their centers may be used in fuzzy rule-based reasoning models if we aim at predicting the target values according to their drivers [2, 8, 29]. This widely adopted approach is thus a fluent nonlinear method for predicting the transformed values directly from the initial concept vectors. Discriminant analysis, in turn, is an example of the corresponding statistical method [16, 17].

Statistical Approach to Fuzzy Cognitive Maps

Fig. 12 Transformed values of concept C1 when lambda is 1 and 5 versus their untransformed values. Also, the probabilities of concept values of C1 being above 0.23 according to the nontransformed values of C1 (horizontal axis: Concept1, from −3 to 1; vertical axis: from 0 to 1; series: Lambda=1, Lambda=5, Probability)

Table 7 Examples of cluster centers of the drivers and the corresponding target values

Concepts at time = t    Cluster center 1   Cluster center 2   Cluster center 3   Cluster center 4
C2                      0.500              0.460              0.613              0.710
C3                      0.633              0.459              0.287              0.050
C4                      0.527              0.456              0.280              0.405
C5                      0.511              0.499              0.148              0.768
C7                      0.472              0.564              0.586              0.293
C8                      0.376              0.679              0.210              0.726
C10                     0.671              0.283              0.314              0.702
C1 at time = t + 1      0.208              0.263              0.454              0.184
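The idea of using the cluster centers of Table 7 as rules can be illustrated with a small sketch. It is not the Matlab model used in the chapter: the Gaussian weighting, the width `sigma`, and the function name `predict_c1` are assumptions made only for illustration. Each cluster center acts as one rule "if the drivers are close to this center, then C1 is close to the corresponding target value", and the prediction is the weighted average of the rule consequents.

```python
import math

# Cluster centers from Table 7: driver concepts C2, C3, C4, C5, C7, C8, C10 at time t.
CENTERS = [
    [0.500, 0.633, 0.527, 0.511, 0.472, 0.376, 0.671],
    [0.460, 0.459, 0.456, 0.499, 0.564, 0.679, 0.283],
    [0.613, 0.287, 0.280, 0.148, 0.586, 0.210, 0.314],
    [0.710, 0.050, 0.405, 0.768, 0.293, 0.726, 0.702],
]
# Corresponding values of the target concept C1 at time t + 1.
TARGETS = [0.208, 0.263, 0.454, 0.184]


def predict_c1(drivers, sigma=0.1):
    """Weighted average of the rule consequents.

    The weight of each rule is a Gaussian of the Euclidean distance between
    `drivers` and the cluster center; `sigma` is an assumed width.
    """
    sq_dists = [sum((d - c) ** 2 for d, c in zip(drivers, center)) for center in CENTERS]
    nearest = min(sq_dists)  # subtracting it keeps the largest weight at 1
    weights = [math.exp(-(s - nearest) / (2.0 * sigma ** 2)) for s in sq_dists]
    return sum(w * t for w, t in zip(weights, TARGETS)) / sum(weights)


# The first cluster center itself should reproduce (almost exactly) its target value.
print(round(predict_c1(CENTERS[0]), 3))
```

With a small `sigma` the prediction behaves like a nearest-center lookup; a larger `sigma` blends the rules more smoothly.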

We may also apply the multinomial regression models in this context in which case the categorical response variable may have more than two values [16–18]. An example of this is provided below.


V. A. Niskanen

The foregoing statistical methods provide both novel and supplementary information on FCMs, and they are based on widely used stochastic estimates and reasoning. In this manner, we may avoid certain subjective and ad hoc conclusions when we interpret or fine-tune our FCM models. We will apply these ideas to an empirical example below.

3.3 The Liquid Tank Model

Our suggested methods seem useful within the human sciences, which are the focus of this study. But we may also apply these methods to other fields, and thus our example considers a well-known control application, because fuzzy control models still play a central role in the fuzzy community. Our empirical example considers the control application presented in [21, 23]. In this model two valves, valve 1 and valve 2, supply different liquids into the tank. These liquids are mixed for a certain chemical reaction, and our goal is to maintain the desired liquid level (amount of liquid) and specific liquid gravity in the tank. The third valve, valve 3, is used to drain liquid from the tank. This FCM model applies the connection matrix presented in Table 8, which is given in [23], and now the preceding values of the target concepts are also used in the simulations (self-loops, thus the diagonal values in the matrix are 1). The transformed concept values use Formula (2.2) with lambda = 1 as in the original model. Table 9 presents the possible concept values prior to and after the transformations when 10,000 random initial concept vectors were used as above. Figures 13 and 14 depict the corresponding graph (without self-loops) and the concept values in ten iterations, respectively. The model yields fixed points for the concepts. When stepwise linear regression analyses were applied to such models, in which each concept in turn was the response variable and the other concepts acted as the possible predictors, we obtained the simplified connection matrix in Table 10, which only contains the statistically significant drivers of each target concept. For example, Gravity was an insignificant driver of Liquid level. We notice that our final regression models yielded very high R-square values.
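The simulation described above can be sketched as follows. This is a minimal sketch under two assumptions that are not stated in this section: that Formula (2.2) is the logistic function 1/(1 + exp(−lambda · x)), and that the entry in row i and column j of Table 8 is the weight from concept i to concept j. Both assumptions are consistent with the ranges reported in Table 9. The names `fcm_step` and `simulate` are hypothetical.

```python
import math

CONCEPTS = ["Liquid level", "Valve 1", "Valve 2", "Valve 3", "Gravity"]

# Connection matrix from Table 8 (self-loops on the diagonal).
# Assumption: W[i][j] is the weight of the influence of concept i on concept j.
W = [
    [1.000, -0.207, -0.112, 0.064, 0.264],
    [0.298, 1.000, 0.061, 0.069, 0.067],
    [0.356, 0.062, 1.000, 0.063, 0.061],
    [-0.516, 0.070, 0.063, 1.000, 0.068],
    [0.064, 0.468, 0.060, 0.268, 1.000],
]


def sigmoid(z, lam=1.0):
    """Assumed form of the transformation function with steepness lam."""
    return 1.0 / (1.0 + math.exp(-lam * z))


def fcm_step(state, weights=W, lam=1.0):
    """One iteration: weighted sum over the drivers, then the transformation."""
    n = len(state)
    return [
        sigmoid(sum(state[i] * weights[i][j] for i in range(n)), lam)
        for j in range(n)
    ]


def simulate(state, steps=10, lam=1.0):
    """Return the list of states, starting with the initial one (iteration 0)."""
    history = [list(state)]
    for _ in range(steps):
        history.append(fcm_step(history[-1], lam=lam))
    return history


final = simulate([0.5] * 5, steps=10)[-1]
for name, value in zip(CONCEPTS, final):
    print(f"{name}: {value:.3f}")
```

Under these assumptions, different initial vectors are driven toward the same state after enough iterations, which matches the fixed-point behaviour mentioned in the text.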
The average error in the concept values in ten iterations was 0.008 when our new FCM values were compared

Table 8 The original weights of concepts

                Liquid level   Valve 1   Valve 2   Valve 3   Gravity
Liquid level    1              −0.207    −0.112    0.064     0.264
Valve 1         0.298          1         0.061     0.069     0.067
Valve 2         0.356          0.062     1         0.063     0.061
Valve 3         −0.516         0.07      0.063     1         0.068
Gravity         0.064          0.468     0.06      0.268     1


Table 9 The descriptive statistics of the possible nontransformed and transformed concept values when 10,000 initial random vectors were used

Concepts          Range   Minimum   Maximum   Mean    Std. deviation
Nontransformed
Liquid level      1.92    −.36      1.56      .5976   .34447
Valve 1           1.63    −.11      1.52      .6952   .32617
Valve 2           1.20    −.07      1.13      .5324   .29359
Valve 3           1.46    .00       1.46      .7336   .30088
Gravity           1.46    .00       1.46      .7261   .30022
Transformed
Liquid level      .42     .41       .83       .6413   .07757
Valve 1           .35     .47       .82       .6633   .07170
Valve 2           .27     .48       .76       .6275   .06770
Valve 3           .31     .50       .81       .6722   .06545
Gravity           .31     .50       .81       .6706   .06541

Fig. 13 Graph of the connection matrix in Table 8. (1 = liquid level, 2 = valve1, 3 = valve2, 4 = valve3, 5 = gravity)

to the original ones. Figure 15 depicts our simplified FCM when the self-loops are not presented. We will only focus on predicting the liquid levels in the tank. Hence, in [21, 23] these rules were given:
• If Liquid_level < 0.68, the level is low.

Fig. 14 Concept values in ten iterations in the original FCM when lambda = 1 (iteration 0 denotes the original random values). Horizontal axis: Nr. of iteration; vertical axis: Concept values; series: Liquid level, Valve1, Valve2, Valve3, Gravity

Table 10 The simplified connection matrix and the R-squares of the regression models

                Liquid level   Valve 1   Valve 2   Valve 3   Gravity
Liquid level    1              −0.166    −0.074    0         0.349
Valve 1         0.313          1         0.098     0.106     0
Valve 2         0.37           0         1         0         0
Valve 3         −0.501         0         0         1         0
Gravity         0              0.507     0         0.304     1
R-square        0.999          0.998     0.997     0.998     0.996

• If 0.68 ≤ Liquid_level ≤ 0.70, the level is appropriate (the goal level).
• If Liquid_level > 0.70, the level is high.

If we, for example, construct with SPSS a logistic regression model that provides the probabilities of obtaining a low liquid level according to our driver concept values, we should first create a new dichotomous target concept, Liquid_d:
• Liquid_d = 1, when the liquid level < 0.68,
• Liquid_d = 0, otherwise.

The statistics of this logistic regression model, whose Nagelkerke R-square value is 0.974, are presented in Table 11 (the insignificant predictor Gravity was removed). Hence, we notice that our final predictors seem significant according to the Wald tests, and the signs of the linear regression coefficients, B, indicate (quite self-evidently) that, among others,

Fig. 15 Graph of the connection matrix in Table 10. (1 = liquid level, 2 = valve 1, 3 = valve 2, 4 = valve 3, 5 = gravity)

Table 11 The estimates of the logistic regression coefficients for the new liquid level

Concepts               B          S.E.    Wald      df   Sig.
Initial liquid level   −94.964    5.195   334.192   1    .000
Valve 1                −28.492    1.585   323.082   1    .000
Valve 2                −34.163    1.892   325.998   1    .000
Valve 3                48.440     2.672   328.687   1    .000
Constant               69.063     3.780   333.862   1    .000

• If the initial liquid level increases, the risk of a low liquid level is lower (its B value is negative).
• If the initial flow restriction increases in valve 1, the risk of a low liquid level is lower (its B value is negative).
• If the initial flow restriction increases in valve 2, the risk of a low liquid level is lower (its B value is negative).
• If the initial flow restriction increases in valve 3, the risk of a low liquid level is higher (its B value is positive).

As above, the probabilities of obtaining the low liquid level are calculated with the function

Probability = 1/(1 + exp(−1 · Z)),   (3.3.1)

when

Z = −94.964 · Liquid_level − 28.492 · Valve 1 − 34.163 · Valve 2 + 48.440 · Valve 3 + 69.063

and exp is the exponential function (Fig. 16). For example, if the initial predictor values are Liquid level = 0.96, Valve 1 = .032, Valve 2 = 0.53, Valve 3 = 0.96 at time = t, the probability of obtaining low liquid level is 0.029 at time = t + 1.

Fig. 16 The probabilities of obtaining the liquid level below the median versus the transformed liquid levels

In [18] the corresponding multinomial regression model with SPSS was also constructed, in which case the Nagelkerke R-square was 0.979, and these results are in Table 12. In this case the goal level in [21, 23], 0.68 ≤ liquid level ≤ 0.70, was our reference class, and the low and high levels were below and above this level, respectively. This model actually yields two logistic regression models, and their probabilities are based on the analyses of
• low level compared to the goal level,
• high level compared to the goal level.
Hence, the interpretation of our regression coefficients, B, in Table 12 is similar to that of logistic regression, and thus we may reason on stochastic grounds, among others,


Table 12 The estimates of the multinomial logistic regression coefficients for the liquid tank model

Liquid level                      B          St. Error   Wald     df   Sig.
Goal versus low    Intercept      79.136     14.739      28.827   1    0.000
                   Liquid_level   −109.707   20.377      28.985   1    0.000
                   Valve 1        −33.785    6.341       28.392   1    0.000
                   Valve 2        −40.126    7.551       28.238   1    0.000
                   Valve 3        54.93      10.236      28.797   1    0.000
Goal versus high   Intercept      −126.517   30.898      16.766   1    0.000
                   Liquid_level   151.329    37.194      16.554   1    0.000
                   Valve 1        45.568     11.036      17.049   1    0.000
                   Valve 2        52.309     12.773      16.772   1    0.000
                   Valve 3        −79.126    19.38       16.67    1    0.000

• Goal versus low: the increase in the initial liquid level will cause a lower risk of reaching the low liquid level (B value is negative).
• Goal versus low: the increased flows in valves 1 and 2 will cause a lower risk of reaching the low liquid level from the goal level (their B values are negative).
• Goal versus low: the increased flow in valve 3 will cause a higher risk of reaching the low liquid level from the goal level (B value is positive).
• Goal versus high: the increase in the initial liquid level will cause a higher risk of reaching the high liquid level (B value is positive).
• Goal versus high: the increased flows in valves 1 and 2 will cause a higher risk of reaching the high liquid level from the goal level (their B values are positive).
• Goal versus high: the increased flow in valve 3 will cause a lower risk of reaching the high liquid level from the goal level (B value is negative).

These examples thus correspond well to the basic principles for controlling this system with its FCM. The specific probabilities of the low and high liquid levels may be calculated with the linear regression coefficients in Table 12, and then with the logistic function, 1/(1 + exp(−Z)), as above. Hence,

Z_low level = −109.707 · Liquid_level − 33.785 · Valve 1 − 40.126 · Valve 2 + 54.930 · Valve 3 + 79.136   (3.3.2)

Z_high level = 151.329 · Liquid_level + 45.568 · Valve 1 + 52.309 · Valve 2 − 79.126 · Valve 3 − 126.517   (3.3.3)

For example, given the initial concept values at time = t, Liquid_level = 0.45, Valve 1 = 0.03, Valve 2 = 0.89, Valve 3 = 0.06


Table 13 The descriptives of the initial concept values that lead to the goal values of the liquid level

Concept         Minimum   Maximum   Mean    Std. deviation
Liquid level    .19       1.00      .6748   .18255
Valve 1         .00       1.00      .5202   .28840
Valve 2         .00       1.00      .5290   .28294
Valve 3         .00       1.00      .4854   .28565
Gravity         .01       1.00      .4988   .29010

the probabilities of obtaining the low, goal and high liquid levels at time = t + 1 are 0.03, 0.97 and 0.00, respectively. If we focus on those initial vector values which will lead to the goal values of the liquid level, Table 13 presents the descriptives of these values, and Table 14 presents examples of such typical initial vectors when subtractive clustering was applied. We notice in Table 13 that the liquid level should be at least 0.19 for achieving its goal level in the subsequent FCM iterations, whereas the other concepts may vary more freely. Thanks to our stochastic approach, we may again better avoid subjective and ad hoc decisions when we interpret our connection matrix weights, simplify our original models and forecast our FCM concept values in the simulations.
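The probability calculations of this section can be sketched in a few lines. The coefficients below are copied from Tables 11 and 12 and Formulas (3.3.1) to (3.3.3); the function names are hypothetical. For the multinomial model, the sketch assumes the usual baseline-category normalisation with the goal level as the reference class, which is an assumption about how the three class probabilities are obtained and not a statement about the SPSS output.

```python
import math

# Binary model (Table 11): event "liquid level < 0.68" at time t + 1.
BINARY = {"const": 69.063, "level": -94.964, "valve1": -28.492,
          "valve2": -34.163, "valve3": 48.440}
# Multinomial model (Table 12); the goal level is the reference class.
LOW = {"const": 79.136, "level": -109.707, "valve1": -33.785,
       "valve2": -40.126, "valve3": 54.930}
HIGH = {"const": -126.517, "level": 151.329, "valve1": 45.568,
        "valve2": 52.309, "valve3": -79.126}


def linear_predictor(coef, level, valve1, valve2, valve3):
    """The value Z for one set of regression coefficients."""
    return (coef["const"] + coef["level"] * level + coef["valve1"] * valve1
            + coef["valve2"] * valve2 + coef["valve3"] * valve3)


def prob_low_binary(level, valve1, valve2, valve3):
    """Formula (3.3.1): logistic function of Z."""
    z = linear_predictor(BINARY, level, valve1, valve2, valve3)
    return 1.0 / (1.0 + math.exp(-z))


def probs_multinomial(level, valve1, valve2, valve3):
    """Class probabilities under the assumed baseline-category normalisation."""
    e_low = math.exp(linear_predictor(LOW, level, valve1, valve2, valve3))
    e_high = math.exp(linear_predictor(HIGH, level, valve1, valve2, valve3))
    denom = 1.0 + e_low + e_high
    return {"low": e_low / denom, "goal": 1.0 / denom, "high": e_high / denom}


# Initial concept values of the multinomial example in the text.
example = probs_multinomial(0.45, 0.03, 0.89, 0.06)
print({name: round(p, 3) for name, p in example.items()})
```

For the example values given in the text, this sketch gives probabilities close to the reported 0.03, 0.97 and 0.00. The binary function also reproduces the qualitative reading of Table 11: a higher initial liquid level lowers the probability of a low level.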

4 Conclusions

Fuzzy systems have proven to be applicable in various computer model constructions, and numerous scientific articles are already available about these models. The fuzzy cognitive maps were considered above, and this field of fuzziness has also been studied quite extensively already. However, many of the studies on these maps still seem to pivot on the methods of the engineering sciences and neural networks, and thus they do not necessarily reveal their model performance sufficiently. Hence, we expect certain methods which may supplement or clarify our research outcomes and even rely less on subjective or ad hoc reasoning. Our approach above considered the fuzzy cognitive maps from the standpoint of the quantitative human sciences, and in this context, we applied more wide-ranging methods. In particular, statistical methods were also applied because they play a central role in this field. Thanks to these methods, we may better avoid subjectivity and simultaneously acquire further information on our models. We only focused on the numerical fuzzy cognitive maps, and especially on their a priori models, because these models will provide the basis for their other studies. The statistical analysis of a posteriori models with the history data and their linguistic versions will be an interesting objective of future examinations.

Center 1

0.723

0.504

0.427

0.553

0.454

Concepts

Liquid level

Valve 1

Valve 2

Valve 3

Gravity

0.463

0.233

0.766

0.550

0.447

Center 2

0.791

0.756

0.706

0.183

0.789

Center 3

0.216

0.191

0.266

0.812

0.556

Center 4

0.630

0.908

0.280

0.867

0.873

Center 5

0.227

0.820

0.755

0.931

0.669

Center 6

Table 14 Examples of cluster centers of the initial concept values that lead to the goal values of liquid level

0.758

0.136

0.142

0.169

0.699

Center 7

0.101

0.487

0.787

0.060

0.765

Center 8

0.963

0.231

0.624

0.871

0.378

Center 9


The possible values of the fuzzy cognitive maps were studied by using random initial concept vectors and a random connection matrix. In this manner, we were able to estimate the performance of these maps, and due to our randomized approach, the generalization capability of our outcomes seems better. Various regression models were also applied for providing a stochastic basis for this performance and also for reducing subjectivity. Our results indicated that they corresponded well to those obtained by the prevailing cognitive map methods. Furthermore, our methods provided supplementary and objective information on these models as well as enhanced their fine-tuning. Naturally, we may also apply other statistical methods, and these are possible objectives of future studies. Our approach still awaits further justification based on its performance in concrete applications, and thus future studies are expected in this problem area.

Acknowledgements I express my thanks to the distinguished Editors for having this opportunity to be one of the contributors to this book. This article is dedicated to the memory of my mentor and friend, the great Professor Lotfi Zadeh.

References

1. R. Axelrod, Structure of Decision. The Cognitive Maps of Political Elites (Princeton University Press, Princeton, 1976)
2. H. Bandemer, W. Näther, Fuzzy Data Analysis (Kluwer, Dordrecht, 1992)
3. M. Buruzs, M. Hatwágner, L.T. Kóczy, Expert-based method of integrated waste management systems for developing fuzzy cognitive map, in Complex System Modelling and Control Through Intelligent Soft Computations. Studies in Fuzziness and Soft Computing, vol. 319, ed. by Q. Zhu, A. Azar (2015), pp. 111–137
4. J.P. Carvalho, J. Tome, Rule based fuzzy cognitive maps in socio-economic systems, in Proceedings of the IFSA Congress (Lisbon, 2009), pp. 1821–1826
5. S. Chiu, Fuzzy model identification based on cluster estimation. J. Intell. Fuzzy Syst. 2, 267–278 (1994)
6. V. Dimitrov, B. Hodge, Social Fuzziology: Study of Fuzziness of Social Complexity (Physica Verlag, Heidelberg, 2002)
7. D. Freedman, Statistical Models: Theory and Practice (Cambridge University Press, Cambridge, 2005)
8. Fuzzy Logic User’s Guide 2018a, Mathworks, 2018, www.mathworks.com/help/pdf_doc/fuzzy/fuzzy.pdf
9. M. Glykas (ed.), Fuzzy Cognitive Maps (Springer, Heidelberg, 2010)
10. P. Grzegorzewski, O. Hryniewicz, M. Gil, Soft Methods in Probability. Statistics and Data Analysis (Physica Verlag, Heidelberg, 2002)
11. M. Hatwagner, V. Niskanen, L. Koczy, Behavioral analysis of fuzzy cognitive map models by simulation, in Proceedings of the IFSA ’17 Congress, Otsu, Japan, https://ieeexplore.ieee.org/abstract/document/8023345/
12. S. Kim, C. Lee, Fuzzy implications of fuzzy cognitive map with emphasis on fuzzy causal relationship and fuzzy partially causal relationship. Fuzzy Sets Syst. 97(3), 303–313 (1998)
13. B. Kosko, Fuzzy Engineering (Prentice Hall, Upper Saddle River, New Jersey, 1997)
14. K.C. Lee, W.J. Lee, O.B. Kwon, J.H. Han, P.I. Yu, Strategic planning simulation based on fuzzy cognitive map knowledge and differential game. Simulation 71(5), 316–327 (1998)
15. R. Kruse, K. Meyer, Statistics with Vague Data (Reidel, Dordrecht, 1987)
16. J. Metsämuuronen, Essentials in Research Methods in Human Sciences, Multivariate Analysis (Sage, London, 2017)
17. J. Metsämuuronen, Essentials in Research Methods in Human Sciences, Advanced Analysis (Sage, London, 2017)
18. V.A. Niskanen, Application of logistic regression analysis to fuzzy cognitive maps, in Fuzzy Logic Theory and Applications, vol. 2, ed. by L. Zadeh, R. Aliev (World Scientific Publishing, Singapore, 2019)
19. V.A. Niskanen, Concept map approach to approximate reasoning with fuzzy extended logic, in Fuzzy Technology: Present Applications and Future Technology, Studies in Fuzziness and Soft Computing, vol. 335, ed. by M. Fedrizzi, M. Collan, J. Kacprzyk (Springer, Heidelberg, 2016), pp. 47–70
20. J. Novak, Learning, Creating, and Using Knowledge: Concept Maps as Facilitative Tools in Schools and Corporations (Lawrence Erlbaum Associates Inc, New Jersey, 1998)
21. E. Papageorgiou, E. Stylios, P. Groumpos, Fuzzy cognitive map learning based on nonlinear Hebbian rule, in AI 2003. LNCS (LNAI), vol. 2903, ed. by T. Gedeon, L. Fung (Springer, 2003), pp. 256–268
22. W. Pedrycz, A. Jastrzebska, W. Homenda, Design of fuzzy cognitive maps for modeling time series. IEEE Transactions of Fuzzy Systems 24(1), 120–130 (2016)
23. W. Stach, L. Kurgan, W. Pedrycz, Expert-based and computational methods for developing fuzzy cognitive maps, in Fuzzy Cognitive Maps, ed. by M. Glykas (Springer, 2010), pp. 24–41
24. W. Stach, L.A. Kurgan, W. Pedrycz, Numerical and linguistic prediction of time series with the use of fuzzy cognitive maps. IEEE Trans. Fuzzy Syst. 16 (2008)
25. W. Stach, L. Kurgan, W. Pedrycz, M. Reformat, Genetic learning of fuzzy cognitive maps. Fuzzy Sets Syst. 153, 371–401 (2005)
26. W. Stach, L. Kurgan, W. Pedrycz, A survey of fuzzy cognitive map learning methods. Issues Soft Comput. Theory Appl., pp. 71–84 (2005)
27. C. Stylios, P. Groumpos, Modeling complex systems using fuzzy cognitive maps. IEEE Trans. Syst. Man Cybern. Part A 34(1), 155–162 (2004)
28. F. Wenstøp, Quantitative analysis with linguistic values. Fuzzy Sets Syst. 4, 99–115 (1980)
29. L. Zadeh, Fuzzy logic = computing with words. IEEE Trans. Fuzzy Syst. 2, 103–111 (1996)
30. L. Zadeh, Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic. Fuzzy Sets Syst. 90(2), 111–127 (1997)
31. L. Zadeh, From computing with numbers to computing with words: from manipulation of measurements to manipulation of perceptions. IEEE Trans. Circ. Syst. 45, 105–119 (1999)
32. L. Zadeh, Toward a perception-based theory of probabilistic reasoning with imprecise probabilities. J. Stat. Plann. Infer. 105(2), 233–264 (2002)

Probabilistic and More General Uncertainty-Based (e.g., Fuzzy) Approaches to Crisp Clustering Explain the Empirical Success of the K-Sets Algorithm

Vladik Kreinovich, Olga Kosheleva, Shahnaz N. Shahbazova and Songsak Sriboonchitta

Abstract Recently, a new empirically successful algorithm was proposed for crisp clustering: the K-sets algorithm. In this paper, we show that a natural uncertainty-based formalization of what clustering is automatically leads to the mathematical ideas and definitions behind this algorithm. Thus, we provide an explanation for this algorithm’s empirical success.

Keywords Clustering · K-sets algorithm · Probabilistic uncertainty · Fuzzy uncertainty

V. Kreinovich (B) · O. Kosheleva
University of Texas at El Paso, 500 W. University, El Paso, TX 79968, USA

S. N. Shahbazova
Azerbaijan Technical University, Baku, Azerbaijan

S. Sriboonchitta
Faculty of Economics, Chiang Mai University, Chiang Mai, Thailand

© Springer Nature Switzerland AG 2020
S. N. Shahbazova et al. (eds.), Recent Developments in Fuzzy Logic and Fuzzy Sets, Studies in Fuzziness and Soft Computing 391, https://doi.org/10.1007/978-3-030-38893-5_4

1 Clustering by Similarity: Formulation of the Practical Problem

Clustering is important. In many practical situations, there are so many different objects that it is not possible to come up with an individual approach to each of these objects. In such situations, a reasonable idea is to divide all the objects into a small number of groups (clusters), so that we will be able to develop a reasonable strategy of dealing with each group. For example, it is not (yet) practically possible to come up with completely individualized medicine, so a natural idea is:
• to divide all the patients into groups corresponding to different diseases, and then
• to develop treatments for each of these diseases.

Scientific clustering: by origin. In science, it is often desirable to bring together related objects, i.e., objects which are related to each other by a small change. For example, in astronomy, there are different sequences of stellar evolution, so we group together all the stars corresponding to the same sequence, even when they are at completely different parts of this sequence. Similarly, in biology, it is sometimes reasonable to group together animals at different stages of their development. For such situations, there are many well-defined clustering methods. In these methods, we usually mark sufficiently close objects as belonging to the same cluster, and then apply the transitive closure, i.e., consider objects a and b as belonging to the same cluster if there is a sequence of objects $a = a_0, a_1, \ldots, a_{n-1}, a_n = b$ in which each object $a_i$ is sufficiently close to the next one $a_{i+1}$.

In the scientific classification, objects from the same cluster are not necessarily similar to each other. In this approach, belonging to the same cluster does not necessarily mean similarity. For example, in this approach, each butterfly gets grouped together with its caterpillar, but, of course, there is much more similarity between different butterflies than between the butterfly and its caterpillar.

Practical clustering: by similarity. In most practical situations, it is desirable to classify objects by their similarity. For example, for medical purposes, it makes much more sense to group together patients who have flu than to group them by diseases they had in childhood. Such clustering-by-similarity is the main topic of this paper.

Formulation of the problem. One of the main problems with clustering is that it is not clear what exactly we want. It is often difficult to come up with a reasonable objective measure of which clustering is better. In some cases, we know the desired result: e.g., we have a DNA-based classification of plants, and we want to come up with clusters which are the closest to this classification. However, in general, this is not clear.

K-sets algorithm: a recent approach. Recently, a new approach to clustering has been proposed [2]. This approach leads to an empirically successful algorithm that the authors call K-sets.

Remaining problem and what we do in this paper. The K-sets approach optimizes some seemingly reasonable criterion, but it is not clear why specifically this criterion has been chosen: there are, in principle, many possible similar criteria, and it is not clear why this particular one has been empirically successful.


In this paper, we show that a natural fuzzy-based formalization of clustering can explain this criterion – and can, thus, explain the empirical success of the K-sets algorithm. Comment. At this stage, we are only considering crisp clustering, when each object is assigned to a single cluster. In practice, it is often useful to have a probabilistic or fuzzy clustering, in which each object can be assigned to two or more different clusters, with different degrees of belonging. For example, in medical analysis, based on some patients’ symptoms, these patients may be in between flu and cold; in such cases, it is reasonable to consider such patients as somewhat belonging to two clusters at the same time. It is desirable to extend our analysis to such clustering, but as of now, we only know how to apply this analysis to crisp clustering.
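The clustering by transitive closure described earlier in this section (under "Scientific clustering: by origin") can be sketched with a union-find structure. This is an illustrative sketch only: the function name `transitive_closure_clusters`, the closeness predicate, and the sample points are assumptions and do not come from the chapter. Objects are assumed to be hashable and distinct.

```python
def transitive_closure_clusters(objects, is_close):
    """Group objects so that a and b share a cluster whenever a chain of
    pairwise close objects connects them."""
    items = list(objects)
    parent = {obj: obj for obj in items}

    def find(x):
        # Follow parent links to the representative, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the groups of every pair of sufficiently close objects.
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if is_close(a, b):
                parent[find(a)] = find(b)

    clusters = {}
    for obj in items:
        clusters.setdefault(find(obj), []).append(obj)
    return sorted(clusters.values())


# Hypothetical one-dimensional example: points are "close" if they differ by at most 0.5.
points = [0.0, 0.4, 0.8, 1.2, 5.0, 5.3]
print(transitive_closure_clusters(points, lambda a, b: abs(a - b) <= 0.5))
```

In this example the points 0.0 and 1.2 end up in the same cluster although they are not close to each other, which is exactly the butterfly and caterpillar effect described above.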

2 The Main Idea Behind Clustering-by-Similarity

The main idea of clustering by similarity is that once we divide the objects into clusters, then:
• objects within each cluster are similar to each other, while
• objects assigned to different clusters are not similar to each other.

3 Probabilistic Approach: Towards the Precise Formulation of the Problem

Probabilistic case: a brief description. In this section, we assume that for every two objects a and b, we know the probability p(a, b) that these objects are similar. This may be a subjective probability evaluated by an expert, or this may be a probability estimated based on some objective criterion. In this case, the probability that a and b are not similar is equal to 1 − p(a, b).

From similarity between objects to quality of clustering. It is reasonable to assume that for different pairs, similarities are independent events. This assumption makes sense if we have no information about the relation between these events. In this case, we do not know the full probability distribution; we have many different probability distributions consistent with the given information. In such situations, a reasonable idea is to use the Maximum Entropy Approach (see, e.g., [3]) to select the most reasonable probability distribution. For situations in which we know the probabilities of individual events, but we do not know the correlation between these events, the Maximum Entropy approach selects a distribution in which these events are independent [3]. Under this independence assumption, for each clustering, i.e., for each subdivision of the set of all objects into K disjoint sets $S_1, \ldots, S_K$, the probability $P(S_1, \ldots, S_K)$ that this clustering is consistent with the available information about similarity is equal to the product of the corresponding probabilities:
$$P(S_1, \ldots, S_K) = \prod_{a \sim b} p(a, b) \cdot \prod_{a \not\sim b} (1 - p(a, b)),$$
where a ∼ b means that the objects a and b belong to the same cluster: $a \sim b \Leftrightarrow \exists k\, (a \in S_k \,\&\, b \in S_k)$.

How do we select the best clustering. In the probabilistic case, a natural idea is to use the Maximum Likelihood approach (see, e.g., [7]), and select the clustering $S_1, \ldots, S_K$ for which the probability $P(S_1, \ldots, S_K)$ is the largest possible.
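For a handful of objects, the Maximum Likelihood selection described above can be carried out by brute force. The following sketch is only an illustration: the similarity matrix, the function names, and the exhaustive search are assumptions made here, and the search is feasible only for very small examples. A clustering is encoded by a tuple of labels, one per object, and clusters are allowed to be empty, so the search covers subdivisions into at most K sets.

```python
import itertools


def likelihood(labels, p):
    """P(S_1, ..., S_K) for the clustering in which object i belongs to cluster labels[i]."""
    result = 1.0
    for a, b in itertools.combinations(range(len(labels)), 2):
        result *= p[a][b] if labels[a] == labels[b] else 1.0 - p[a][b]
    return result


def best_clustering(p, k):
    """Exhaustive search over all assignments of the objects to k labels."""
    n = len(p)
    return max(itertools.product(range(k), repeat=n),
               key=lambda labels: likelihood(labels, p))


# Hypothetical symmetric similarity probabilities for four objects:
# objects 0 and 1 are likely similar, and so are objects 2 and 3.
P = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
print(best_clustering(P, 2))
```

Relabeling the clusters does not change the likelihood, so the search returns the first of the equivalent label tuples it encounters.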

4 Probabilistic Approach: Precise Formulation of the Problem, Resulting Clustering Algorithm, and Their Relation to K-Sets Algorithm and Its Foundations

Precise formulation of the problem. For every two objects a and b, we know the probability p(a, b) that these two objects are similar. We then need to find a clustering, i.e., a subdivision of the set of all objects into disjoint subsets $S_1, \ldots, S_K$ for which the product
$$P(S_1, \ldots, S_K) = \prod_{a \sim b} p(a, b) \cdot \prod_{a \not\sim b} (1 - p(a, b))$$
attains the largest possible value, where a ∼ b means that $\exists k\, (a \in S_k \,\&\, b \in S_k)$.

Let us simplify this optimization problem. The above formulation is mathematically precise, but from the computational viewpoint, it is not perfect. First of all, to compute the above value, we need to consider all possible pairs (a, b). Often, there are many possible objects, so the number of pairs is huge. So, it is desirable to cut down on the number of pairs for which we perform the computations when gauging the quality of different clusterings. This can be done, e.g., if we take into account that for every constant C > 0, maximizing the objective function F(x) is equivalent to maximizing the function F(x)/C. In our case, we can take
$$C = \prod_{a,b} (1 - p(a, b)),$$
where the product is taken over all possible pairs. If we divide the above objective function by this constant C, we get an equivalent objective function in which we only need to consider pairs belonging to the same cluster:
$$\prod_{a \sim b} \frac{p(a, b)}{1 - p(a, b)}.$$


This product can be, in its turn, subdivided into products corresponding to each individual cluster $S_k$:
$$\prod_{k=1}^{K} \prod_{a,b \in S_k} \frac{p(a, b)}{1 - p(a, b)}.$$

In practice, we may have hundreds and thousands of objects, and multiplying hundreds of numbers smaller than 1 quickly leads to values close to 0, below the machine zero. This problem can be avoided if we take into account that the logarithm is a strictly increasing function, and the logarithm of the product is equal to the sum of the logarithms. Thus, maximizing the above objective function is equivalent to maximizing the sum
$$\sum_{k=1}^{K} \sum_{a,b \in S_k} \gamma(a, b),$$
where we denoted
$$\gamma(a, b) \stackrel{\text{def}}{=} \ln\left(\frac{p(a, b)}{1 - p(a, b)}\right) = \ln(p(a, b)) - \ln(1 - p(a, b)).$$

Thus, we arrive at the following equivalent formulation:
• we are given a function γ(a, b), and
• we need to find a clustering for which the above sum takes the largest possible value.

Relation to the K-sets approach. In the above formula, we count each unordered pair exactly once. Instead, we can consider all possible ordered pairs, meaning that we will count each unordered pair twice. This will simply increase the value of the above objective function by a factor of 2, without changing which clustering is better and which is worse. With this re-arrangement, the above formula takes the form

$$\sum_{k=1}^{K} \sum_{a \in S_k} \sum_{b \in S_k} \gamma(a, b),$$

i.e., the form

$$\sum_{k=1}^{K} \gamma(S_k, S_k),$$

where we denoted

$$\gamma(A, B) = \sum_{a \in A} \sum_{b \in B} \gamma(a, b).$$

This is exactly the modularity index Q which is used in [2] to find the best clustering; now we have an explanation for this formula.


V. Kreinovich et al.

Towards a natural algorithm for finding the best clustering. How can we find the best clustering? When the number of clusters is known, a natural idea is to optimize iteratively. Namely, once we have some subdivision into clusters S1 , . . . , SK , we then, for each element a, decide to which of these clusters this element should belong so as to maximize the value of the objective function. If the element a is assigned to the cluster $S_k$, then it contributes the sum $\sum_{b \in S_k} \gamma(a, b)$ to the objective function. Thus, we assign each element to the cluster k for which this sum is the largest. In other words, we arrive at the following algorithm.

Resulting iterative algorithm. We start with some clustering S1 , . . . , SK . Then, for each element a and for each cluster k, we compute the sum $\sum_{b \in S_k} \gamma(a, b)$, and we assign the element a to the cluster k for which this sum is the largest. Thus, we get a new clustering, so we repeat this procedure until the process converges.

Comment. Instead of computing the sum over all the elements b ∈ Sk , we can use the Monte-Carlo techniques and estimate this sum by selecting a few random points from the set Sk .

Relation to K-sets algorithm. What we described is exactly the iterative procedure in the K-sets algorithm. Thus, we have a natural explanation for this empirically successful algorithm.

Comment. So far, we have only considered the case when the number of clusters K is fixed. However, it is also possible to use the same criterion to select the optimal value K . For this purpose, we can check whether the value of the objective function increases if we merge two clusters or, vice versa, split one of the clusters into two; see [2] for more details.
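The iterative procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation from [2]: elements are reassigned one at a time (the text does not say whether updates are sequential or simultaneous), self-pairs a = b are skipped, the matrix `g` of values γ(a, b) is assumed symmetric, and all names and the six-object data set are hypothetical.

```python
import math

def gamma(p):
    """Log-odds ln(p / (1 - p)) of a similarity probability."""
    return math.log(p) - math.log(1.0 - p)

def objective(labels, g):
    """Sum of g[a][b] over ordered pairs a != b lying in the same cluster."""
    n = len(labels)
    return sum(g[a][b] for a in range(n) for b in range(n)
               if a != b and labels[a] == labels[b])

def iterate_clustering(g, labels, num_clusters, max_sweeps=100):
    """Reassign elements one at a time until no assignment changes."""
    labels = list(labels)
    n = len(labels)
    for _ in range(max_sweeps):
        changed = False
        for a in range(n):
            # sums[k] = sum of gamma(a, b) over the other elements b of cluster k
            sums = [0.0] * num_clusters
            for b in range(n):
                if b != a:
                    sums[labels[b]] += g[a][b]
            best = max(range(num_clusters), key=lambda k: sums[k])
            if sums[best] > sums[labels[a]]:
                labels[a] = best
                changed = True
        if not changed:
            break
    return labels

# Hypothetical data: objects 0-2 are mutually similar, and so are objects 3-5.
group = [0, 0, 0, 1, 1, 1]
g = [[gamma(0.9 if group[a] == group[b] else 0.1) for b in range(6)]
     for a in range(6)]

start = [0, 1, 0, 1, 0, 1]
result = iterate_clustering(g, start, num_clusters=2)
print(result, objective(start, g), objective(result, g))
```

With sequential updates and a symmetric γ, each reassignment strictly increases the objective, so the loop stops after finitely many sweeps; the fixed point reached may still depend on the starting clustering.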

5 Towards a More General Uncertainty-Based Approach

Need to go beyond probabilities. In the above text, we assumed that for every two objects a and b, we know the probability p(a, b) that these objects are similar. In particular, this can be subjective probability, describing the expert’s degree of confidence that the two objects are similar. In practice, however, the experts often cannot express their degrees of confidence in probabilistic terms. What we can do in such situations is, e.g., ask an expert to mark, on a scale from 0 to 10, his or her degree of confidence that the objects a and b are similar. If the expert marks 7, we can then assign the degree s(a, b) = 7/10. In general, if an expert selects m on a scale from 0 to n, we select s(a, b) = m/n. There can be other ways to gauge the expert’s degree of confidence that the objects a and b are similar. In all these cases, it is convenient to scale the corresponding degrees s(a, b) to the interval [0, 1], so that:
• the value s(a, b) = 1 means absolutely similar,
• the value s(a, b) = 0 means not similar at all, and
• intermediate values s(a, b) ∈ (0, 1) correspond to some similarity.

How to describe degree of non-similarity in such a general case. If the two objects are similar with degree s(a, b) < 1, then, since perfect similarity corresponds to degree 1, it is reasonable to gauge the degree of non-similarity by the remaining value 1 − s(a, b).

From similarity between objects to quality of clustering: need for “and”-operations. Based on the degrees of similarity between different pairs of objects, we would like to estimate how well a given clustering corresponds to the desired quality, i.e., to what extent objects within each cluster are similar to each other, and objects from different clusters are not similar to each other. In other words, we are interested, for a given clustering S1 = {a, b, . . .}, S2 = {c, d, . . .}, in a degree to which a and b are similar and c and d are similar, and a and c are not similar, etc. An ideal solution would be to do what we did when we looked for degrees s(a, b) of similarity between objects: ask experts. However, even for a reasonably small class of objects (e.g., 30) and for just two possible clusters (K = 2), there are $2^{30} \approx 10^9$ possible subdivisions into two clusters, and there is no way to ask a billion questions to an expert. Since we cannot ask the expert to gauge his/her degree of confidence in every possible “and”-combination of individual statements, we have to estimate this degree of confidence based on the degrees of confidence of individual statements. This is one of the main ideas behind fuzzy logic (see, e.g., [4, 6, 8]): we need an estimating algorithm $f_\&(a, b)$ that, given the expert’s degrees of confidence a and b in two statements A and B, provides an estimate $f_\&(a, b)$ for the expert’s degree of confidence in the “and”-combination A & B.
By using an appropriate “and”-operation, we can then estimate the degree to which a given clustering is good as $f_\&(s(a, b), s(c, d), \ldots, 1 - s(a, c), \ldots)$, where:
• for objects a and b from the same cluster, we combine the degrees of similarity s(a, b), while
• for objects a and c from different clusters, we combine the degrees of non-similarity 1 − s(a, c).

Which “and”-operations should we use in computing this degree? To answer this question, let us consider natural requirements on possible “and”-operations.

Natural properties of “and”-operations. What are the natural properties of an “and”-operation $f_\&(a, b)$? First of all, A & B means the same as B & A, so we should expect that applying our estimating function $f_\&$ to these two different expressions would lead to the same result: $f_\&(a, b) = f_\&(b, a)$. Thus, the “and”-operation $f_\&(a, b)$ must be commutative. Similarly, since A & (B & C) and (A & B) & C mean the same, we expect that $f_\&(a, f_\&(b, c)) = f_\&(f_\&(a, b), c)$ for all a, b, and c. In other words, the “and”-operation should be associative. If we increase our degree of confidence in A, this should increase our degree of confidence in A & B, so the “and”-operation should be increasing in both variables. In particular, small changes in degree of confidence in A should lead to small changes in degree of confidence in A & B, so the “and”-operation must be continuous. Finally, if A is absolutely true (i.e., if our degree of confidence in the statement A is equal to 1), then the only reason why we may be not fully confident in A & B is because we are not fully confident in B. In other words, in this case, our degree of confidence in A & B should be equal to the degree of confidence in B: $f_\&(1, b) = b$.

There is a known classification of all “and”-operations that satisfy all these properties. “And”-operations satisfying all these properties are known as t-norms. There exists a full classification of all possible t-norms. In particular, it is known [5] that, for any given accuracy ε > 0, any t-norm can be ε-approximated by an Archimedean t-norm, i.e., by a t-norm of the type $f_\&(a, b) = g^{-1}(g(a) \cdot g(b))$ for some strictly increasing continuous function g(a); here, as usual, $g^{-1}(x)$ denotes the inverse function. Thus, without losing generality, we can safely assume that our “and”-operation is Archimedean.

Which Archimedean t-norm should we choose? Different functions g(x) describe, in general, different “and”-operations. Which one should we choose for our comparison of different clusterings? To have the most adequate comparison, out of all possible “and”-operations, we must select the one which best describes the reasoning of the corresponding experts. For this purpose, we need to compare the expert’s degrees of confidence in different “and”-combinations A & B with the estimates $f_\&(a, b)$ predicted by different t-norms, and select the t-norm which is the best fit for a given expert. Such t-norm-fitting has a long history: it was first done by researchers who designed the world’s first successful expert system MYCIN; see, e.g., [1].

The resulting formula for the quality of a given clustering. For an Archimedean “and”-operation, the above formula for the degree of quality of clustering takes the following form:

$$g^{-1}\left(\prod_{a \sim b} g(s(a, b)) \cdot \prod_{a \not\sim b} g(1 - s(a, b))\right),$$

where a ∼ b means that the objects a and b belong to the same cluster.

Let us simplify this formula. Since the function g(x) is strictly increasing and continuous, its inverse $g^{-1}(x)$ is also strictly increasing, i.e., $g^{-1}(x) < g^{-1}(y)$ if and only if x < y.

Our goal is to compare different clusterings. Thus, instead of the above degrees x, we can consider the values g(x): for every two degrees x and y, the comparison between x and y will lead to the same result as the comparison between g(x) and g(y). The advantage of using g(x) is that, since the original degrees x have the form $x = g^{-1}(E)$ for some expression E, we thus have $g(x) = g(g^{-1}(E)) = E$, i.e., a simpler expression than the original degree $g^{-1}(E)$. Thus, instead of comparing the original degrees, we can now compare simpler expressions

$$\prod_{a \sim b} g(s(a, b)) \cdot \prod_{a \not\sim b} g(1 - s(a, b)).$$

So, we arrive at the following precise formulation of the problem.

Case of general uncertainty: precise formulation of the problem. For every two objects a and b, we know the degree of similarity s(a, b) between these objects. We need to also determine, by comparing the expert’s opinions on individual statements and on their “and”-combinations, the t-norm (and thus, the corresponding function g(x)) that most adequately describes the expert’s reasoning. Then, we need to find a clustering, i.e., a subdivision of the set of all objects into disjoint subsets $S_1, \ldots, S_K$ for which the product

$$\prod_{a \sim b} g(s(a, b)) \cdot \prod_{a \not\sim b} g(1 - s(a, b))$$

attains the largest possible value, where a ∼ b means that ∃k (a ∈ Sk & b ∈ Sk ).

Let us simplify this expression. Similarly to the probabilistic case, we can simplify this optimization problem if we take into account that for every constant C > 0, maximizing the objective function F(x) is equivalent to maximizing the objective function F(x)/C. In our case, we can take $C = \prod_{a,b} g(1 - s(a, b))$, where the product is taken over all possible pairs. If we divide the above objective function by this constant C, we get an equivalent objective function in which we only need to consider pairs belonging to the same cluster:

$$\prod_{a \sim b} \frac{g(s(a, b))}{g(1 - s(a, b))}.$$

This product can, in turn, be subdivided into products corresponding to each individual cluster $S_k$:

$$\prod_{k=1}^{K} \prod_{a, b \in S_k} \frac{g(s(a, b))}{g(1 - s(a, b))}.$$

Again, similarly to the probabilistic case, it is computationally beneficial to optimize the logarithm of this expression, i.e., a function

$$\sum_{k=1}^{K} \sum_{a, b \in S_k} \gamma(a, b),$$

where we denoted

$$\gamma(a, b) \stackrel{\text{def}}{=} \ln\left(\frac{g(s(a, b))}{g(1 - s(a, b))}\right) = \ln(g(s(a, b))) - \ln(g(1 - s(a, b))).$$

Thus, we arrive at the following equivalent formulation:
• we are given a function γ(a, b), and
• we need to find a clustering for which the above sum takes the largest possible value.

Algorithms and their relation to the K-sets approach. Similarly to the probabilistic case, if we consider all possible ordered pairs, meaning that we count each unordered pair twice, then we get the equivalent objective function

$$\sum_{k=1}^{K} \sum_{a \in S_k} \sum_{b \in S_k} \gamma(a, b),$$

i.e., the function

$$\sum_{k=1}^{K} \gamma(S_k, S_k),$$

where we denoted

$$\gamma(A, B) = \sum_{a \in A} \sum_{b \in B} \gamma(a, b).$$

This is exactly the modularity index Q which is used in [2] to find the best clustering; now we have a general uncertainty-based explanation for this formula. Thus, this approach, and the corresponding iterative clustering algorithm, have been justified not only for the probabilistic case, but also for the case of general (not necessarily probabilistic) uncertainty.
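As a small illustration of how the choice of g(x) enters the computations, the following Python sketch builds an Archimedean “and”-operation and the corresponding function γ from a given g. The two generators used here are illustrative choices of ours, not taken from the text: g(x) = x, which gives the product t-norm and the same log-odds as in the probabilistic case, and g(x) = exp(1 − 1/x), which gives the Hamacher product ab/(a + b − ab). All function names are hypothetical.

```python
import math

def archimedean_and(g, g_inv):
    """Build the "and"-operation f(a, b) = g_inv(g(a) * g(b))."""
    def f_and(a, b):
        return g_inv(g(a) * g(b))
    return f_and

def make_gamma(g):
    """Build gamma(s) = ln g(s) - ln g(1 - s) for a degree s in (0, 1)."""
    def gamma(s):
        return math.log(g(s)) - math.log(g(1.0 - s))
    return gamma

# Choice 1: g(x) = x (product t-norm); gamma is then the usual log-odds.
def identity(x):
    return x

product_and = archimedean_and(identity, identity)
gamma_product = make_gamma(identity)

# Choice 2: g(x) = exp(1 - 1/x) on (0, 1] (Hamacher product).
def g_hamacher(x):
    return math.exp(1.0 - 1.0 / x)

def g_hamacher_inv(y):
    return 1.0 / (1.0 - math.log(y))

hamacher_and = archimedean_and(g_hamacher, g_hamacher_inv)
gamma_hamacher = make_gamma(g_hamacher)

s = 0.7  # hypothetical expert degree of similarity, e.g. 7 on a 0-10 scale
print(product_and(0.5, 0.4), hamacher_and(0.5, 0.5))
print(gamma_product(s), gamma_hamacher(s))
```

For both choices γ changes sign when s is replaced by 1 − s, so pairs with s(a, b) > 1/2 pull two objects into the same cluster and pairs with s(a, b) < 1/2 push them apart; only the strength of the pull depends on g.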

6 What if We Have Disproportionate Clusters?

Formulation of the problem. The above descriptions work well if all the clusters are of the same order of magnitude. However, sometimes, one (or more) of the clusters is much smaller than the others. In this case, in the above formulation, the probabilities (or, more generally, degrees of confidence) corresponding to a small cluster will “drown” in the preponderance of degrees corresponding to larger clusters. What shall we do in this case?

How to deal with this problem: an informal idea. A natural idea is that, instead of considering similarity between all possible pairs, we consider “average” similarity between clusters:
• on average, two objects picked from the same cluster should be similar, while
• on average, two objects picked from two different clusters should not be similar.

Example. For example, if we classify dogs to one cluster and cats to a different cluster, then the above idea means that:
• two randomly selected dogs should be similar to each other,
• two randomly selected cats should be similar to each other, but
• a randomly selected dog should not be similar to a randomly selected cat.

How can we describe this idea in precise terms?

7 Disproportionate Clusters: Probabilistic Case

Formulation of the problem. For every two objects a and b from the same cluster Sk , we have the probability p(a, b) that these objects are similar. How can we describe the “average” probability that the objects from this cluster are similar to each other?

First idea and its formalization. The probabilities of similarities between objects from the same cluster enter the objective function as a product term $\prod_{a, b \in S_k} p(a, b)$. What does it mean to replace all these probabilities by an “average” one? A natural idea is to come up with a single probability $p_k$ so that when we use $p_k$ instead of all the different probabilities p(a, b), we get the exact same result. In other words, we should have $\prod_{a, b \in S_k} p(a, b) = \prod_{a, b \in S_k} p_k$, i.e., equivalently,

$$p_k^{P_k} = \prod_{a, b \in S_k} p(a, b),$$

where $P_k$ is the number of pairs (a, b). If we count each pair twice and denote the number of elements in the k-th cluster by $N_k$, then we get $N_k^2$ pairs, and thus,

$$p_k = \left(\prod_{a \in S_k} \prod_{b \in S_k} p(a, b)\right)^{1/N_k^2}.$$
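The “average” probability defined above is a geometric mean, which is conveniently computed through logarithms. The following Python sketch is a minimal illustration; it assumes that the product runs over all $N_k^2$ ordered pairs, including a = b, and that p(a, a) = 1, which the text does not state explicitly. The function name and the two-object data are hypothetical.

```python
import math

def average_similarity(p, cluster):
    """Geometric mean of p[a][b] over all ordered pairs (a, b) of the cluster.

    Assumes p[a][a] = 1 for every a, so that self-pairs contribute ln 1 = 0.
    """
    n = len(cluster)
    log_sum = sum(math.log(p[a][b]) for a in cluster for b in cluster)
    return math.exp(log_sum / (n * n))

# Hypothetical cluster of two objects with p(0, 1) = p(1, 0) = 0.81.
p = [[1.0, 0.81],
     [0.81, 1.0]]
p_k = average_similarity(p, [0, 1])
print(p_k)  # (0.81 * 0.81) ** (1 / 4), i.e. approximately 0.9
```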

Similarly, for every two clusters $S_k$ and $S_\ell$, the terms corresponding to pairs $a \in S_k$ and $b \in S_\ell$ form the product $\prod_{a \in S_k} \prod_{b \in S_\ell} (1 - p(a, b))$. So, it is reasonable to pick an “average” probability $p_{k\ell}$ of non-similarity as the value for which

$$\prod_{a \in S_k} \prod_{b \in S_\ell} (1 - p(a, b)) = \prod_{a \in S_k} \prod_{b \in S_\ell} p_{k\ell}.$$

This formula leads to

$$p_{k\ell} = \left(\prod_{a \in S_k} \prod_{b \in S_\ell} (1 - p(a, b))\right)^{1/(N_k \cdot N_\ell)}.$$

In these terms, we want to optimize the overall probability that:
• within each cluster, the objects are, on average, similar, and
• for every two different clusters, the objects are non-similar.

Thus, we want to find the clustering that maximizes the following probability

$$\prod_{k=1}^{K} p_k \cdot \prod_{k < \ell} p_{k\ell} = \prod_{k=1}^{K} \left(\prod_{a \in S_k} \prod_{b \in S_k} p(a, b)\right)^{1/N_k^2} \cdot \ldots$$

… δ > 0 such that $D_Z(f(t), f(t_0)) < \varepsilon$, for all t ∈ I = [a, b] with |t − t0 | < δ. Also, we denote $C([a, b], R_Z) = \{f : I \to R_Z \mid f \text{ is continuous}\}$.

Lemma 16 For every Z-process $f : I \to R_Z$, where $f(t) = (f_A(t), f_B(t))$ and $f_A, f_B : I \to R_F$, f is continuous (i.e., $f \in C([a, b], R_Z)$) if and only if $f_A$ and $f_B$ are continuous (i.e., $f_A, f_B \in C([a, b], R_F)$) [12].

Definition 17 (gH-differentiability of a Z-process) [11] Let t0 ∈ (a, b) and h be such that t0 + h ∈ (a, b). Then the gH-derivative of a mapping $f : (a, b) \to R_Z$ at t0 is defined as

$$f'_{gH}(t_0) = \lim_{h \to 0} \frac{f(t_0 + h) \ominus_{gH} f(t_0)}{h}. \tag{2.45}$$

Since $f(t_0 + h), f(t_0) \in R_Z$, we have $f(t_0 + h) = (f_A(t_0 + h), f_B(t_0 + h))$ and $f(t_0) = (f_A(t_0), f_B(t_0))$; hence, Eq. (2.45) converts to the following equation:

$$\left(f'_{A,gH}(t_0), f'_{B,gH}(t_0)\right) = \lim_{h \to 0} \frac{(f_A(t_0 + h), f_B(t_0 + h)) \ominus_{gH} (f_A(t_0), f_B(t_0))}{h}. \tag{2.46}$$

Therefore, we can say that f is gH-differentiable at t0 iff

On the Z-Numbers


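$$\begin{cases} f'_{A,gH}(t_0) = \lim\limits_{h \to 0} \dfrac{f_A(t_0 + h) \ominus_{gH} f_A(t_0)}{h}, \\[2ex] f'_{B,gH}(t_0) = \lim\limits_{h \to 0} \dfrac{f_B(t_0 + h) \ominus_{gH} f_B(t_0)}{h}. \end{cases} \tag{2.47}$$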

In other words, if $f'_{gH}(t_0) \in R_Z$ satisfying (2.45) exists, we say that f is gH-differentiable at t0 .

Lemma 18 For every Z-process $f : I \to R_Z$, where $f(t) = (f_A(t), f_B(t))$ and $f_A, f_B : I \to R_F$, f is gH-differentiable if and only if $f_A$ and $f_B$ are gH-differentiable [12].

As we have seen in the definitions of the gH-difference and the g-difference, both of these differences are based on the gH-difference for each α-cut of the involved Z-numbers; this level characterization is obviously inherited by the gH-derivative, with respect to the level-wise gH-derivative.

Definition 19 (Level-wise gH-differentiability of a Z-process) [11] Let t0 ∈ (a, b) and h be such that t0 + h ∈ (a, b). Then the level-wise gH-derivative (shortly, LgH-derivative) of a mapping $f : (a, b) \to R_Z$ at t0 is defined as the set of interval-valued gH-derivatives, if they exist,

$$\left[f'_{LgH}(t_0)\right]_\alpha = \lim_{h \to 0} \frac{1}{h} \left([f(t_0 + h)]_\alpha \ominus_{gH} [f(t_0)]_\alpha\right). \tag{2.48}$$

Similarly to the last definition, we can write

$$\left(\left[f'_{A,LgH}(t_0)\right]_\alpha, \left[f'_{B,LgH}(t_0)\right]_\alpha\right) = \lim_{h \to 0} \frac{1}{h} \left(([f_A(t_0 + h)]_\alpha, [f_B(t_0 + h)]_\alpha) \ominus_{gH} ([f_A(t_0)]_\alpha, [f_B(t_0)]_\alpha)\right). \tag{2.49}$$

Hence, we can say that f is LgH-differentiable at t0 iff

$$\begin{cases} \left[f'_{A,LgH}(t_0)\right]_\alpha = \lim\limits_{h \to 0} \dfrac{1}{h} \left([f_A(t_0 + h)]_\alpha \ominus_{gH} [f_A(t_0)]_\alpha\right), \\[2ex] \left[f'_{B,LgH}(t_0)\right]_\alpha = \lim\limits_{h \to 0} \dfrac{1}{h} \left([f_B(t_0 + h)]_\alpha \ominus_{gH} [f_B(t_0)]_\alpha\right). \end{cases} \tag{2.50}$$

So, if $\left[f'_{LgH}(t_0)\right]_\alpha$ contains two compact intervals (i.e., $\left[f'_{A,LgH}(t_0)\right]_\alpha$ and $\left[f'_{B,LgH}(t_0)\right]_\alpha$ are compact intervals) for all α ∈ [0, 1], then we say that f is level-wise gH-differentiable at t0 , and the family $\left\{\left[f'_{LgH}(t_0)\right]_\alpha : \alpha \in [0, 1]\right\}$ is the LgH-derivative of f at t0 , denoted by $f'_{LgH}(t_0)$.

Please note that, generally, the differentiability of the second component of a Z-process is not necessary. If the fuzzy number has a probability distribution function, then its derivative has a probability density function. But if we suppose that the second component is approximated by a fuzzy number, it would be better to have the derivative of the second part.
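For a single α-level, the difference quotient in (2.48) acts on ordinary intervals, which makes it easy to experiment with numerically. The Python sketch below is only an illustration under stated assumptions: it uses the usual closed form of the interval gH-difference, [min(x⁻ − y⁻, x⁺ − y⁺), max(x⁻ − y⁻, x⁺ − y⁺)], which is not restated in this section, and the interval-valued functions F1 and F2 as well as all function names are hypothetical.

```python
def gh_diff(x, y):
    """gH-difference of intervals x = (x_lo, x_hi) and y = (y_lo, y_hi)."""
    d_lo = x[0] - y[0]
    d_hi = x[1] - y[1]
    return (min(d_lo, d_hi), max(d_lo, d_hi))

def scale(c, x):
    """Product of a real number c and an interval x (endpoints swap if c < 0)."""
    a, b = c * x[0], c * x[1]
    return (min(a, b), max(a, b))

def gh_quotient(F, t0, h):
    """Difference quotient (1/h) * (F(t0 + h) gH-minus F(t0)) for a small h != 0."""
    return scale(1.0 / h, gh_diff(F(t0 + h), F(t0)))

# Hypothetical alpha-level of a process whose width 2t grows with t.
def F1(t):
    return (t, 3.0 * t)

# Hypothetical alpha-level whose width 2 - 2t shrinks with t (0 < t < 1).
def F2(t):
    return (t, 2.0 - t)

q1 = gh_quotient(F1, 1.0, 1e-6)
q2_right = gh_quotient(F2, 0.5, 1e-6)
q2_left = gh_quotient(F2, 0.5, -1e-6)
print(q1, q2_right, q2_left)
```

For F1 the quotient approaches [1, 3], the interval formed by the endpoint derivatives in their natural order; for F2 it approaches [−1, 1] from both sides, i.e., the endpoint derivatives 1 and −1 appear in swapped order.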


T. Allahviranloo and S. Ezadi

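Definition 20 Assume that the second part of the Z-number is approximated by a fuzzy number, that t0 ∈ I = [a, b], where I is a real interval, and that $f : I \to R_Z$ is such that $\underline{f}_A^{\alpha}(t)$, $\overline{f}_A^{\alpha}(t)$, $\underline{f}_B^{\alpha}(t)$ and $\overline{f}_B^{\alpha}(t)$ are differentiable at t0 for 0 ≤ α ≤ 1. We say that [11]

• f is (i)-(i)-gH-differentiable at t0 if

$$\left[f'_{gH}(t_0)\right]_\alpha = \left(\left[(\underline{f}_A^{\alpha})'(t_0), (\overline{f}_A^{\alpha})'(t_0)\right], \left[(\underline{f}_B^{\alpha})'(t_0), (\overline{f}_B^{\alpha})'(t_0)\right]\right), \tag{2.51}$$

where $f_A$ and $f_B$ are (i)-gH-differentiable at t0 , i.e.,

$$\left[f'_{A,gH}(t_0)\right]_\alpha = \left[(\underline{f}_A^{\alpha})'(t_0), (\overline{f}_A^{\alpha})'(t_0)\right] \tag{2.52}$$

and

$$\left[f'_{B,gH}(t_0)\right]_\alpha = \left[(\underline{f}_B^{\alpha})'(t_0), (\overline{f}_B^{\alpha})'(t_0)\right]; \tag{2.53}$$

• f is (ii)-(i)-gH-differentiable at t0 if

$$\left[f'_{gH}(t_0)\right]_\alpha = \left(\left[(\overline{f}_A^{\alpha})'(t_0), (\underline{f}_A^{\alpha})'(t_0)\right], \left[(\underline{f}_B^{\alpha})'(t_0), (\overline{f}_B^{\alpha})'(t_0)\right]\right), \tag{2.54}$$

where $f_A$ is (ii)-gH-differentiable at t0 and $f_B$ is (i)-gH-differentiable at t0 ;

• f is (i)-(ii)-gH-differentiable at t0 if

$$\left[f'_{gH}(t_0)\right]_\alpha = \left(\left[(\underline{f}_A^{\alpha})'(t_0), (\overline{f}_A^{\alpha})'(t_0)\right], \left[(\overline{f}_B^{\alpha})'(t_0), (\underline{f}_B^{\alpha})'(t_0)\right]\right), \tag{2.55}$$

where $f_A$ is (i)-gH-differentiable at t0 and $f_B$ is (ii)-gH-differentiable at t0 ;

• f is (ii)-(ii)-gH-differentiable at t0 if

$$\left[f'_{gH}(t_0)\right]_\alpha = \left(\left[(\overline{f}_A^{\alpha})'(t_0), (\underline{f}_A^{\alpha})'(t_0)\right], \left[(\overline{f}_B^{\alpha})'(t_0), (\underline{f}_B^{\alpha})'(t_0)\right]\right), \tag{2.56}$$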

where $f_A$ and $f_B$ are (ii)-gH-differentiable at t0 .

Definition 21 We say that a point t0 ∈ (a, b) is a full-switching point for the differentiability of f if in any neighborhood V of t0 there exist points t1 < t0 < t2 such that [12]
(type-I full-switch) at t1 , (2.52) and (2.53) hold while (2.57) and (2.58) do not hold, and at t2 , (2.57) and (2.58) hold and (2.52) and (2.53) do not hold, or
(type-II full-switch) at t1 , (2.52) and (2.58) hold while (2.57) and (2.53) do not hold, and at t2 , (2.57) and (2.53) hold and (2.52) and (2.58) do not hold, or
(type-III full-switch) at t1 , (2.57) and (2.53) hold while (2.52) and (2.58) do not hold, and at t2 , (2.52) and (2.58) hold and (2.57) and (2.53) do not hold, or
(type-IV full-switch) at t1 , (2.57) and (2.58) hold while (2.52) and (2.53) do not hold, and at t2 , (2.52) and (2.53) hold and (2.57) and (2.58) do not hold.

Definition 22 We say that a point t0 ∈ (a, b) is a semi-switching point for the differentiability of f if in any neighborhood V of t0 there exist points t1 < t0 < t2 such that
(type-I semi-switch) at t1 , (2.52) and (2.53) hold while (2.57) and (2.58) do not hold, and at t2 , (2.52) and (2.58) hold and (2.57) and (2.53) do not hold, or
(type-II semi-switch) at t1 , (2.52) and (2.53) hold while (2.57) and (2.58) do not hold, and at t2 , (2.57) and (2.53) hold and (2.52) and (2.58) do not hold, or
(type-III semi-switch) at t1 , (2.52) and (2.58) hold while (2.57) and (2.53) do not hold, and at t2 , (2.52) and (2.53) hold and (2.57) and (2.58) do not hold, or
(type-IV semi-switch) at t1 , (2.52) and (2.58) hold while (2.57) and (2.53) do not hold, and at t2 , (2.57) and (2.58) hold and (2.52) and (2.53) do not hold, or
(type-V semi-switch) at t1 , (2.57) and (2.53) hold while (2.52) and (2.58) do not hold, and at t2 , (2.57) and (2.58) hold and (2.52) and (2.53) do not hold, or
(type-VI semi-switch) at t1 , (2.57) and (2.53) hold while (2.52) and (2.58) do not hold, and at t2 , (2.52) and (2.53) hold and (2.57) and (2.58) do not hold, or
(type-VII semi-switch) at t1 , (2.57) and (2.58) hold while (2.52) and (2.53) do not hold, and at t2 , (2.57) and (2.53) hold and (2.52) and (2.58) do not hold, or
(type-VIII semi-switch) at t1 , (2.57) and (2.58) hold while (2.52) and (2.53) do not hold, and at t2 , (2.52) and (2.58) hold and (2.57) and (2.53) do not hold.
According to the g-difference definition, we can present the g-differentiability concept for a Z-process as follows.

Definition 23 (Generalized Differentiability of a Z-process) [11] Consider t0 ∈ (a, b) and h such that t0 + h ∈ (a, b). Then the g-derivative of a function $f : (a, b) \to R_Z$ at t0 is defined by

$$f'_g(t_0) = \lim_{h \to 0} \frac{f(t_0 + h) \ominus_g f(t_0)}{h}. \tag{2.57}$$

Since $f(t_0 + h), f(t_0) \in R_Z$, we have $f(t_0 + h) = (f_A(t_0 + h), f_B(t_0 + h))$ and $f(t_0) = (f_A(t_0), f_B(t_0))$; hence, Eq. (2.57) converts to the following equation:

$$\left(f'_{A,g}(t_0), f'_{B,g}(t_0)\right) = \lim_{h \to 0} \frac{(f_A(t_0 + h), f_B(t_0 + h)) \ominus_g (f_A(t_0), f_B(t_0))}{h}. \tag{2.58}$$

Therefore, we can say that f is g-differentiable at t0 iff

$$\begin{cases} f'_{A,g}(t_0) = \lim\limits_{h \to 0} \dfrac{f_A(t_0 + h) \ominus_g f_A(t_0)}{h}, \\[2ex] f'_{B,g}(t_0) = \lim\limits_{h \to 0} \dfrac{f_B(t_0 + h) \ominus_g f_B(t_0)}{h}. \end{cases} \tag{2.59}$$

In other words, if $f'_g(t_0) \in R_Z$ satisfying (2.57) exists, we say that f is g-differentiable at t0 .

Lemma 25 For every Z-process $f : I \to R_Z$, where $f(t) = (f_A(t), f_B(t))$ and $f_A, f_B : I \to R_F$, f is g-differentiable if and only if $f_A$ and $f_B$ are g-differentiable [11].

Definition 26 Let $f : (a, b) \to R_F$ be g-differentiable at t0 and S = [t1 , t2 ] ⊆ (a, b). The interval S is called a full-transitional region for the differentiability of f if in any neighborhood S ⊂ (t1 − δ, t2 + δ), δ > 0, there exist points τ1 ∈ (t1 − δ, t1 ) and τ2 ∈ (t2 , t2 + δ) such that [11]
(type-I full-transitional region) at τ1 , (2.52) and (2.53) hold while (2.55) and (2.56) do not hold, and at τ2 , (2.55) and (2.56) hold and (2.52) and (2.53) do not hold, or
(type-II full-transitional region) at τ1 , (2.52) and (2.56) hold while (2.55) and (2.53) do not hold, and at τ2 , (2.55) and (2.53) hold and (2.52) and (2.56) do not hold, or
(type-III full-transitional region) at τ1 , (2.55) and (2.53) hold while (2.52) and (2.56) do not hold, and at τ2 , (2.52) and (2.56) hold and (2.55) and (2.53) do not hold, or
(type-IV full-transitional region) at τ1 , (2.55) and (2.56) hold while (2.52) and (2.53) do not hold, and at τ2 , (2.52) and (2.53) hold and (2.55) and (2.56) do not hold.

Definition 27 Let $f : (a, b) \to R_F$ be g-differentiable at t0 and S = [t1 , t2 ] ⊆ (a, b). The interval S is called a semi-transitional region for the differentiability of f if in any neighborhood S ⊂ (t1 − δ, t2 + δ), δ > 0, there exist points [12] τ1 ∈ (t1 − δ, t1 ) and τ2 ∈ (t2 , t2 + δ) such that
(type-I semi-transitional region) at τ1 , (2.52) and (2.53) hold while (2.55) and (2.56) do not hold, and at τ2 , (2.52) and (2.56) hold and (2.55) and (2.53) do not hold, or
(type-II semi-transitional region) at τ1 , (2.52) and (2.53) hold while (2.55) and (2.56) do not hold, and at τ2 , (2.55) and (2.53) hold and (2.52) and (2.56) do not hold, or
(type-III semi-transitional region) at τ1 , (2.52) and (2.56) hold while (2.55) and (2.53) do not hold, and at τ2 , (2.52) and (2.53) hold and (2.55) and (2.56) do not hold, or
(type-IV semi-transitional region) at τ1 , (2.52) and (2.56) hold while (2.55) and (2.53) do not hold, and at τ2 , (2.55) and (2.56) hold and (2.52) and (2.53) do not hold, or
(type-V semi-transitional region) at τ1 , (2.55) and (2.53) hold while (2.52) and (2.56) do not hold, and at τ2 , (2.55) and (2.56) hold and (2.52) and (2.53) do not hold, or
(type-VI semi-transitional region) at τ1 , (2.55) and (2.53) hold while (2.52) and (2.56) do not hold, and at τ2 , (2.52) and (2.53) hold and (2.55) and (2.56) do not hold, or
(type-VII semi-transitional region) at τ1 , (2.55) and (2.56) hold while (2.52) and (2.53) do not hold, and at τ2 , (2.55) and (2.53) hold and (2.52) and (2.56) do not hold, or
(type-VIII semi-transitional region) at τ1 , (2.55) and (2.56) hold while (2.52) and (2.53) do not hold, and at τ2 , (2.52) and (2.56) hold and (2.55) and (2.53) do not hold.

In the next section, we discuss several ranking methods for Z-numbers.

3 Ranking of Z-Numbers

Considering the concept and the very important applications of Z-numbers in risk assessment, decision analysis, and other realms of real-world problems, we should focus on ranking or ordering these numbers as a tool that plays an important role in real-world applications. In this part, two new methods for ranking Z-numbers, based on the sigmoid function and the sign method [4], are studied. For other methods, refer to [3, 7, 10]. A Z-number is represented by the following membership functions:

$$\mu_{\tilde A}(x) = \begin{cases} \dfrac{x - a_1}{a_2 - a_1}, & \text{if } a_1 \le x \le a_2, \\ \omega_{\tilde A}, & \text{if } a_2 \le x \le a_3, \\ \dfrac{a_4 - x}{a_4 - a_3}, & \text{if } a_3 \le x \le a_4, \\ 0, & \text{otherwise}, \end{cases}$$

and

$$\mu_{\tilde B}(x) = \begin{cases} \dfrac{x - b_1}{b_2 - b_1}, & \text{if } b_1 \le x \le b_2, \\ \omega_{\tilde B}, & \text{if } b_2 \le x \le b_3, \\ \dfrac{b_4 - x}{b_4 - b_3}, & \text{if } b_3 \le x \le b_4, \\ 0, & \text{otherwise}, \end{cases}$$

where $\tilde A = (a_1, a_2, a_3, a_4; \omega_{\tilde A})$ and $\tilde B = (b_1, b_2, b_3, b_4; \omega_{\tilde B})$. For $Z = (\tilde A, \tilde B)$:

(I) Convert $\tilde B$ (reliability) into a crisp number by using

$$\alpha = \frac{\int_X x\, \mu_{\tilde B}(x)\, dx}{\int_X \mu_{\tilde B}(x)\, dx}, \tag{3.1}$$

where ∫ denotes an algebraic integration.

(II) Add the weight of $\tilde B$ to $\tilde A$ (restriction). The weighted Z-number is denoted as

$$\tilde Z^{\alpha} = \left\{\langle x, \mu_{\tilde A^{\alpha}}(x) \rangle \mid \mu_{\tilde A^{\alpha}}(x) = \alpha\, \mu_{\tilde A}(x),\ x \in [0, 1]\right\}. \tag{3.2}$$

Note that α represents the weight of the reliability component of the Z-number.

3.1 Z-Numbers Ranking Algorithm by Sigmoid Function

Step one: For $Z = (\tilde A, \tilde B)$, calculate α using Eq. (3.1).

Step two: For $Z_i = (\tilde A_i, \tilde B_i)$, the fuzzy number $\tilde A_i = (a_{i1}, a_{i2}, a_{i3}, a_{i4}; \omega_{\tilde A_i})$ is converted into a standardized fuzzy number $\tilde A_i^* = (a_{i1}^*, a_{i2}^*, a_{i3}^*, a_{i4}^*; \omega_{\tilde A_i})$, where

$$a_{ij}^* = \frac{a_{ij}}{c}, \quad \forall j = 1, 2, 3, 4;\ \forall i = 1, \ldots, n,$$

$\omega_{\tilde A_i} \in [0, 1]$, and $c = \max_{i,j}(a_{ij}, 1)$ represents the maximum value of the universe of discourse.

Step three: Calculate $x_{\tilde A_i^*}$:

$$x_{\tilde A_i^*} = \frac{a_{i1}^* + a_{i2}^* + a_{i3}^* + a_{i4}^*}{4}, \quad x_{\tilde A_i^*} \in [-1, 1],\ i = 1, \ldots, n. \tag{3.3}$$

Step four: Calculate Rank(Z i ) by the sigmoid function:

$$\mathrm{Rank}(Z_i) = \begin{cases} \dfrac{1}{1 + e^{-\alpha x_{\tilde A_i^*}}}, & x_{\tilde A_i^*} > 0, \\ 0, & x_{\tilde A_i^*} = 0, \\ \dfrac{-1}{1 + e^{-\alpha x_{\tilde A_i^*}}}, & x_{\tilde A_i^*} < 0, \end{cases} \tag{3.4}$$

where Rank(Z i ) ∈ (−1, 1), i = 1, . . . , n. The larger the value of Rank(Z i ), the higher the preference of Z i .
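The four steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the integral in (3.1) is approximated by the midpoint rule, the constant c is read as the larger of 1 and the largest $a_{ij}$, the membership function is taken exactly as displayed in the text, and the reliability part `b` in the usage example is hypothetical (the reliability parts of Example 3.1 are given only in a figure). All function names are ours.

```python
import math

def membership(x, t):
    """Trapezoidal membership for t = (t1, t2, t3, t4, w), as displayed in the text."""
    t1, t2, t3, t4, w = t
    if t2 <= x <= t3:
        return w
    if t1 <= x < t2:
        return (x - t1) / (t2 - t1)
    if t3 < x <= t4:
        return (t4 - x) / (t4 - t3)
    return 0.0

def reliability_weight(b, steps=10000):
    """Eq. (3.1): centroid of B, approximated by the midpoint rule on [b1, b4]."""
    lo, hi = b[0], b[3]
    dx = (hi - lo) / steps
    num = den = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        mu = membership(x, b)
        num += x * mu
        den += mu
    return num / den

def sigmoid_ranks(z_numbers):
    """Steps one to four for a list of Z-numbers given as pairs (A, B)."""
    # Step two: c = max(largest a_ij, 1); this reading of c is an assumption.
    c = max(1.0, max(max(a[:4]) for a, _ in z_numbers))
    ranks = []
    for a, b in z_numbers:
        alpha = reliability_weight(b)          # Step one, Eq. (3.1)
        x = sum(v / c for v in a[:4]) / 4.0    # Steps two and three, Eq. (3.3)
        if x > 0:                              # Step four, Eq. (3.4)
            rank = 1.0 / (1.0 + math.exp(-alpha * x))
        elif x < 0:
            rank = -1.0 / (1.0 + math.exp(-alpha * x))
        else:
            rank = 0.0
        ranks.append(rank)
    return ranks

a = (0.1, 0.3, 0.3, 0.5, 1.0)  # restriction part, values as in Example 3.1
b = (0.1, 0.3, 0.3, 0.5, 1.0)  # hypothetical reliability part
ranks = sigmoid_ranks([(a, b)])
print(ranks)  # a single value, roughly 0.5225
```

For this input, both the mean of the standardized restriction and the centroid of the reliability part equal 0.3, so the rank is 1/(1 + e^(−0.09)).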

3.2 Z-Numbers Ranking Algorithm by Sigmoid Function Based on the Combination of Convex

Step one: For $Z = (\tilde A, \tilde B)$, calculate α using Eq. (3.1).

Step two: For $Z_i = (\tilde A_i, \tilde B_i)$, the fuzzy number $\tilde A_i = (a_{i1}, a_{i2}, a_{i3}, a_{i4}; \omega_{\tilde A_i})$ is converted into a standardized fuzzy number $\tilde A_i^* = (a_{i1}^*, a_{i2}^*, a_{i3}^*, a_{i4}^*; \omega_{\tilde A_i})$, where $a_{ij}^* = a_{ij}/c$, ∀ j = 1, 2, 3, 4, ∀i = 1, . . . , n, $\omega_{\tilde A_i} \in [0, 1]$, and $c = \max_{i,j}(a_{ij}, 1)$ represents the maximum value of the universe of discourse.

Step three: Calculate $x_{\tilde A_i^*}$. (A convex combination can be considered for each case. We explain the procedure for one case; the other cases are similar.)

$$x_{\tilde A_i^*} = \frac{\lambda \left(a_{i2}^* + a_{i3}^*\right) + (1 - \lambda) \left(a_{i1}^* + a_{i4}^*\right)}{\lambda \delta_1 + (1 - \lambda) \delta_2}, \quad \delta_1 = \delta_2 = 2,\ x_{\tilde A_i^*} \in [-1, 1],\ i = 1, \ldots, n. \tag{3.5}$$

Step four: Calculate the spread $STD_{\tilde A_i^*}$:

$$STD_{\tilde A_i^*} = \frac{\lambda \left[\left(a_{i2}^* - x_{\tilde A_i^*}\right)^2 + \left(a_{i3}^* - x_{\tilde A_i^*}\right)^2\right]}{(\lambda \delta_1 + (1 - \lambda) \delta_2) - 1} + \frac{(1 - \lambda) \left[\left(a_{i1}^* - x_{\tilde A_i^*}\right)^2 + \left(a_{i4}^* - x_{\tilde A_i^*}\right)^2\right]}{(\lambda \delta_1 + (1 - \lambda) \delta_2) - 1}. \tag{3.6}$$

Step five: Calculate Rank(Z i ):

$$\mathrm{Rank}(Z_i) = \begin{cases} \dfrac{1}{1 + e^{-G}}, & G > 0, \\ 0, & G = 0, \\ \dfrac{-1}{1 + e^{-G}}, & G < 0, \end{cases} \tag{3.7}$$

where

$$G = x_{\tilde A_i^*} \times |\alpha| + STD_{\tilde A_i^*} \tag{3.8}$$

and Rank(Z i ) ∈ (−1, 1), i = 1, . . . , n. The larger the value of Rank(Z i ), the higher the preference of Z i .

Example 3.1 Let $\tilde A_i$ = (0.1, 0.3, 0.3, 0.5; 1, 1) be the $\tilde A_i$ of the Z-numbers in the 6 sets; namely, all the Z-numbers in the 6 sets have the same part $\tilde A_i$. $\tilde A_i$ is shown in Fig. 1, and all the $\tilde B_i$ of the Z-numbers in the 6 sets are shown in Figs. 1, 2 and Tables 1, 2. The ranking of the Z-numbers in the 6 sets by the sigmoid method based on the convex combination is visible in Table 2. For each of the 6 sets, α1 = α2 = 0.75. A comparison of the results of ranking Z-numbers by the sigmoid method based on the convex combination with some other methods can be seen in Table 3.

Fig. 1 $\tilde A_i$ of Z-numbers in the 6 sets for Example 3.1

Fig. 2 $\tilde B_i$ of Z-numbers in the 6 sets for Example 3.1

Table 1 The calculated results by the Sigmoid’s method for Example 3.1

Method                    Set 1 Z1    Set 1 Z2    Set 2 Z1    Set 2 Z2
Mohamad’s method [12]     0.0774      0.0774      0.0774      0.0774
Bakar’s method [7]        0.0288      0.0288      0.0288      0.0288
Kang’s method [10]        0.3000      0.3000      0.3000      0.3000
Sigmoid’s method          0.522       0.522       0.522       0.522

Method                    Set 3 Z1    Set 3 Z2    Set 4 Z1    Set 4 Z2
Mohamad’s method [12]     0.0774      0.0774      0.0774      0.0774
Bakar’s method [7]        0.0288      0.0288      0.0288      0.0288
Kang’s method [10]        0.3000      0.3000      0.3000      0.3000
Sigmoid’s method          0.522       0.522       0.522       0.522

Method                    Set 5 Z1    Set 5 Z2    Set 6 Z1    Set 6 Z2
Mohamad’s method [12]     0.0774      0.0774      0.0774      0.0774
Bakar’s method [7]        0.0288      0.0288      0.0288      0.0288
Kang’s method [10]        0.3000      0.3000      0.3000      0.3000
Sigmoid’s method          0.522       0.522       0.522       0.522

Table 2 A comparison of the Sigmoid’s method with the existing methods for Example 3.1

Methods                   Ranking
Mohamad’s method [12]     Z1 ≈ Z2
Bakar’s method [7]        Z1 ≈ Z2
Kang’s method [10]        Z1 ≈ Z2
Sigmoid’s method          Z1 ≈ Z2

In the previous sections, we discussed the operations on Z-numbers and several methods for ranking them. We now have enough background to introduce other concepts in this field. Given the important applications of Z-numbers in economics and prediction, it is necessary to discuss regression, neural networks, and other techniques. In this section, linear regression on Z-numbers using a neural network method is introduced. To this end, a two-layer neural network is first introduced to estimate a linear regression for Z-number data.


T. Allahviranloo and S. Ezadi

Table 3 The calculated results by the Sigmoid’s method based on the combination of convex (sets 1–6) for Example 3.1

4 Z-Number Linear Regression (ZLR) Estimates Using Neural Network [5]

Consider the general model of ZLR as follows:

\[ [Y_i]^z = (A_Y, B_Y), \tag{4.1} \]

where $A_Y$ plays the role of restriction for $Y_i$ and $B_Y$ the role of degree of confidence. Both are fuzzy-valued and are defined as follows:

\[ A_Y = a_0 + a_1 x_{i1} + a_2 x_{i2} + \dots + a_n x_{in}, \qquad B_Y = p(Y_i \text{ is } A_Y), \tag{4.2} \]

where $x_{ij} \in \mathbb{R}$, and $a_0, a_1, a_2, \dots, a_n$ and $[Y_i]^z$ are Z-numbers. We limit the discussion to the case where $A_Y$ and $B_Y$ are symmetric triangular fuzzy numbers. Since in real-world applications ZLR can take different forms depending on the available problem information, in this section we study and estimate some of those states.


4.1 First State

Suppose that $[Y]^z$ in (4.1) consists of a value around $A_Y$ and the density function $f(X) = \lambda e^{-\lambda X}$, $\alpha \le \lambda \le \beta$, $X > 0$ (where $\alpha$ and $\beta$ are fixed numbers and $\alpha < \beta$). The probability that $Y$ is $A_Y$ is given by $B_Y$. Suppose $y^z$ is the ordered pair $y^z = (y_k, y_H)$, where $y_k$ and $y_H$ are fuzzy numbers: $y_k$ plays the role of restriction and $y_H$ plays the role of degree of confidence for the restriction. That is, assume that $A_Y = (a, b)$ and $B_Y = (c, d)$, where $a$ and $c$ are the centers and $b$ and $d$ the fuzzy widths of $y_k$ and $y_H$, respectively. Then we have

\[ p(a \le Y \le b) = \int_a^b f(X)\,dX = \int_a^b \lambda e^{-\lambda X}\,dX = -e^{-\lambda X}\Big|_a^b = e^{-\lambda a} - e^{-\lambda b}, \tag{4.3} \]

where, for $\alpha \le \lambda \le \beta$, (4.3) becomes

\[ p(a \le Y \le b) = \begin{cases} c = e^{-\lambda_1 a} - e^{-\lambda_1 b}, & \lambda_1 = \alpha, \\ d = e^{-\lambda_2 a} - e^{-\lambda_2 b}, & \lambda_2 = \beta, \end{cases} \tag{4.4} \]

and the degree of membership for the above relation is denoted by $G(p)$:

\[ G(p) = B(p(a \le Y \le b)) = \begin{cases} 1, & c \le e^{-\lambda a} - e^{-\lambda b} \le d, \\ 0, & \text{otherwise}. \end{cases} \]

The objective, based on observations of the form $(y_i^z, x_{i1}, x_{i2}, \dots, x_{in})$, $i = 1, \dots, n$, is to obtain an optimal model with fuzzy coefficients for describing and analyzing the data and for prediction, where $x_{ij}$ are real numbers and $y_i^z$ are Z-numbers. To estimate the regression under the above conditions, we define the proposed method as follows:

\[ y_T = (\mathrm{Net}, p(y_T \text{ is } \mathrm{Net})) = (A_{y_T}, B_{y_T}), \tag{4.5} \]

where $y_T$ is the proposed solution and $\mathrm{Net}$ is a feed-forward artificial neural network with two layers. The first layer is the input layer and the second is the output layer with a linear transfer function, given by

\[ \mathrm{Net} = w_0 + w_1 o_{i1} + \dots + w_n o_{in}, \quad i = 1, \dots, m,\ j = 1, \dots, n, \tag{4.6} \]


where $w_0$ and $w_1$ are the weights of the artificial neural network and $o_{ij}$ are its inputs, which correspond to $X$. Now suppose that our proposed method uses the same density function as the problem, that is, $f(X) = \lambda e^{-\lambda X}$, $\alpha \le \lambda \le \beta$. The neural network of relation (4.6) is fuzzy-valued, which means that we have fuzzy weights for real observations. Thus Eq. (4.6) can be written as follows:

\[ \mathrm{Net} = w_0 + w_1 o_{i1} + \dots \quad \text{s.t. } w_i \text{ is fuzzy}. \tag{4.7} \]

We suppose $\mathrm{Net} = (\mathrm{Net}_{A_1}, \mathrm{Net}_{A_2})$, where $\mathrm{Net}_{A_1}$ is the center and $\mathrm{Net}_{A_2}$ is the fuzzy width of the network output, so relation (4.7) can be rewritten as follows:

\[ \mathrm{Net} = (\mathrm{Net}_{A_1}, \mathrm{Net}_{A_2}) = (w_{0A_1}, w_{0A_2}) + (w_{1A_1}, w_{1A_2})\, o_{i1} + (w_{2A_1}, w_{2A_2})\, o_{i2} + \dots \tag{4.8} \]

where $\mathrm{Net}_{A_1}$ and $\mathrm{Net}_{A_2}$ have the following form:

\[ \begin{cases} \mathrm{Net}_{A_1} = w_{0A_1} + w_{1A_1} o_{i1} + \dots \\ \mathrm{Net}_{A_2} = w_{0A_2} + w_{1A_2} |o_{i1}| + \dots \end{cases} \tag{4.9} \]

Now we should find the four weights of relation (4.9) that meet the following two conditions: (1) the value of $\mathrm{Net}$, which is the estimated answer $y_T$, is close to the restriction value of the main answer $Y$; (2) the value of $p(y_T \text{ is } \mathrm{Net})$ approaches the value of $p(a \le Y \le b) = (c, d)$. To that end, we define the target function of the neural network as follows:

\[ \min\ \frac{1}{6} \left[ \left( e^{-\mathrm{Net}_{A_1} \lambda_1} - e^{-a \lambda_1} \right)^2 + \left( e^{-\mathrm{Net}_{A_2} \lambda_2} - e^{-b \lambda_2} \right)^2 \right], \tag{4.10} \]

where, in general, for $n$ observations we have

\[ \min\ \frac{1}{6} \sum_{i=1}^{n} \left[ \left( e^{-\mathrm{Net}_{iA_1} \lambda_1} - e^{-a_i \lambda_1} \right)^2 + \left( e^{-\mathrm{Net}_{iA_2} \lambda_2} - e^{-b_i \lambda_2} \right)^2 \right]. \]

By minimizing Eq. (4.10), the four weights of the neural network, namely $w_{0A_1}$, $w_{1A_1}$, $w_{0A_2}$ and $w_{1A_2}$, are obtained. Substituting the obtained weights into Eq. (4.8) gives the values of $\mathrm{Net}_{A_1}$ and $\mathrm{Net}_{A_2}$, and hence the value of $\mathrm{Net}$ in Eq. (4.7). Now we calculate the probability that $y_T$ is $\mathrm{Net}$:

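\[ p(y_T \text{ is } \mathrm{Net}) = p(\mathrm{Net}_{A_1} \le y_T \le \mathrm{Net}_{A_2}) = \int_{\mathrm{Net}_{A_1}}^{\mathrm{Net}_{A_2}} \lambda e^{-\lambda o}\,do = e^{-\lambda \mathrm{Net}_{A_1}} - e^{-\lambda \mathrm{Net}_{A_2}} \]

\[ \Rightarrow\ p(y_T \text{ is } \mathrm{Net}) = \begin{cases} e^{-\lambda_1 \mathrm{Net}_{A_1}} - e^{-\lambda_1 \mathrm{Net}_{A_2}}, & \lambda_1 = \alpha, \\ e^{-\lambda_2 \mathrm{Net}_{A_1}} - e^{-\lambda_2 \mathrm{Net}_{A_2}}, & \lambda_2 = \beta. \end{cases} \tag{4.11} \]

By substituting the values obtained in Eq. (4.11) into (4.5), the approximate Z-number solution $y_T$ is obtained. In the examples section it is shown that the approximate solution is close to the main solution.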
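The probability computations in (4.3), (4.4) and (4.11) all reduce to evaluating $e^{-\lambda\,\mathrm{lower}} - e^{-\lambda\,\mathrm{upper}}$ at $\lambda = \alpha$ and $\lambda = \beta$. The following Python sketch illustrates this step only; the function names and the numerical inputs are hypothetical and are not taken from the chapter.

```python
import math


def interval_probability(lower, upper, lam):
    """P(lower <= X <= upper) for the density lam * exp(-lam * X), X > 0."""
    if lam <= 0 or lower < 0 or upper < lower:
        raise ValueError("need lam > 0 and 0 <= lower <= upper")
    return math.exp(-lam * lower) - math.exp(-lam * upper)


def confidence_pair(lower, upper, alpha, beta):
    """Evaluate the probability at lam = alpha and lam = beta, as in (4.4) and (4.11)."""
    return (
        interval_probability(lower, upper, alpha),
        interval_probability(lower, upper, beta),
    )


# Hypothetical illustration values (assumed, not from the chapter's tables).
c, d = confidence_pair(lower=1.0, upper=2.0, alpha=0.2, beta=0.3)
print(round(c, 4), round(d, 4))
```

The same helper can be applied to the pair $(\mathrm{Net}_{A_1}, \mathrm{Net}_{A_2})$ to evaluate (4.11) once the weights are known.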

4.2 Second State

We consider the problem in which the regression coefficients themselves are of Z-number type. The objective, based on observations $(y_i^z, x_{i1}, x_{i2}, \dots, x_{in})$, $i = 1, \dots, n$, is to obtain an optimal model with Z-number coefficients for describing, analyzing and predicting, where $x_{ij}$ are real and $y_i^z$ are of Z-number type. The basic model of Z-number linear regression (ZLR) is as follows:

\[ [Y_i]^z = [A_0]^z + [A_1]^z x_{i1} + [A_2]^z x_{i2} + \dots + [A_n]^z x_{in}, \tag{4.12} \]

where $x_{ij} \in \mathbb{R}$, and $[A_0]^z, [A_1]^z, [A_2]^z, \dots, [A_n]^z$ and $[Y_i]^z$ are Z-numbers.

We suppose that $[A]^z$ and $y^z$ are the ordered pairs $[A]^z = (A_k, A_H)$ and $y^z = (y_k, y_H)$, where $A_H$, $A_k$, $y_H$, $y_k$ are fuzzy numbers. $A_k$ and $y_k$ play the role of restriction, and $A_H$ and $y_H$ play the role of degree of confidence for the restrictions, respectively. The discussion is limited to the case where $A_H$, $A_k$, $y_H$, $y_k$ are symmetric triangular fuzzy numbers. Thus, the main problem with density function $f(X) = \lambda e^{-\lambda X}$, $\alpha \le \lambda \le \beta$, where $\alpha$ and $\beta$ are fixed numbers greater than zero and $\alpha < \beta$, is written as follows:

\[ (y_k, y_H) = (A_{k0}, A_{H0}) + \dots + (A_{kn}, A_{Hn})\, x_n. \tag{4.13} \]

That is, the probability that $Y$ is $y_k$ is equal to $y_H$ in relation (4.13). To estimate the Z-number regression of Eq. (4.13) under the above conditions, we define the proposed method as follows:

\[ [y_T]^z = (A', B')^z, \qquad [y_T]^z = w_0^z + x_{i1} w_1^z + \dots + x_{in} w_n^z + [\varepsilon_n]^z, \quad i = 1, \dots, m,\ j = 1, \dots, n, \tag{4.14} \]

where $A'$ plays the role of the output restriction of the neural network and $B'$ is the probability that $y_T$ is $A'$. The $w_i^z$ are the Z-number coefficients of $y_T$, and the restriction parts of these coefficients are determined using the neural network introduced in relation (4.7). Suppose $w_i^z = (w_{iA}, w_{iB})$, where $w_{iA}$ and $w_{iB}$ are symmetric triangular fuzzy numbers. The index $iA$ means that the component plays the role of restriction of $w_i$, and the index $iB$ means that the component plays the role of degree of confidence for the first component. Supposing that $w_{iA} = (w_{iA_1}, w_{iA_2})$ and $w_{iB} = (w_{iB_1}, w_{iB_2})$, Eq. (4.14) can be rewritten as follows:

\[ [y_T]^z = \big( (w_{0A_1}, w_{0A_2}), (w_{0B_1}, w_{0B_2}) \big) + \big( (w_{1A_1}, w_{1A_2}), (w_{1B_1}, w_{1B_2}) \big)\, x + \dots \tag{4.15} \]

where $A'$ has the form below, and $B'$ is introduced later in Eq. (4.19):

\[ A' = \begin{cases} A'_1 = w_{0A_1} + w_{1A_1} x, \\ A'_2 = w_{0A_2} + w_{1A_2} x. \end{cases} \tag{4.16} \]

The coefficients in Eq. (4.16) are obtained from relationships (4.7) and (4.10). Since these coefficients in this method are of Z-number value, now we should obtain the probabilities of these coefficients:

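\[ p(w_{0A_1} \le w_0 \le w_{0A_2}) = \int_{w_{0A_1}}^{w_{0A_2}} \lambda e^{-\lambda X}\,dX = e^{-\lambda w_{0A_1}} - e^{-\lambda w_{0A_2}} \]

\[ \Rightarrow\ p(w_{0A_1} \le w_0 \le w_{0A_2}) = \begin{cases} w_{0B_1} = e^{-\lambda_1 w_{0A_1}} - e^{-\lambda_1 w_{0A_2}}, & \lambda_1 = \alpha, \\ w_{0B_2} = e^{-\lambda_2 w_{0A_1}} - e^{-\lambda_2 w_{0A_2}}, & \lambda_2 = \beta, \end{cases} \tag{4.17} \]

and

\[ p(w_{1A_1} \le w_1 \le w_{1A_2}) = \int_{w_{1A_1}}^{w_{1A_2}} \lambda e^{-\lambda X}\,dX = e^{-\lambda w_{1A_1}} - e^{-\lambda w_{1A_2}} \]

\[ \Rightarrow\ p(w_{1A_1} \le w_1 \le w_{1A_2}) = \begin{cases} w_{1B_1} = e^{-\lambda_1 w_{1A_1}} - e^{-\lambda_1 w_{1A_2}}, & \lambda_1 = \alpha, \\ w_{1B_2} = e^{-\lambda_2 w_{1A_1}} - e^{-\lambda_2 w_{1A_2}}, & \lambda_2 = \beta. \end{cases} \tag{4.18} \]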


Table 4 Crisp input-fuzzy output

i    x_i    Y_i = (center, width)    Interval Y_i
1    1      (8, 1.8)                 (6.2, 9.8)
2    2      (6.4, 2.2)               (4.2, 8.6)
3    3      (9.5, 2.6)               (6.9, 12.1)
4    4      (13.5, 2.6)              (10.9, 16.1)
5    5      (13, 2.4)                (10.6, 15.4)

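In this way, $w_{0B_1}$, $w_{0B_2}$, $w_{1B_1}$ and $w_{1B_2}$ are obtained from the neural network. Finally, $B'$ is computed as follows:

\[ B' = \begin{cases} B'_1 = w_{0B_1} + \lambda_1 w_{1B_1}\, o, & \lambda_1 = \alpha, \\ B'_2 = w_{0B_2} + \lambda_2 w_{1B_2}\, o, & \lambda_2 = \beta. \end{cases} \tag{4.19} \]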

In the examples section, it is shown that the value of the estimated restriction $A'$ is close to the value of $A$ and the value of the estimated confidence $B'$ is close to the value of $B$, which shows that our Z-number weights are approximately the regression coefficients (ZLR).

Example 1 For Z-number variables, consider the dependent variable $Y$ and the independent real variable $x_i$, with the values given in Table 4. Suppose that the Z-number value of $[Y_i]^z$ for each real observation $x_i$ is as given in Table 5.

(1.a) Using these data, develop an estimated fuzzy regression equation $[Y_i]^z = (A_Y, B_Y)$ by $y_T = (\mathrm{Net}, p(y_T \text{ is } \mathrm{Net}))$ (first state). The stopping criterion is $k = 32$ iterations of the learning algorithm, and training starts with $w_0 = (0, 0)$, $w_1 = (0, 0)$. The values of $[y_{Ti}]^z$ for the density function $\lambda e^{-\lambda X}$ with $0.2 \le \lambda \le 0.3$ are shown in Table 6. The optimal weights of the neural network are as follows:

\[ \mathrm{Net} = (5.60, 1.79) + (1.34, 0.17)\,x \]

(1.b) Using these data, develop an estimated fuzzy regression equation $[Y_i]^z = (A_Y, B_Y)$ by $[y_T]^z = (A', B')$ (second state). The stopping criterion is $k = 19$ iterations of the learning algorithm, and training starts with $[w_0]^z = (0, 0)(0, 0)$, $[w_1]^z = (0, 0)(0, 0)$. The values of $[y_{Ti}]^z$ are shown in Table 7.

Table 5 Crisp input-Z-number output


Table 6 Crisp input-Z-number output

Table 7 Crisp input-Z-number output

The optimal weights of the neural network are as follows:

\[ [w_0]^z = \big( (5.60, 1.79), (0.37, 0.39) \big), \qquad [w_1]^z = \big( (1.34, 0.17), (0.20, 0.27) \big). \]
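As a rough check of part (1.a), the reported network $\mathrm{Net} = (5.60, 1.79) + (1.34, 0.17)x$ can be evaluated at the inputs of Table 4 using the center and width rules of (4.9). The Python sketch below does only this evaluation; the function name is hypothetical, and it does not reproduce the training procedure or Tables 5 to 7.

```python
def net_output(x, w0=(5.60, 1.79), w1=(1.34, 0.17)):
    """Return (center, width) of Net for a crisp input x, following (4.9)."""
    center = w0[0] + w1[0] * x
    width = w0[1] + w1[1] * abs(x)
    return center, width


# Observed fuzzy outputs (center, width) copied from Table 4.
observed = {1: (8.0, 1.8), 2: (6.4, 2.2), 3: (9.5, 2.6), 4: (13.5, 2.6), 5: (13.0, 2.4)}

for x, (y_center, y_width) in observed.items():
    center, width = net_output(x)
    print(f"x={x}: Net=({center:.2f}, {width:.2f})  observed=({y_center}, {y_width})")
```

Printing both pairs side by side makes it easy to see how closely the fitted line tracks each observation.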

5 The Definition of a Derivative-Based Z-Number [11]

The general form of an initial value problem for a differential equation based on Z-numbers is defined as follows:

\[ \begin{cases} Z'(t) = F(t, Z), \\ Z(t_0) = Z_0 \in Z^*, \end{cases} \tag{5.1} \]

where $Z^*$ is the set of Z-numbers. Suppose $Z = (A, B)$ is a Z-number. Then

\[ Z'(t) = \lim_{h \to 0} \frac{Z(t + h) -_Z Z(t)}{h}. \tag{5.2} \]

Relation (5.2) can be rewritten as

\[ Z'(t) = \lim_{h \to 0} \frac{(A(t + h), B(t + h)) -_Z (A(t), B(t))}{h}, \tag{5.3} \]

where for $(A(t + h), B(t + h)) -_Z (A(t), B(t))$ we define

\[ (A(t + h), B(t + h)) -_Z (A(t), B(t)) = \big( A(t + h) -_{gH} A(t),\ B(t + h) -_p B(t) \big). \tag{5.4} \]

Here $A(t + h) -_{gH} A(t)$ is the gH-difference, and $B(t + h) -_p B(t)$ is the difference on the probability distribution functions. Assume that $Z'(t) = F(t, Z)$ is a differential equation. Then we have

\[ \int_a^t Z'(s)\,ds = \int_a^t F(s, Z(s))\,ds \ \Rightarrow\ Z(t) = Z(a) + \int_a^t F(s, Z(s))\,ds. \tag{5.5} \]
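The gH-difference used for the $A$ part in (5.4) is not spelled out in the text. For closed intervals (for example, the $\alpha$-level sets of a fuzzy number) it is commonly computed endpoint-wise as $[\min\{u^- - v^-, u^+ - v^+\}, \max\{u^- - v^-, u^+ - v^+\}]$. The Python sketch below assumes this interval form; the function names and the interval-valued test function are hypothetical.

```python
def gh_difference(u, v):
    """gH-difference of closed intervals u = (u_lo, u_hi) and v = (v_lo, v_hi)."""
    d_lo = u[0] - v[0]
    d_hi = u[1] - v[1]
    return (min(d_lo, d_hi), max(d_lo, d_hi))


def difference_quotient(level_set, t, h):
    """Interval difference quotient (A(t + h) -gH A(t)) / h for a step h > 0."""
    lo, hi = gh_difference(level_set(t + h), level_set(t))
    return (lo / h, hi / h)


def level(t):
    """Hypothetical level set [t, 2t + 1] of some interval-valued A(t)."""
    return (t, 2.0 * t + 1.0)


print(gh_difference((1, 3), (0, 1)))
print(difference_quotient(level, t=1.0, h=1e-3))
```

Unlike the ordinary interval subtraction, the gH-difference of an interval with itself is the degenerate interval $(0, 0)$, which is what makes the limit in (5.3) meaningful.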

5.1 Z-Number Initial Value Problem (ZIVP)

In the real world, most phenomena involve uncertainty, and the information we have about subjects such as economics, politics and physics is evaluated in terms of verbal (linguistic) variables. Here we formulate and investigate such information as an initial value problem whose initial data are Z-numbers. For instance, consider the population growth problem with the Z-number data (population growth, very high, usually) at the starting moment $t_0$, and ask what the information looks like at a later time such as $t_1$. For this purpose, we first consider a ZIVP of the form (5.1) and then study the existence and uniqueness conditions for the solution of this problem:

\[ x'(t) = f(t, x(t)), \qquad x(t_0) = x_0, \tag{5.6} \]

where $t \in [t_0, T] \subseteq \mathbb{R}^+$, $f$ is a continuous mapping from $[t_0, T] \times \mathbb{R}_z$ into $\mathbb{R}_z$, $x_0$ is a Z-number in $\mathbb{R}_z$, and

\[ [x(t)]^\alpha = [x_A(t), x_B(t)]^\alpha = \big( [\underline{x}_A^\alpha, \overline{x}_A^\alpha],\ [\underline{x}_B^\alpha, \overline{x}_B^\alpha] \big). \tag{5.7} \]

If $x(t)$ is (i)-(i)-gH-differentiable, then

\[ [x'_{gH}(t)]^\alpha = [x'_{A\,gH}(t), x'_{B\,gH}(t)]^\alpha = \big( [(\underline{x}_A^\alpha)', (\overline{x}_A^\alpha)'],\ [(\underline{x}_B^\alpha)', (\overline{x}_B^\alpha)'] \big). \]

So we obtain a system that contains two fuzzy initial value problems:


\[ \begin{cases} x'_A(t) = f_A(t, x_A(t), x_B(t)), \\ x_A(t_0) = x_{0A}, \end{cases} \quad \text{and} \quad \begin{cases} x'_B(t) = f_B(t, x_A(t), x_B(t)), \\ x_B(t_0) = x_{0B}. \end{cases} \tag{5.8} \]

So the ZIVP (5.6) translates into the following system of crisp differential equations:

\[ \begin{cases} (\underline{x}_A^\alpha)' = \underline{f}_A^\alpha\big(t, \underline{x}_A^\alpha(t), \underline{x}_B^\alpha(t)\big), \\ (\overline{x}_A^\alpha)' = \overline{f}_A^\alpha\big(t, \overline{x}_A^\alpha(t), \overline{x}_B^\alpha(t)\big), \\ (\underline{x}_B^\alpha)' = \underline{f}_B^\alpha\big(t, \underline{x}_A^\alpha(t), \underline{x}_B^\alpha(t)\big), \\ (\overline{x}_B^\alpha)' = \overline{f}_B^\alpha\big(t, \overline{x}_A^\alpha(t), \overline{x}_B^\alpha(t)\big), \\ \underline{x}_A^\alpha(t_0) = \underline{x}_{A0}^\alpha, \\ \overline{x}_A^\alpha(t_0) = \overline{x}_{A0}^\alpha, \\ \underline{x}_B^\alpha(t_0) = \underline{x}_{B0}^\alpha, \\ \overline{x}_B^\alpha(t_0) = \overline{x}_{B0}^\alpha, \end{cases} \tag{5.9} \]

where

\[ [f(t, x)]^\alpha = \Big( \underline{f}_A^\alpha\big(t, \underline{x}_A^\alpha(t), \underline{x}_B^\alpha(t)\big),\ \underline{f}_B^\alpha\big(t, \underline{x}_A^\alpha(t), \underline{x}_B^\alpha(t)\big),\ \overline{f}_A^\alpha\big(t, \overline{x}_A^\alpha(t), \overline{x}_B^\alpha(t)\big),\ \overline{f}_B^\alpha\big(t, \overline{x}_A^\alpha(t), \overline{x}_B^\alpha(t)\big) \Big). \tag{5.10} \]

Theorem 5.1 (Characterization Theorem) Let $x_0 \in \mathbb{R}_z$ and $t \in [t_0, T]$, and let $f : [t_0, T] \times \mathbb{R}_z \to \mathbb{R}_z$ be a continuous Z-process such that:

(1) 
\[ [f(t, x)]^\alpha = [f_A(t, x_A(t), x_B(t)), f_B(t, x_A(t), x_B(t))]^\alpha = \Big( \underline{f}_A^\alpha\big(t, \underline{x}_A^\alpha(t), \underline{x}_B^\alpha(t)\big),\ \underline{f}_B^\alpha\big(t, \underline{x}_A^\alpha(t), \underline{x}_B^\alpha(t)\big),\ \overline{f}_A^\alpha\big(t, \overline{x}_A^\alpha(t), \overline{x}_B^\alpha(t)\big),\ \overline{f}_B^\alpha\big(t, \overline{x}_A^\alpha(t), \overline{x}_B^\alpha(t)\big) \Big); \tag{5.11} \]

(2) the boundary functions $\underline{f}_A^\alpha$, $\overline{f}_A^\alpha$, $\underline{f}_B^\alpha$ and $\overline{f}_B^\alpha$ are equicontinuous and uniformly bounded on any bounded set;

(3) there exists $L > 0$ such that, for each $g \in \{\underline{f}_A^\alpha, \overline{f}_A^\alpha, \underline{f}_B^\alpha, \overline{f}_B^\alpha\}$ and each $\alpha \in [0, 1]$,

\[ \big| g(t, (x_{1A}, x_{1B}), (y_{1A}, y_{1B})) - g(t, (x_{2A}, x_{2B}), (y_{2A}, y_{2B})) \big| \le L \max\big\{ |(x_{1A}, x_{1B}) - (x_{2A}, x_{2B})|,\ |(y_{1A}, y_{1B}) - (y_{2A}, y_{2B})| \big\}. \]

Then the ZIVP (5.6) and the system of ordinary differential equations (5.9) are equivalent.

Proof The equicontinuity of $\underline{f}_A^\alpha$, $\overline{f}_A^\alpha$, $\underline{f}_B^\alpha$ and $\overline{f}_B^\alpha$ implies the continuity of the function $f$. Also, the Lipschitz properties in (3) ensure that $f$ satisfies a Lipschitz property as follows:

\[ \sup_{\alpha \in [0,1]} \max_{g} \big| g(t, (x_{1A}, x_{1B}), (y_{1A}, y_{1B})) - g(t, (x_{2A}, x_{2B}), (y_{2A}, y_{2B})) \big| \le L \sup_{\alpha \in [0,1]} \max\big\{ |\underline{y}_A^\alpha - \underline{x}_A^\alpha|,\ |\overline{y}_A^\alpha - \overline{x}_A^\alpha|,\ |\underline{y}_B^\alpha - \underline{x}_B^\alpha|,\ |\overline{y}_B^\alpha - \overline{x}_B^\alpha| \big\}, \]

where the maximum on the left is taken over $g \in \{\underline{f}_A^\alpha, \overline{f}_A^\alpha, \underline{f}_B^\alpha, \overline{f}_B^\alpha\}$; that is,

\[ D_z(f(t, x), f(t, y)) \le L\, D_z(x, y). \]

By the continuity of $f$, this last Lipschitz condition and the boundedness condition in (2), it follows that the ZIVP (5.6) has a unique solution. The solution of problem (5.6) is (i)-(i)-gH-differentiable and so, by the previous theorem, the functions $\underline{x}_A^\alpha$, $\overline{x}_A^\alpha$, $\underline{x}_B^\alpha$ and $\overline{x}_B^\alpha$ are differentiable. As a conclusion, $\big( \underline{x}_A^\alpha(t), \overline{x}_A^\alpha(t), \underline{x}_B^\alpha(t), \overline{x}_B^\alpha(t) \big)$ is a solution of (5.9). Note that these relations also hold for the other types of gH-differentiability. In the following theorem, we show that the converse result holds as well, so that the ZIVP (5.6) is equivalent to the system (5.9).

Theorem 5.2 The ZIVP (5.6) has a unique Z-process solution.

Proof Based on the previous theorem, the ZIVP (5.6), relation (5.8) and the system of crisp differential equations (5.9) are equivalent. On the other hand, each of the fuzzy initial value problems in (5.8) has a unique solution. Therefore, the ZIVP (5.6) has a unique solution (Figs. 3, 4 and 5).


Fig. 3 Graph for x A (t) of growth model

Fig. 4 Graph for x B (t) of growth model

5.2 Numerical Examples

(Economics Example) Suppose that the high saved money of an employee in a country is significantly over 1000 dollars per month, very likely, and that it is deposited in a bank account. Let $x(t)$ be the amount of money in the bank account at time $t$, given in years, and suppose that

\[ x'(t) = 1300 + 0.04\,x(t) - 50\,t, \tag{5.12} \]

where $x(t_0) :=$ (high saved money, significantly over 1000 dollars, very likely). Here $x_{0A} :=$ (significantly over 1000 dollars) is the triangular fuzzy number $(999, 1000, 1001)$, and $x_{0B} :=$ (very likely) is represented by the normal probability density function

\[ N(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right), \qquad \mu = 1000,\ \sigma = 1. \]

Fig. 5 Graph for x(t) of growth model

The term $0.04\,x(t)$ reflects the rate of growth due to interest; for instance, the interest may be nominally 4% per year compounded continuously. The term 1300 reflects deposits into the account at a constant rate of 1300 dollars per year. The term $50t$ accounts for the decrease of money due to withdrawals: the rate at which money is being withdrawn increases with time. This ZIVP also has a unique solution, and we have (Fig. 6)

Fig. 6 Graph of the Example (5.2) for $x_A(t)$


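\[ [\underline{x}_A^\alpha(t), \overline{x}_A^\alpha(t)] = \big[ -1250 + 2249 e^{0.04t} + e^{0.04t}\alpha + 1250t,\ -1250 + 2251 e^{0.04t} - e^{0.04t}\alpha + 1250t \big] \]

and

\[ [\underline{x}_B^\alpha(t), \overline{x}_B^\alpha(t)] = \big[ -1250 + 2247 e^{0.04t} + e^{0.04t}\alpha + 1250t,\ -1250 + 2253 e^{0.04t} - e^{0.04t}\alpha + 1250t \big], \]

where $\alpha \in [0, 1]$.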
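The level-set formulas above all have the form $x(t) = -1250 + 1250t + C e^{0.04t}$, with $C$ fixed by the initial value (for the $A$ part, $C = 2249 + \alpha$ and $C = 2251 - \alpha$, assuming $t_0 = 0$). The Python sketch below checks numerically that this form satisfies (5.12) and reproduces the endpoints $999 + \alpha$ and $1001 - \alpha$ at $t = 0$; the function names and the finite-difference step are illustrative choices.

```python
import math


def x_closed_form(t, c):
    """Candidate solution x(t) = -1250 + 1250 t + c * exp(0.04 t)."""
    return -1250.0 + 1250.0 * t + c * math.exp(0.04 * t)


def ode_residual(t, c, h=1e-5):
    """x'(t) - (1300 + 0.04 x(t) - 50 t), with x'(t) from a central difference."""
    derivative = (x_closed_form(t + h, c) - x_closed_form(t - h, c)) / (2.0 * h)
    return derivative - (1300.0 + 0.04 * x_closed_form(t, c) - 50.0 * t)


def a_part_level_set(t, alpha):
    """Lower and upper branches of the alpha-level set of x_A(t)."""
    return (x_closed_form(t, 2249.0 + alpha), x_closed_form(t, 2251.0 - alpha))


for alpha in (0.0, 0.5, 1.0):
    lower0, upper0 = a_part_level_set(0.0, alpha)
    print(alpha, lower0, upper0, ode_residual(2.0, 2249.0 + alpha))
```

The printed residuals should be close to zero up to finite-difference error, and at $\alpha = 1$ the two branches coincide at the core value 1000.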

6 Conclusions

The issue of information reliability is of fundamental importance in planning, decision-making, formulation of algorithms and management of information. Reliability is intrinsically complex, an issue that does not lend itself to rigorous formal analysis. In this chapter, the Z-number and the algebraic operations on it were first defined. Then the parametric form of Z-numbers was introduced. Subsequently, the ranking of Z-numbers was discussed and some ranking methods were studied. Applications of Z-numbers in different fields were also studied: one is a Z-number-based regression, and another is the formulation of differential equations based on Z-numbers.

References

1. R.A. Aliev, O.H. Huseynov, R.R. Aliyev, A.A. Alizadeh, The Arithmetic of Z-Numbers (2015)
2. A.S.A. Bakar, A. Gegov, Multi-layer decision methodology for ranking Z-numbers. Int. J. Comput. Intell. Syst. 8, 395–406 (2015)
3. S. Ezadi, T. Allahviranloo, New multi-layer method for Z-number ranking using hyperbolic tangent function and convex combination. Intell. Autom. Soft Comput. (2017)
4. S. Ezadi, T. Allahviranloo, Two new methods for ranking of Z-numbers based on sigmoid function and sign method. Int. J. Intell. Syst. 1–12 (2018)
5. S. Ezadi, T. Allahviranloo, Numerical solution of linear regression based on Z-numbers by improved neural network. Intell. Autom. Soft Comput. (2017)
6. M. Hukuhara, Intégration des applications mesurables dont la valeur est un compact convexe. Funkcialaj Ekvacioj 10, 205–223 (1967)
7. W. Jiang, Ch. Xie, Y. Luo, Y. Tang, Ranking Z-numbers with an improved ranking method for generalized fuzzy numbers. J. Intell. Fuzzy Syst. 32(3), 1931–1943 (2017)
8. B. Kang, D. Wei, Y. Li, Y. Deng, Decision making using Z-numbers under uncertain environment. J. Comput. Inf. Syst. 8, 2807–2814 (2012)
9. D.C. Liu, J. Nocedal, On the limited memory BFGS method for large scale optimization. Math. Program. 45(3), 503–528 (1989)
10. D. Mohamad, S.A. Shaharani, N.H. Kamis, A Z-number-based decision making procedure with ranking fuzzy numbers method, in International Conference on Quantitative Sciences and Its Applications, pp. 160–166 (2014)
11. S. Pirmuhammadi, T. Allahviranloo, M. Keshavarz, The parametric form of Z-number and its application in Z-number initial value problem. Int. J. Intell. Syst. 32(10), 1030–1061 (2017)
12. L.A. Zadeh, A note on Z-numbers. Inf. Sci. 181, 2923–2932 (2011)

Fuzzy Normed Linear Spaces

Sorin Nădăban, Simona Dzitac and Ioan Dzitac

Abstract In this chapter we present different concepts of fuzzy norm on a linear space, introduced by various authors from different points of view. In 1984, Katsaras was the first to introduce the notion of fuzzy norm, this being of Minkowski type, associated with an absolutely convex and absorbing fuzzy set. In 1992, Felbin introduced another idea of fuzzy norm, by assigning a fuzzy real number to each element of the linear space. Following Cheng and Mordeson, in 2003 Bag and Samanta proposed another concept of fuzzy norm, which has proven to be the most adequate, even if it can still be improved, simplified or generalized. In this context, this chapter contains the results obtained by Nădăban and Dzitac in the paper “Atomic decomposition of fuzzy normed linear spaces for wavelet applications”. We also note that a new concept of fuzzy norm was introduced by Saadati and Vaezpour in 2005. The concept of fuzzy norm was generalized to continuous t-norms by Goleţ in 2010. Recently, Alegre and Romaguera proposed the term fuzzy quasi-norm and Nădăban introduced the notion of fuzzy pseudo-norm.

Keywords Fuzzy norm · Fuzzy normed linear spaces · Fuzzy continuous linear operators

S. Nădăban · I. Dzitac, Aurel Vlaicu University of Arad, Arad, Romania
S. Dzitac, University of Oradea, Oradea, Romania
I. Dzitac, Agora University of Oradea, Oradea, Romania

© Springer Nature Switzerland AG 2020. S. N. Shahbazova et al. (eds.), Recent Developments in Fuzzy Logic and Fuzzy Sets, Studies in Fuzziness and Soft Computing 391, https://doi.org/10.1007/978-3-030-38893-5_8


S. N˘ad˘aban et al.

1 Introduction

In [42] Zadeh said about the mathematical rigour of fuzzy logic theory: “Fuzzy logic is not fuzzy. Basically, fuzzy logic is a precise logic of imprecision and approximate reasoning. More specifically, fuzzy logic may be viewed as an attempt at formalization/mechanization of two remarkable human capabilities. First, the capability to converse, reason and make rational decisions in an environment of imprecision, uncertainty, incompleteness of information, conflicting information, partiality of truth and partiality of possibility—in short, in an environment of imperfect information. And second, the capability to perform a wide variety of physical and mental tasks without any measurements and any computations. In fact, one of the principal contributions of fuzzy logic—a contribution which is widely unrecognized—is its high power of precisiation. Fuzzy logic is much more than a logical system. It has many facets. The principal facets are: logical, fuzzy-set-theoretic, epistemic and relational. Most of the practical applications of fuzzy logic are associated with its relational facet”.

Fuzzy sets and fuzzy logic, introduced by Zadeh in his famous paper [41], today have many useful applications in a large variety of domains: computer science, artificial intelligence, robotics, engineering, economy and many more. Some general aspects of Zadeh’s contributions to the development of Soft Computing and Artificial Intelligence are presented in paper [13]. At the same time, many authors have tried to generalize many of the classical concepts of mathematics to the fuzzy context. The motivation for this approach comes from the fact that fuzzy theory is a powerful and suitable tool for describing the world that surrounds us, full of uncertainty, in a purely mathematical way. In 1977, Katsaras and Liu introduced in paper [21] the notion of fuzzy topological vector space, thus laying the foundations of what we can today call a new mathematical field, “Fuzzy Functional Analysis”.

The study of fuzzy topological vector spaces was continued by Katsaras in papers [22, 23]. Fuzzy topological vector spaces are fundamental structures in fuzzy functional analysis, but their degree of generality is too high. Thus, many results have been obtained on fuzzy normed linear spaces. In 1984, Katsaras was the first to introduce the notion of fuzzy norm (see [23]), this being of Minkowski type, associated with an absolutely convex and absorbing fuzzy set. In 1992, Felbin introduced another idea of fuzzy norm (see [14]), by assigning a fuzzy real number to each element of the linear space. Following Cheng and Mordeson (see [8]), in 2003 Bag and Samanta proposed another concept of fuzzy norm (see [2]), which has proven to be the most adequate, even if it can still be improved, simplified or generalized. Thus, in paper [33] Nădăban and Dzitac obtained decomposition theorems for fuzzy norms into a family of semi-norms, in a more general setting. We also note that a new concept of fuzzy norm was introduced in paper [35] by Saadati and Vaezpour in 2005. The concept of fuzzy norm was generalized to continuous t-norms by Goleţ in 2010 (see [16]). Recently, Alegre and Romaguera proposed the term fuzzy quasi-norm (see [1]) and Nădăban introduced the notion of fuzzy pseudo-norm (see [30]).


Development has continued. One important problem concerning fuzzy normed linear spaces is the study of fuzzy bounded linear operators, as well as the study of fuzzy continuous mappings in fuzzy normed linear spaces. For this purpose we refer to the papers of Bag and Samanta [3], Sadeqi and Kia [37] and Nădăban [29].

2 Preliminaries

Let $X$ be a non-empty set.

Definition 1 [41] A fuzzy set in $X$ is a mapping $\mu : X \to [0, 1]$. We denote by $\mathcal{F}(X)$ the family of all fuzzy sets in $X$.

Definition 2 [41] The union of the fuzzy sets $\{\mu_i\}_{i \in I}$ is defined by

\[ \Big( \bigcup_{i \in I} \mu_i \Big)(x) = \sup\{\mu_i(x) : i \in I\}. \]

The intersection of the fuzzy sets $\{\mu_i\}_{i \in I}$ is defined by

\[ \Big( \bigcap_{i \in I} \mu_i \Big)(x) = \inf\{\mu_i(x) : i \in I\}. \]

Definition 3 [41] For $\alpha \in (0, 1]$ and $\mu \in \mathcal{F}(X)$, the $\alpha$-level set $[\mu]_\alpha$ is defined by $[\mu]_\alpha := \{x \in X : \mu(x) \ge \alpha\}$. The support of $\mu$ is defined by $\operatorname{supp} \mu := \{x \in X : \mu(x) > 0\}$.

Definition 4 [38] A triangular norm (t-norm for short) is a function $* : [0, 1] \times [0, 1] \to [0, 1]$ such that:
1. $a * b = b * a$, $(\forall) a, b \in [0, 1]$ (commutativity);
2. $a * 1 = a$, $(\forall) a \in [0, 1]$ (boundary condition);
3. $(a * b) * c = a * (b * c)$, $(\forall) a, b, c \in [0, 1]$ (associativity);
4. if $a \le c$ and $b \le d$, with $a, b, c, d \in [0, 1]$, then $a * b \le c * d$ (monotonicity).

Example 1 The following are the most important t-norms: $\wedge$, $\cdot$, $*_L$, which are defined by $a \wedge b = \min\{a, b\}$, $a \cdot b = ab$ (usual multiplication in $[0, 1]$) and $a *_L b = \max\{a + b - 1, 0\}$ (the Lukasiewicz t-norm).

Definition 5 [25] The triple $(X, M, *)$ is said to be a fuzzy metric space if $X$ is an arbitrary set, $*$ is a continuous t-norm and $M$ is a fuzzy metric, i.e. a fuzzy set in $X \times X \times [0, \infty)$ which satisfies the following conditions:
(M1) $M(x, y, 0) = 0$, $(\forall) x, y \in X$;
(M2) $[M(x, y, t) = 1, (\forall) t > 0]$ if and only if $x = y$;
(M3) $M(x, y, t) = M(y, x, t)$, $(\forall) x, y \in X$, $(\forall) t \ge 0$;


(M4) $M(x, z, t + s) \ge M(x, y, t) * M(y, z, s)$, $(\forall) x, y, z \in X$, $(\forall) t, s \ge 0$;
(M5) $(\forall) x, y \in X$, $M(x, y, \cdot) : [0, \infty) \to [0, 1]$ is left continuous and $\lim_{t \to \infty} M(x, y, t) = 1$.
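The three t-norms named in Example 1 can be spot-checked against the four axioms of Definition 4 on a finite grid. The Python sketch below is only a numerical sanity check, not a proof; the function names, the grid and the tolerance are illustrative choices, and monotonicity is tested in one argument (which, together with commutativity, gives the two-argument form).

```python
import itertools


def t_min(a, b):
    """Minimum t-norm."""
    return min(a, b)


def t_prod(a, b):
    """Product t-norm."""
    return a * b


def t_luk(a, b):
    """Lukasiewicz t-norm."""
    return max(a + b - 1.0, 0.0)


def satisfies_axioms(t_norm, grid, tol=1e-12):
    """Spot-check commutativity, boundary condition, associativity and monotonicity."""
    for a, b, c in itertools.product(grid, repeat=3):
        if abs(t_norm(a, b) - t_norm(b, a)) > tol:
            return False
        if abs(t_norm(a, 1.0) - a) > tol:
            return False
        if abs(t_norm(t_norm(a, b), c) - t_norm(a, t_norm(b, c))) > tol:
            return False
        if a <= c and t_norm(a, b) > t_norm(c, b) + tol:
            return False
    return True


grid = [i / 10 for i in range(11)]
for name, t_norm in (("min", t_min), ("product", t_prod), ("Lukasiewicz", t_luk)):
    print(name, satisfies_axioms(t_norm, grid))
```

Passing the maximum function instead fails the boundary condition, since $\max\{a, 1\} = 1 \ne a$ for $a < 1$.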

Example 2 [15] Let $(X, d)$ be a metric space. Let $M_d : X \times X \times [0, \infty) \to [0, 1]$,

\[ M_d(x, y, t) = \begin{cases} \dfrac{t}{t + d(x, y)} & \text{if } t > 0, \\ 0 & \text{if } t = 0. \end{cases} \]

Then $(X, M_d, \wedge)$ is a fuzzy metric space. $M_d$ is called the standard fuzzy metric.

Theorem 1 [15] Let $(X, M, *)$ be a fuzzy metric space. For $x \in X$, $r \in (0, 1)$, $t > 0$ we define the open ball

\[ B(x, r, t) := \{y \in X : M(x, y, t) > 1 - r\}. \]

Let $\mathcal{T}_M := \{T \subset X : x \in T \text{ iff } (\exists) t > 0, r \in (0, 1) : B(x, r, t) \subseteq T\}$. Then $\mathcal{T}_M$ is a topology on $X$.

Theorem 2 [17] A topological vector space $X$ is fuzzy metrizable if and only if it is metrizable.

Our basic reference for fuzzy metric spaces and related structures is [18], for t-norms it is [24], and for other important spaces in fuzzy mathematics it is [12]. For the concept of fuzzy real number, and for arithmetic operations and ordering on the set of all fuzzy real numbers, we refer the reader to the papers [9–12, 14, 19, 20, 27, 39].

Definition 6 A fuzzy real number is a fuzzy set $x$ in $\mathbb{R}$ such that:
1. $x$ is convex, i.e. $x(t) \ge \min\{x(s), x(r)\}$, for $s \le t \le r$;
2. $x$ is normal, i.e. $(\exists) t_0 \in \mathbb{R} : x(t_0) = 1$;
3. $x$ is upper semicontinuous, i.e. $(\forall) t \in \mathbb{R}$, $(\forall) \alpha \in (0, 1] : x(t) < \alpha$, $(\exists) \delta > 0$ such that $|s - t| < \delta \Rightarrow x(s) < \alpha$.

We denote by $\mathcal{F}$ the set of all fuzzy real numbers.

Remark 1 Let $x \in \mathcal{F}$. For all $\alpha \in (0, 1]$, the $\alpha$-level sets $[x]_\alpha = \{t : x(t) \ge \alpha\}$ are closed intervals $[x_\alpha^-, x_\alpha^+]$.

Remark 2 $\mathbb{R}$ can be embedded in $\mathcal{F}$. If $r \in \mathbb{R}$, then the fuzzy real number $\overline{r}$ is defined by

\[ \overline{r}(t) = \begin{cases} 1 & \text{if } t = r, \\ 0 & \text{if } t \ne r. \end{cases} \]
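The standard fuzzy metric of Example 2 and the open balls of Theorem 1 can be illustrated with a short Python sketch. It assumes $X = \mathbb{R}$ with $d(x, y) = |x - y|$ and the minimum t-norm; the helper names and the sample points are hypothetical, and the loop is a finite spot-check of (M4), not a proof.

```python
import itertools


def standard_fuzzy_metric(d):
    """Build M_d(x, y, t) = t / (t + d(x, y)) for t > 0 and 0 for t = 0."""
    def m(x, y, t):
        if t < 0:
            raise ValueError("t must be non-negative")
        if t == 0:
            return 0.0
        return t / (t + d(x, y))
    return m


def in_open_ball(m, center, r, t, y):
    """Membership test for B(center, r, t) = {y : M(center, y, t) > 1 - r}."""
    return m(center, y, t) > 1.0 - r


def satisfies_m4(m, points, times, tol=1e-12):
    """Spot-check M(x, z, t + s) >= min(M(x, y, t), M(y, z, s)) on finite samples."""
    for x, y, z in itertools.product(points, repeat=3):
        for t, s in itertools.product(times, repeat=2):
            if m(x, z, t + s) + tol < min(m(x, y, t), m(y, z, s)):
                return False
    return True


m_abs = standard_fuzzy_metric(lambda x, y: abs(x - y))
print(in_open_ball(m_abs, 0.0, 0.5, 1.0, 0.9), in_open_ball(m_abs, 0.0, 0.5, 1.0, 1.1))
print(satisfies_m4(m_abs, points=[-2.0, -0.5, 0.0, 1.0, 3.0], times=[0.0, 0.1, 1.0, 5.0]))
```

With $r = 0.5$ and $t = 1$, the ball around 0 is the set of $y$ with $1/(1 + |y|) > 0.5$, that is $|y| < 1$, which matches the two membership tests printed above.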


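Definition 7 A fuzzy real number $x$ is called non-negative if $x(t) = 0$, $(\forall) t < 0$. The set of all non-negative fuzzy real numbers will be denoted by $\mathcal{F}^+$.

Definition 8 [27] The arithmetic operations $\oplus$, $\ominus$, $\odot$, $\oslash$ on $\mathcal{F}$ are defined by:

\[ (x \oplus y)(t) = \bigvee_{s \in \mathbb{R}} \min\{x(s), y(t - s)\}, \quad (\forall) t \in \mathbb{R}, \]

\[ (x \ominus y)(t) = \bigvee_{s \in \mathbb{R}} \min\{x(s), y(s - t)\}, \quad (\forall) t \in \mathbb{R}, \]

\[ (x \odot y)(t) = \bigvee_{s \in \mathbb{R}^*} \min\{x(s), y(t/s)\}, \quad (\forall) t \in \mathbb{R}, \]

\[ (x \oslash y)(t) = \bigvee_{s \in \mathbb{R}} \min\{x(ts), y(s)\}, \quad (\forall) t \in \mathbb{R}. \]

Remark 3 [10] If $x \in \mathcal{F}$ and $r \in (0, \infty)$, then $(r \odot x)(t) = (rx)(t) = x(t/r)$ and $0 \odot x = 0x = 0$.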
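The sup-min form of the addition in Definition 8 can be illustrated on finitely supported membership functions, where the supremum becomes a maximum over finitely many pairs. The Python sketch below uses dictionaries from integer points to membership degrees as a discrete stand-in; such grid functions are not fuzzy real numbers in the sense of Definition 6, and the function name and sample data are hypothetical.

```python
def sup_min_add(x, y):
    """(x + y)(t) = max over s of min(x(s), y(t - s)) for finitely supported x and y."""
    result = {}
    for s, x_s in x.items():
        for u, y_u in y.items():
            t = s + u  # every pair (s, u) with s + u = t contributes to (x + y)(t)
            result[t] = max(result.get(t, 0.0), min(x_s, y_u))
    return result


# Discrete triangular shapes peaking at 2 and at 1 (illustrative data only).
x = {1: 0.5, 2: 1.0, 3: 0.5}
y = {0: 0.5, 1: 1.0, 2: 0.5}
print(sup_min_add(x, y))

# Crisp numbers embedded as singletons, in the spirit of Remark 2: 2 + 3 = 5.
print(sup_min_add({2: 1.0}, {3: 1.0}))
```

The peak of the sum sits at $2 + 1 = 3$, and adding two embedded crisp numbers returns the embedded crisp sum, as expected from the definition.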

3 Katsaras’s Type Fuzzy Norm

Let $X$ be a vector space over a field $\mathbb{K}$, where $\mathbb{K}$ is $\mathbb{R}$ or $\mathbb{C}$.

Definition 9 [21] A fuzzy set $\rho$ in $X$ is said to be:
1. convex if $t\rho + (1 - t)\rho \subseteq \rho$, $(\forall) t \in [0, 1]$;
2. balanced if $\lambda\rho \subseteq \rho$, $(\forall) \lambda \in \mathbb{K}$, $|\lambda| \le 1$;
3. absorbing if $\bigcup_{t > 0} t\rho = 1$;
4. absolutely convex if it is both convex and balanced.

Proposition 1 [21] Let $\rho$ be a fuzzy set in $X$. Then:
1. $\rho$ is convex if and only if $\rho(tx + (1 - t)y) \ge \rho(x) \wedge \rho(y)$, $(\forall) x, y \in X$, $(\forall) t \in [0, 1]$;
2. $\rho$ is balanced if and only if $\rho(\lambda x) \ge \rho(x)$, $(\forall) x \in X$, $(\forall) \lambda \in \mathbb{K}$, $|\lambda| \le 1$.

Definition 10 [23] A Katsaras fuzzy semi-norm on $X$ is a fuzzy set $\rho$ in $X$ which is absolutely convex and absorbing.

Definition 11 [33] A Katsaras fuzzy semi-norm $\rho$ on $X$ will be called a Katsaras fuzzy norm if

\[ \rho\left( \frac{x}{t} \right) = 1,\ (\forall) t > 0 \ \Rightarrow\ x = 0. \]

Remark 4 (a) We note that

\[ \left[ \rho\left( \frac{x}{t} \right) = 1,\ (\forall) t > 0 \ \Rightarrow\ x = 0 \right] \ \Leftrightarrow\ \left[ \inf_{t > 0} \rho\left( \frac{x}{t} \right) < 1, \text{ for } x \ne 0 \right]. \]

(b) The condition $\rho(x/t) = 1$, $(\forall) t > 0 \Rightarrow x = 0$ is much weaker than that imposed by Katsaras [23]:

\[ \inf_{t > 0} \rho\left( \frac{x}{t} \right) = 0, \text{ for } x \ne 0. \]

Katsaras’s fuzzy norm under triangular norms is redefined in paper [32]. Let $*$ be a continuous t-norm.

Definition 12 [32] A fuzzy set $\mu$ in $X$ is called $*$-convex if $\mu(tx + (1 - t)y) \ge \mu(x) * \mu(y)$, $(\forall) x, y \in X$, $(\forall) t \in [0, 1]$.

Definition 13 [32] A fuzzy set $\rho$ in $X$ which is $*$-convex, balanced and absorbing will be called a Katsaras’s type fuzzy semi-norm. If in addition

\[ \rho\left( \frac{x}{t} \right) = 1,\ (\forall) t > 0 \ \Rightarrow\ x = 0, \]

then $\rho$ will be called a Katsaras’s type fuzzy norm.

4 Felbin’s Type Fuzzy Norm

Definition 14 [14, 39] Let $X$ be a vector space over $\mathbb{R}$ and let $\theta$ be the origin of $X$. Let $\|\cdot\| : X \to \mathcal{F}^+$ and let the mappings $L, R : [0, 1] \times [0, 1] \to [0, 1]$ be symmetric, nondecreasing in both arguments and satisfying $L(0, 0) = 0$ and $R(1, 1) = 1$. We write $[\|x\|]_\alpha = [\|x\|_\alpha^-, \|x\|_\alpha^+]$, for $x \in X$. The quadruple $(X, \|\cdot\|, L, R)$ is called a fuzzy normed linear space, and $\|\cdot\|$ a fuzzy norm, if
1. $\|x\| = 0$ if and only if $x = \theta$;
2. $\|rx\| = |r|\,\|x\|$, $(\forall) x \in X$, $r \in \mathbb{R}$;
3. for all $x, y \in X$,
   a. if $s \le \|x\|_1^-$, $t \le \|y\|_1^-$ and $s + t \le \|x + y\|_1^-$, then $\|x + y\|(s + t) \ge L(\|x\|(s), \|y\|(t))$,
   b. if $s \ge \|x\|_1^-$, $t \ge \|y\|_1^-$ and $s + t \ge \|x + y\|_1^-$, then $\|x + y\|(s + t) \le R(\|x\|(s), \|y\|(t))$.


5 Bag-Samanta’s Type Fuzzy Norm

Let $X$ be a vector space over a field $\mathbb{K}$, where $\mathbb{K}$ is $\mathbb{R}$ or $\mathbb{C}$. Let $*$ be a continuous t-norm. Firstly, we note that for some results it is enough that the t-norm $*$ satisfies $\sup_{x \in (0, 1)} x * x = 1$.

Definition 15 [33] A fuzzy set $N$ in $X \times [0, \infty)$ is called a fuzzy norm on $X$ if it satisfies:
(N1) $N(x, 0) = 0$, $(\forall) x \in X$;
(N2) $[N(x, t) = 1, (\forall) t > 0]$ if and only if $x = 0$;
(N3) $N(\lambda x, t) = N\left(x, \frac{t}{|\lambda|}\right)$, $(\forall) x \in X$, $(\forall) t \ge 0$, $(\forall) \lambda \in \mathbb{K}^*$;
(N4) $N(x + y, t + s) \ge N(x, t) * N(y, s)$, $(\forall) x, y \in X$, $(\forall) t, s \ge 0$;
(N5) $(\forall) x \in X$, $N(x, \cdot)$ is left continuous and $\lim_{t \to \infty} N(x, t) = 1$.
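Conditions (N3) and (N4) can be spot-checked numerically for a concrete candidate. The Python sketch below assumes $X = \mathbb{R}$, the minimum t-norm, and $N(x, t) = t/(t + |x|)$ for $t > 0$ with $N(x, 0) = 0$, which is the standard fuzzy metric of Example 2 evaluated at $(x, 0, t)$. This candidate and the sample values are assumptions chosen for illustration, and a finite check is not a proof.

```python
import itertools


def fuzzy_norm(x, t):
    """Candidate N(x, t) = t / (t + |x|) for t > 0 and N(x, 0) = 0, on X = R."""
    if t < 0:
        raise ValueError("t must be non-negative")
    if t == 0:
        return 0.0
    return t / (t + abs(x))


def check_n3_n4(points, times, scalars, tol=1e-12):
    """Spot-check (N3) and (N4) with the minimum t-norm on finite samples."""
    for x, t, lam in itertools.product(points, times, scalars):
        if abs(fuzzy_norm(lam * x, t) - fuzzy_norm(x, t / abs(lam))) > tol:
            return False
    for x, y in itertools.product(points, repeat=2):
        for t, s in itertools.product(times, repeat=2):
            if fuzzy_norm(x + y, t + s) + tol < min(fuzzy_norm(x, t), fuzzy_norm(y, s)):
                return False
    return True


points = [-3.0, -1.0, 0.0, 0.5, 2.0]
times = [0.0, 0.25, 1.0, 4.0]
scalars = [-2.0, -0.5, 0.5, 3.0]  # nonzero scalars only, as required by (N3)
print(check_n3_n4(points, times, scalars))
```

For this candidate, $N(x - y, t)$ coincides with $M_d(x, y, t)$ for $d(x, y) = |x - y|$, which is the kind of induced fuzzy metric discussed next.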

The triple (X , N , ∗) will be called fuzzy normed linear space. Remark 5 (a) Bag and Samanta [2, 3] gave a similar definition for ∗ = ∧, but in order to obtain some important results they assume that the fuzzy norm satisfies also the following conditions: (N6) N (x, t) > 0, (∀)t > 0 ⇒ x = 0 ; (N7) (∀)x = 0, N (x, ·) is a continuous function and strictly increasing on the subset {t : 0 < N (x, t) < 1} of R. The results obtained by Bag and Samanta can be found in this more general settings (see [33]). (b) I. Gole¸t [16], C. Alegre and S. Romaguera [1] gave also this definition in the context of real vector spaces. Theorem 3 [33] If (X , N , ∗) is a fuzzy normed linear space, then M : X × X × [0, ∞) → [0, 1], M (x, y, t) = N (x − y, t) is a fuzzy metric on X , which is called the fuzzy metric induced by the fuzzy norm N. Moreover, we have: 1. M is a translation-invariant fuzzy metric; t 2. M (λx, λy, t) = M x, y, |λ| , (∀)x ∈ X , (∀)t ≥ 0, (∀)λ ∈ K∗ . Proof (M1) M(x,y,0)=N(x-y,0)=0; (M2) [M (x, y, t) = 1, (∀)t > 0] ⇔ [N (x − y, t) = 1, (∀)t > 0] ⇔ x − y = 0 ⇔ x = y; (M3) M (x, y, t) = N (x − y, t) = N ((−1)(y − x), t) =


S. Nădăban et al.

$$= N\left(y - x, \frac{t}{|-1|}\right) = N(y - x, t) = M(y, x, t);$$

(M4) $M(x, z, t + s) = N(x - z, t + s) = N((x - y) + (y - z), t + s) \ge N(x - y, t) * N(y - z, s) = M(x, y, t) * M(y, z, s)$;
(M5) It is obvious.
Now we verify properties (1), (2).
(1) $M(x + z, y + z, t) = N((x + z) - (y + z), t) = N(x - y, t) = M(x, y, t)$;
(2) $M(\lambda x, \lambda y, t) = N(\lambda x - \lambda y, t) = N\left(x - y, \frac{t}{|\lambda|}\right) = M\left(x, y, \frac{t}{|\lambda|}\right)$.

Corollary 1 [33] Let $(X, N, *)$ be a fuzzy normed linear space. For $x \in X$, $r \in (0, 1)$, $t > 0$ we define the open ball $B(x, r, t) := \{y \in X : N(x - y, t) > 1 - r\}$. Then
$$\mathcal{T}_N := \{T \subset X : x \in T \text{ iff } (\exists) t > 0, r \in (0, 1) : B(x, r, t) \subseteq T\}$$
is a topology on $X$. Moreover $(X, \mathcal{T}_N)$ is Hausdorff.

Proof The first part results from the previous theorem. Let $x, y \in X$, $x \neq y$. Using (N2), there exists $t > 0$ such that $N(x - y, t) < 1$. Let $r = N(x - y, t)$. As $\sup_{x \in (0,1)} x * x = 1$, we can find $r_1 \in (0, 1)$ such that $r_1 * r_1 > r$. We have
$$B\left(x, 1 - r_1, \frac{t}{2}\right) \cap B\left(y, 1 - r_1, \frac{t}{2}\right) = \emptyset.$$
Indeed, if we suppose that there exists $z \in B\left(x, 1 - r_1, \frac{t}{2}\right) \cap B\left(y, 1 - r_1, \frac{t}{2}\right)$, we obtain that $N\left(x - z, \frac{t}{2}\right) > r_1$, $N\left(y - z, \frac{t}{2}\right) > r_1$. Thus
$$r = N(x - y, t) \ge N\left(x - z, \frac{t}{2}\right) * N\left(z - y, \frac{t}{2}\right) \ge r_1 * r_1 > r,$$
which is a contradiction.

Remark 6 The previous result was obtained by Sadeqi and Kia [37], in 2009, using (N7).

Theorem 4 [33] Let $(X, N, *)$ be a fuzzy normed linear space. Then $(X, \mathcal{T}_N)$ is a metrizable topological vector space.

Proof First we have to show that the mappings
(1) $(x, y) \mapsto x + y$
(2) $(\lambda, x) \mapsto \lambda \cdot x$
are continuous.
(1) Let $x_n \to x$, $y_n \to y$. We have


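$$N((x_n + y_n) - (x + y), t) \ge N\left(x_n - x, \frac{t}{2}\right) * N\left(y_n - y, \frac{t}{2}\right) \to 1.$$
Hence $x_n + y_n \to x + y$.
(2) Let $x_n \to x$, $\lambda_n \to \lambda$. We have
$$\begin{aligned}
N(\lambda_n x_n - \lambda x, t) &= N(\lambda_n (x_n - x) + x(\lambda_n - \lambda), t) \\
&\ge N\left(\lambda_n (x_n - x), \frac{t}{2}\right) * N\left(x(\lambda_n - \lambda), \frac{t}{2}\right) \\
&= N\left(x_n - x, \frac{t}{2|\lambda_n|}\right) * N\left(x, \frac{t}{2|\lambda_n - \lambda|}\right) \to 1.
\end{aligned}$$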

This implies that $\lambda_n x_n \to \lambda x$. Therefore $(X, \mathcal{T}_N)$ is a topological vector space. From Theorem 3 we have that $X$ is fuzzy metrizable. Theorem 2 tells us that $X$ is metrizable.

Theorem 5 Let $(X, N, \wedge)$ be a fuzzy normed linear space. Let
$$p_\alpha(x) := \inf\{t > 0 : N(x, t) > \alpha\}, \quad \alpha \in (0, 1).$$
Then
1. $P = \{p_\alpha\}_{\alpha \in (0,1)}$ is an ascending family of semi-norms on $X$;
2. There exists on $X$ a least fine topology, denoted by $\mathcal{T}_P$, compatible with the structure of linear space of $X$, with respect to which each semi-norm $p_\alpha$ is continuous;
3. The locally convex topology $\mathcal{T}_P$ is Hausdorff;
4. For $x \in X$, $s > 0$, $\alpha \in (0, 1)$, we have: $p_\alpha(x) < s$ if and only if $N(x, s) > \alpha$;
5. $\mathcal{T}_N = \mathcal{T}_P$;
6. $P = \{p_\alpha\}_{\alpha \in (0,1)}$ is right continuous, i.e. for any decreasing sequence $(\alpha_n)$ in $(0, 1)$, $\alpha_n \to \alpha \in (0, 1)$, we have $p_{\alpha_n}(x) \to p_\alpha(x)$, $(\forall) x \in X$.

Proof 1. As $N(0, t) = 1$, $(\forall) t > 0$, we obtain that $p_\alpha(0) = \inf\{t > 0 : N(0, t) > \alpha\} = 0$. We show that $p_\alpha(\lambda x) = |\lambda| p_\alpha(x)$, $(\forall) x \in X$, $(\forall) \lambda \in \mathbb{K}$. First we note that, for $\lambda = 0$, the previous equality is obvious. For $\lambda \neq 0$, we have
$$\begin{aligned}
p_\alpha(\lambda x) &= \inf\{t > 0 : N(\lambda x, t) > \alpha\} = \inf\left\{t > 0 : N\left(x, \frac{t}{|\lambda|}\right) > \alpha\right\} \\
&= \inf\left\{t|\lambda| > 0 : N\left(x, \frac{t|\lambda|}{|\lambda|}\right) > \alpha\right\} = |\lambda| \inf\{t > 0 : N(x, t) > \alpha\} = |\lambda| p_\alpha(x).
\end{aligned}$$
Thus


$$\begin{aligned}
p_\alpha(x) + p_\alpha(y) &= \inf\{t > 0 : N(x, t) > \alpha\} + \inf\{s > 0 : N(y, s) > \alpha\} \\
&= \inf\{t + s > 0 : N(x, t) > \alpha, N(y, s) > \alpha\} = \inf\{t + s > 0 : N(x, t) \wedge N(y, s) > \alpha\} \\
&\ge \inf\{t + s > 0 : N(x + y, t + s) > \alpha\} = p_\alpha(x + y).
\end{aligned}$$
It remains to be proven that $P = \{p_\alpha\}_{\alpha \in (0,1)}$ is an ascending family. Let $\alpha_1 \le \alpha_2$. Then $\{t > 0 : N(x, t) > \alpha_2\} \subseteq \{t > 0 : N(x, t) > \alpha_1\}$. Thus $\inf\{t > 0 : N(x, t) > \alpha_2\} \ge \inf\{t > 0 : N(x, t) > \alpha_1\}$, namely $p_{\alpha_2}(x) \ge p_{\alpha_1}(x)$, $(\forall) x \in X$.

2. Indeed, a fundamental system of neighborhoods of 0 is $C_P = \{B(p_\alpha, t) : \alpha \in (0, 1), t > 0\}$, where $B(p_\alpha, t) = \{x \in X : p_\alpha(x) < t\}$.

3. We need to show that the family of semi-norms $P$ is sufficient, i.e. $(\forall) x \in X$, $x \neq 0$, $(\exists) p_\alpha \in P$ such that $p_\alpha(x) \neq 0$. Let $x \in X$, $x \neq 0$. We suppose that $p_\alpha(x) = 0$, $(\forall) \alpha \in (0, 1)$. Then $\inf\{t > 0 : N(x, t) > \alpha\} = 0$, for all $\alpha \in (0, 1)$. Thus $N(x, t) > \alpha$, $(\forall) \alpha \in (0, 1)$, $(\forall) t > 0$. Hence $N(x, t) = 1$, $(\forall) t > 0$. Therefore, from (N2), we have $x = 0$, which is in contradiction to $x \neq 0$.

4. “⇒” We must show that $s \in \{t > 0 : N(x, t) > \alpha\}$. We suppose that $s \notin \{t > 0 : N(x, t) > \alpha\}$. Then there exists $t_0 \in \{t > 0 : N(x, t) > \alpha\}$ such that $t_0 < s$. (Contrary, $s \le t$, $(\forall) t \in \{t > 0 : N(x, t) > \alpha\}$. Hence $s \le \inf\{t > 0 : N(x, t) > \alpha\}$, i.e. $s \le p_\alpha(x)$, which is a contradiction.) As $t_0 \in \{t > 0 : N(x, t) > \alpha\}$, $t_0 < s$ and $N(x, \cdot)$ is nondecreasing, we obtain that $\alpha < N(x, t_0) \le N(x, s)$. Hence $N(x, s) > \alpha$, which leads to a contradiction.
“⇐” As $N(x, s) > \alpha$, we obtain that $s \in \{t > 0 : N(x, t) > \alpha\}$. Thus $p_\alpha(x) \le s$. We suppose that $p_\alpha(x) = s$. As $N(x, \cdot)$ is left continuous in $s$, we have $\lim_{t \to s, t < s} N(x, t) = N(x, s)$. Thus there exists $t_0 < s$ such that $N(x, t_0) > \alpha$. (Contrary, $N(x, t) \le \alpha$, for all $t < s$. Therefore $\lim_{t \to s, t < s} N(x, t) \le \alpha$. Hence $N(x, s) \le \alpha$, which is a contradiction.) But $t_0 < s$ and $N(x, t_0) > \alpha$ are in contradiction with the fact that $s = \inf\{t > 0 : N(x, t) > \alpha\}$. Hence $p_\alpha(x) \neq s$. Thus $p_\alpha(x) < s$.

5. In the topology $\mathcal{T}_N$ a fundamental system of neighborhoods of 0 is $S(0) = \{B(0, r, t) : r \in (0, 1), t > 0\}$. In the topology $\mathcal{T}_P$ a fundamental system of neighborhoods of 0 is $C_P = \{B(p_\alpha, t) : \alpha \in (0, 1), t > 0\}$, where $B(p_\alpha, t) = \{x \in X : p_\alpha(x) < t\}$. By item 4, $B(0, r, t) = \{x \in X : N(x, t) > 1 - r\} = \{x \in X : p_{1-r}(x) < t\} = B(p_{1-r}, t)$, so the two systems are identical. Thus $\mathcal{T}_N = \mathcal{T}_P$.

6.
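Let $x \in X$ and $(\alpha_n)$ be a decreasing sequence in $(0, 1)$, $\alpha_n \to \alpha \in (0, 1)$. Let $s > p_\alpha(x)$. Then $N(x, s) > \alpha$. As $(\alpha_n)$ is a decreasing sequence and $\alpha_n \to \alpha$, there exists $n_0 \in \mathbb{N}$ such that $\alpha_n < N(x, s)$, $(\forall) n \ge n_0$. Therefore $p_{\alpha_n}(x) < s$, $(\forall) n \ge n_0$. As the family is ascending, we also have $p_{\alpha_n}(x) \ge p_\alpha(x)$, and $s > p_\alpha(x)$ is arbitrary. Thus $p_{\alpha_n}(x) \to p_\alpha(x)$.

Theorem 6 [33] Let $\{q_\alpha\}_{\alpha \in (0,1)}$ be a sufficient and ascending family of semi-norms on the linear space $X$. Let $N : X \times [0, \infty) \to [0, 1]$ be defined by
$$N(x, t) = \begin{cases} \sup\{\alpha \in (0, 1) : q_\alpha(x) < t\} & \text{if } t > 0, \\ 0 & \text{if } t = 0. \end{cases}$$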


Then $(X, N, \wedge)$ is a fuzzy normed linear space.

Proof We note firstly that $N(x, \cdot)$ is nondecreasing. Indeed, for $t_1 < t_2$, we have $\{\alpha \in (0, 1) : q_\alpha(x) < t_1\} \subseteq \{\alpha \in (0, 1) : q_\alpha(x) < t_2\}$. Thus $\sup\{\alpha \in (0, 1) : q_\alpha(x) < t_1\} \le \sup\{\alpha \in (0, 1) : q_\alpha(x) < t_2\}$. Hence $N(x, t_1) \le N(x, t_2)$.

(N1) $N(x, 0) = 0$, $(\forall) x \in X$ is obvious.

(N2) If $x = 0$, then $q_\alpha(x) = 0$, $(\forall) \alpha \in (0, 1)$. Hence, for all $\alpha \in (0, 1)$, we have $q_\alpha(x) < t$, $(\forall) t > 0$. Thus $\sup\{\alpha \in (0, 1) : q_\alpha(x) < t\} = 1$, $(\forall) t > 0$. Therefore $N(x, t) = 1$, $(\forall) t > 0$. Conversely, if $N(x, t) = 1$, $(\forall) t > 0$, then $\sup\{\alpha \in (0, 1) : q_\alpha(x) < t\} = 1$, for all $t > 0$. Hence, for all $\alpha \in (0, 1)$, we have $q_\alpha(x) < t$, $(\forall) t > 0$. Thus, for all $\alpha \in (0, 1)$, we have $q_\alpha(x) = 0$. As the family of semi-norms $\{q_\alpha\}_{\alpha \in (0,1)}$ is sufficient, we obtain that $x = 0$.

(N3) If $t = 0$, then $N(\lambda x, t) = 0 = N\left(x, \frac{t}{|\lambda|}\right)$. For $t > 0$, we have
$$\begin{aligned}
N(\lambda x, t) &= \sup\{\alpha \in (0, 1) : q_\alpha(\lambda x) < t\} = \sup\{\alpha \in (0, 1) : |\lambda| q_\alpha(x) < t\} \\
&= \sup\left\{\alpha \in (0, 1) : q_\alpha(x) < \frac{t}{|\lambda|}\right\} = N\left(x, \frac{t}{|\lambda|}\right).
\end{aligned}$$

(N4) The inequality $N(x + y, t + s) \ge N(x, t) \wedge N(y, s)$ obviously holds for $t = 0$ or $s = 0$. For $t > 0$, $s > 0$, we suppose that $N(x + y, t + s) < N(x, t) \wedge N(y, s)$. Then there exists $\alpha_0 \in (0, 1)$ such that $N(x + y, t + s) < \alpha_0 < N(x, t) \wedge N(y, s)$. As $N(x, t) > \alpha_0$, there exists $\beta_1 \in \{\alpha \in (0, 1) : q_\alpha(x) < t\}$ such that $\beta_1 > \alpha_0$. (Contrary, for all $\beta \in \{\alpha \in (0, 1) : q_\alpha(x) < t\}$, we have $\beta \le \alpha_0$. Hence $\sup\{\alpha \in (0, 1) : q_\alpha(x) < t\} \le \alpha_0$, which is a contradiction.) As $N(y, s) > \alpha_0$, there exists $\beta_2 \in \{\alpha \in (0, 1) : q_\alpha(y) < s\}$ such that $\beta_2 > \alpha_0$. Let $\beta_0 = \min\{\beta_1, \beta_2\}$. Then $\beta_0 > \alpha_0$ and $q_{\beta_0}(y) \le q_{\beta_2}(y) < s$, $q_{\beta_0}(x) \le q_{\beta_1}(x) < t$. Thus $q_{\beta_0}(x + y) \le q_{\beta_0}(x) + q_{\beta_0}(y) < t + s$. Therefore $\beta_0 \in \{\alpha \in (0, 1) : q_\alpha(x + y) < t + s\}$. Thus $\sup\{\alpha \in (0, 1) : q_\alpha(x + y) < t + s\} \ge \beta_0 > \alpha_0$, which is in contradiction with the fact that $N(x + y, t + s) < \alpha_0$. Hence $N(x + y, t + s) \ge N(x, t) \wedge N(y, s)$.

(N5) We prove that $\lim_{t \to \infty} N(x, t) = 1$. Let $\alpha_0 \in (0, 1)$ be arbitrary. We show that there exists $t_0 > 0$ such that $N(x, t) > \alpha_0$, $(\forall) t \ge t_0$. As $N(x, \cdot)$ is nondecreasing, it will be enough to show that there exists $t_0 > 0$ such that $N(x, t_0) > \alpha_0$. Let $t_0 > q_{\alpha_1}(x)$, where $\alpha_1 = \frac{1 + \alpha_0}{2} \in (\alpha_0, 1)$. Then
$$N(x, t_0) = \sup\{\alpha \in (0, 1) : q_\alpha(x) < t_0\} \ge \alpha_1 > \alpha_0.$$


We prove now that $N(x, \cdot)$ is left continuous in $t > 0$.
Case 1. $N(x, t) = 0$. Thus, for all $s \le t$, as $N(x, s) \le N(x, t)$, we have $N(x, s) = 0$. Therefore $\lim_{s \to t, s < t} N(x, s) = 0 = N(x, t)$.
Case 2. $N(x, t) > 0$. Let $\alpha_0$ be arbitrary, such that $0 < \alpha_0 < N(x, t)$. Let $(t_n)$ be a sequence such that $t_n \to t$, $t_n < t$. We prove that there exists $n_0 \in \mathbb{N}$ such that $N(x, t_n) > \alpha_0$, $(\forall) n \ge n_0$. (As $\alpha_0 \in (0, N(x, t))$ is arbitrary, we will obtain that $\lim_{n \to \infty} N(x, t_n) = N(x, t)$.) If $0 < \alpha_0 < N(x, t)$, then there exists $\beta_0 \in \{\alpha \in (0, 1) : q_\alpha(x) < t\}$ such that $\beta_0 > \alpha_0$. (Contrary, for all $\beta \in \{\alpha \in (0, 1) : q_\alpha(x) < t\}$, we have $\beta \le \alpha_0$. Then $\sup\{\alpha \in (0, 1) : q_\alpha(x) < t\} \le \alpha_0$, i.e. $N(x, t) \le \alpha_0$, which is a contradiction.) As $q_{\beta_0}(x) < t$ and $t_n \to t$, $t_n < t$, there exists $n_0 \in \mathbb{N}$ such that for all $n \ge n_0$, we have $t_n > q_{\beta_0}(x)$. Thus
$$N(x, t_n) = \sup\{\alpha \in (0, 1) : q_\alpha(x) < t_n\} \ge \beta_0 > \alpha_0, \quad (\forall) n \ge n_0.$$

Theorem 7 [33] Let $(X, N, \wedge)$ be a fuzzy normed linear space and $p_\alpha(x) := \inf\{t > 0 : N(x, t) > \alpha\}$, $\alpha \in (0, 1)$. Let $N' : X \times [0, \infty) \to [0, 1]$ be defined by
$$N'(x, t) = \begin{cases} \sup\{\alpha \in (0, 1) : p_\alpha(x) < t\} & \text{if } t > 0, \\ 0 & \text{if } t = 0. \end{cases}$$
Then
1. $P = \{p_\alpha\}_{\alpha \in (0,1)}$ is a right continuous and ascending family of semi-norms on $X$;
2. $(X, N', \wedge)$ is a fuzzy normed linear space;
3. $N' = N$.

Proof (1) and (2) result from previous theorems.
(3) For $t = 0$, we have $N'(x, t) = 0 = N(x, t)$. For $t > 0$, we have
$$N'(x, t) = \sup\{\alpha \in (0, 1) : p_\alpha(x) < t\} = \sup\{\alpha \in (0, 1) : N(x, t) > \alpha\} \le N(x, t).$$
We suppose that $N'(x, t) < N(x, t)$. Then there exists $\alpha_0 \in (0, 1)$ such that $N'(x, t) < \alpha_0 < N(x, t)$. But $\alpha_0 < N(x, t)$ implies that $p_{\alpha_0}(x) < t$. Thus $\sup\{\alpha \in (0, 1) : p_\alpha(x) < t\} \ge \alpha_0$, i.e. $N'(x, t) \ge \alpha_0$, which is a contradiction. Hence $N'(x, t) = N(x, t)$.
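The round trip from a fuzzy norm to its α-seminorms and back can be sketched numerically. The example below is a hedged illustration under assumptions of our own: it uses the candidate $N(x,t) = t/(t+|x|)$ on $\mathbb{R}$ (not taken from the chapter), for which $p_\alpha(x) = \alpha|x|/(1-\alpha)$ in closed form, and it approximates both the infimum and the supremum by bisection, relying on the monotonicity of $N(x,\cdot)$ and of $\alpha \mapsto p_\alpha(x)$. The function names and the search bound `hi` are hypothetical.

```python
def N_std(x: float, t: float) -> float:
    """Illustrative fuzzy norm on R (an assumption): t / (t + |x|), with N(x, 0) = 0."""
    return 0.0 if t == 0 else t / (t + abs(x))

def p_alpha(alpha: float, x: float, hi: float = 1e6, iters: int = 100) -> float:
    """Approximate p_alpha(x) = inf{t > 0 : N(x, t) > alpha} by bisection on t.
    Uses that N(x, .) is nondecreasing; if even N(x, hi) <= alpha, returns hi."""
    lo = 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if N_std(x, mid) > alpha:
            hi = mid          # mid belongs to the set, so the infimum is <= mid
        else:
            lo = mid          # mid is outside the set, so the infimum is >= mid
    return hi

def N_rebuilt(x: float, t: float, iters: int = 100) -> float:
    """Approximate N'(x, t) = sup{alpha in (0, 1) : p_alpha(x) < t} by bisection
    on alpha, using that the family (p_alpha) is ascending in alpha."""
    if t == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if p_alpha(mid, x) < t:
            lo = mid
        else:
            hi = mid
    return lo

# Compare the rebuilt values with the original fuzzy norm on a few points.
max_gap = max(abs(N_rebuilt(x, t) - N_std(x, t))
              for x in (0.0, 1.0, -2.5) for t in (0.5, 2.0))
```

For this candidate the gap is at the level of floating-point error, which is consistent with the statement $N' = N$; the bisection is only a numerical stand-in for the exact infimum and supremum.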


Theorem 8 [33] Let $X$ be a linear space and $\{q_\alpha\}_{\alpha \in (0,1)}$ be a sufficient and ascending family of semi-norms on $X$. Let $N : X \times [0, \infty) \to [0, 1]$ be defined by
$$N(x, t) = \begin{cases} \sup\{\alpha \in (0, 1) : q_\alpha(x) < t\} & \text{if } t > 0, \\ 0 & \text{if } t = 0. \end{cases}$$
Let $p_\alpha : X \to [0, \infty)$ be defined by $p_\alpha(x) := \inf\{t > 0 : N(x, t) > \alpha\}$, $\alpha \in (0, 1)$. Then
1. $(X, N, \wedge)$ is a fuzzy normed linear space;
2. $P = \{p_\alpha\}_{\alpha \in (0,1)}$ is a right continuous and ascending family of semi-norms on $X$;
3. $p_\alpha = q_\alpha$, $(\forall) \alpha \in (0, 1)$ if and only if $\{q_\alpha\}_{\alpha \in (0,1)}$ is right continuous.

Proof (1) and (2) result from previous theorems.
(3) “⇒” Is obvious.
“⇐” We suppose that there exists $\alpha_0 \in (0, 1)$ such that $p_{\alpha_0} \neq q_{\alpha_0}$. Then there exists $x \in X$ such that $p_{\alpha_0}(x) < q_{\alpha_0}(x)$ or $p_{\alpha_0}(x) > q_{\alpha_0}(x)$.
Case A. $p_{\alpha_0}(x) < q_{\alpha_0}(x)$. Let $s > 0$ such that $p_{\alpha_0}(x) < s < q_{\alpha_0}(x)$. As $p_{\alpha_0}(x) < s$, we have $N(x, s) > \alpha_0$. We suppose that $\alpha_0 < \sup\{\alpha \in (0, 1) : q_\alpha(x) < s\}$. Then there exists $\beta \in \{\alpha \in (0, 1) : q_\alpha(x) < s\}$ such that $\alpha_0 < \beta$. (Contrary, $\alpha_0 \ge \beta$, for all $\beta \in \{\alpha \in (0, 1) : q_\alpha(x) < s\}$. Thus $\alpha_0 \ge \sup\{\alpha \in (0, 1) : q_\alpha(x) < s\}$, which contradicts our assumption.) As $\beta \in \{\alpha \in (0, 1) : q_\alpha(x) < s\}$ and $\alpha_0 < \beta$, we have $q_{\alpha_0}(x) \le q_\beta(x) < s$, which contradicts the fact that $q_{\alpha_0}(x) > s$. Thus $\alpha_0 \ge \sup\{\alpha \in (0, 1) : q_\alpha(x) < s\}$, namely $\alpha_0 \ge N(x, s)$, which is in contradiction to $N(x, s) > \alpha_0$.
Case B. $q_{\alpha_0}(x) < p_{\alpha_0}(x)$. Let $\beta \in (\alpha_0, 1)$. We will show that $p_{\alpha_0}(x) \le q_\beta(x)$. We suppose that $p_{\alpha_0}(x) > q_\beta(x)$. Let $s > 0$ such that $q_\beta(x) < s < p_{\alpha_0}(x)$. As $q_\beta(x) < s$, we have $N(x, s) \ge \beta > \alpha_0$. Thus $p_{\alpha_0}(x) < s$, which is a contradiction. Hence $p_{\alpha_0}(x) \le q_\beta(x)$, $(\forall) \beta \in (\alpha_0, 1)$. Thus $p_{\alpha_0}(x) \le \lim_{\beta \to \alpha_0, \beta > \alpha_0} q_\beta(x)$. Therefore, by right continuity, $p_{\alpha_0}(x) \le q_{\alpha_0}(x)$, which is in contradiction to $q_{\alpha_0}(x) < p_{\alpha_0}(x)$.

We note that a comparative study of the definitions of Bag-Samanta, Katsaras and Felbin was made in 2008 in the paper [6]. The following theorems extend some results obtained in [6] to Katsaras’s type fuzzy norm under triangular norms.

Theorem 9 [32] If $\rho$ is a Katsaras’s type fuzzy norm, then
$$N(x, t) := \begin{cases} \rho\left(\frac{x}{t}\right) & \text{if } t > 0, \\ 0 & \text{if } t = 0 \end{cases}$$
is a Bag-Samanta’s type fuzzy norm.

Proof (N1) $N(x, 0) = 0$, $(\forall) x \in X$ is obvious.
(N2) $[N(x, t) = 1, (\forall) t > 0] \Rightarrow \rho\left(\frac{x}{t}\right) = 1, (\forall) t > 0 \Rightarrow x = 0$. Conversely, if $x = 0$, then $N(0, t) = \rho(0) = 1$, $(\forall) t > 0$.


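(N3) We suppose that $t > 0$ (if $t = 0$, (N3) is obvious). We have:
$$N(\lambda x, t) = \rho\left(\frac{\lambda x}{t}\right) = \rho\left(\frac{|\lambda| x}{t}\right) = \rho\left(\frac{x}{t/|\lambda|}\right) = N\left(x, \frac{t}{|\lambda|}\right).$$

(N4) If $t = 0$, then $N(x, t) = 0$ and $N(x, t) * N(y, s) = 0 * N(y, s) = 0$ and the inequality $N(x + y, t + s) \ge N(x, t) * N(y, s)$ is obvious. A similar situation occurs when $s = 0$. If $t > 0$, $s > 0$, then
$$N(x + y, t + s) = \rho\left(\frac{x + y}{t + s}\right) = \rho\left(\frac{t}{t + s} \cdot \frac{x}{t} + \frac{s}{t + s} \cdot \frac{y}{s}\right) \ge \rho\left(\frac{x}{t}\right) * \rho\left(\frac{y}{s}\right) = N(x, t) * N(y, s).$$

(N5) First, we note that $N(x, \cdot)$ is non-decreasing. Indeed, for $s > t$, we have
$$N(x, s) = N(x + 0, t + s - t) \ge N(x, t) * N(0, s - t) = N(x, t) * 1 = N(x, t).$$
We prove now that $N(x, \cdot)$ is left continuous in $t > 0$.
Case 1. $N(x, t) = 0$. Thus, for all $s \le t$, as $N(x, s) \le N(x, t)$, we have that $N(x, s) = 0$. Therefore $\lim_{s \to t, s < t} N(x, s) = 0 = N(x, t)$.
Case 2. $N(x, t) > 0$. Let $\alpha$ be arbitrary such that $0 < \alpha < N(x, t)$. Let $(t_n)$ be a sequence such that $t_n \to t$, $t_n < t$. We prove that there exists $n_0 \in \mathbb{N}$ such that $N(x, t_n) \ge \alpha$, $(\forall) n \ge n_0$. As $\alpha \in (0, N(x, t))$ is arbitrary, we will obtain that $\lim_{n \to \infty} N(x, t_n) = N(x, t)$.
As $N(x, t) = t\rho(x) > 0$, we have that $\rho(x) > 0$. Let $s = \frac{\alpha}{\rho(x)}$. We note that $s < t$. Indeed, $\frac{\alpha}{\rho(x)} < t \Leftrightarrow \alpha < t\rho(x) = N(x, t)$. As $t_n \to t$ and $s < t$, there exists $n_0 \in \mathbb{N}$ such that $t_n > s$, $(\forall) n \ge n_0$. Then
$$N(x, t_n) = \rho\left(\frac{x}{t_n}\right) \ge \rho\left(\frac{x}{s}\right) = s\rho(x) = \alpha.$$
Since $\bigvee_{t > 0} t\rho(x) = 1$, we obtain that $\bigvee_{t > 0} N(x, t) = 1$. Thus $\lim_{t \to \infty} N(x, t) = 1$.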

Theorem 10 [32] If N is a fuzzy norm of Bag-Samanta’s type, then ρ : X → [0, 1] defined by ρ(x) = N (x, 1), (∀)x ∈ X is a fuzzy norm of Katsaras’s type. Proof First, we note that, by (N 2), we have ρ(0) = N (0, 1) = 1. (1) ρ is *-convex.


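Let $t \in (0, 1)$. Then
$$\begin{aligned}
\rho(tx + (1 - t)y) &= N(tx + (1 - t)y, 1) = N(tx + (1 - t)y, t + 1 - t) \\
&\ge N(tx, t) * N((1 - t)y, 1 - t) = N(x, 1) * N(y, 1) = \rho(x) * \rho(y).
\end{aligned}$$
If $t = 0$, then $\rho(tx + (1 - t)y) = \rho(y) = 1 * \rho(y) \ge \rho(x) * \rho(y)$. The case $t = 1$ is similar.
(2) $\rho$ is balanced. Let $x \in X$, $\lambda \in \mathbb{K}^*$, $|\lambda| \le 1$. As $N(x, \cdot)$ is non-decreasing, we have that
$$\rho(\lambda x) = N(\lambda x, 1) = N\left(x, \frac{1}{|\lambda|}\right) \ge N(x, 1) = \rho(x).$$
If $x \in X$, $\lambda = 0$, then $\rho(\lambda x) = \rho(0) = 1 \ge \rho(x)$.
(3) $\rho$ is absorbing. Using (N5), we have that
$$\bigvee_{t > 0} (t\rho)(x) = \bigvee_{t > 0} \rho\left(\frac{x}{t}\right) = \bigvee_{t > 0} N\left(\frac{x}{t}, 1\right) = \bigvee_{t > 0} N(x, t) = 1.$$
Finally,
$$\rho\left(\frac{x}{t}\right) = 1, (\forall) t > 0 \Rightarrow N\left(\frac{x}{t}, 1\right) = 1, (\forall) t > 0 \Rightarrow N(x, t) = 1, (\forall) t > 0 \Rightarrow x = 0.$$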
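The two conversions in Theorems 9 and 10 can be written as a pair of small higher-order functions. The sketch below is illustrative only: the helper names are hypothetical, and the sample $\rho(x) = 1/(1+|x|)$ on $\mathbb{R}$ is an assumed example of a map that is min-convex, balanced and absorbing, not one taken from the chapter.

```python
from typing import Callable

Rho = Callable[[float], float]            # Katsaras-type map x -> rho(x)
FuzzyNorm = Callable[[float, float], float]  # Bag-Samanta-type map (x, t) -> N(x, t)

def bs_from_katsaras(rho: Rho) -> FuzzyNorm:
    """Theorem 9 direction: N(x, t) = rho(x / t) for t > 0, and N(x, 0) = 0."""
    def N(x: float, t: float) -> float:
        return 0.0 if t == 0 else rho(x / t)
    return N

def katsaras_from_bs(N: FuzzyNorm) -> Rho:
    """Theorem 10 direction: rho(x) = N(x, 1)."""
    return lambda x: N(x, 1.0)

# Assumed sample map on R, used only to exercise the two conversions.
rho = lambda x: 1.0 / (1.0 + abs(x))
N = bs_from_katsaras(rho)
rho_back = katsaras_from_bs(N)
```

For this sample, `N(x, t)` evaluates to $t/(t+|x|)$ for $t > 0$, and converting back returns the original `rho`, so the two constructions are mutually inverse on this example.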

6 Convergence in Fuzzy Normed Linear Spaces

This section is based on papers [3, 28, 30].

Definition 16 Let $(X, N, *)$ be a fuzzy normed linear space and $(x_n)$ be a sequence in $X$. The sequence $(x_n)$ is said to be convergent if $(\exists) x \in X$ such that $\lim_{n \to \infty} N(x_n - x, t) = 1$, $(\forall) t > 0$. In this case, $x$ is called the limit of the sequence $(x_n)$ and we denote $\lim_{n \to \infty} x_n = x$ or $x_n \to x$.

Definition 17 Let $(X, N, *)$ be a fuzzy normed linear space. The sequence $(x_n)$ is called a Cauchy sequence if $(\forall) r \in (0, 1)$, $(\forall) t > 0$, $(\exists) n_0 \in \mathbb{N}$ : $N(x_n - x_m, t) > 1 - r$, $(\forall) n, m \ge n_0$.

Remark 7 We note that the previous definition is motivated by the work of George and Veeramani [15].

Proposition 2 Let $(X, N, *)$ be a fuzzy normed linear space. A sequence $(x_n)$ is convergent if and only if $(x_n)$ is convergent in the topology $\mathcal{T}_N$.


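Proof $x_n \to x$ in the topology $\mathcal{T}_N$
$\Leftrightarrow (\forall) r \in (0, 1), (\forall) t > 0, (\exists) n_0 \in \mathbb{N} : x_n \in B(x, r, t), (\forall) n \ge n_0$
$\Leftrightarrow (\forall) r \in (0, 1), (\forall) t > 0, (\exists) n_0 \in \mathbb{N} : N(x_n - x, t) > 1 - r, (\forall) n \ge n_0$
$\Leftrightarrow \lim_{n \to \infty} N(x_n - x, t) = 1, (\forall) t > 0$.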

Proposition 3 If $(X, N, *)$ is a fuzzy normed linear space, then every convergent sequence is a Cauchy sequence.

Proof Let $r \in (0, 1)$, $t > 0$. As $*$ is continuous, there exist $\alpha, \beta \in (0, 1)$ such that $\alpha * \beta > 1 - r$. Let $\gamma = \max\{\alpha, \beta\}$. As $(x_n)$ is a convergent sequence, converging to $x$, for $\gamma \in (0, 1)$, $t > 0$, $(\exists) n_0 \in \mathbb{N}$ such that $N\left(x_n - x, \frac{t}{2}\right) > \gamma$, $(\forall) n \ge n_0$. Then, for $n, m \ge n_0$, we have that
$$N(x_n - x_m, t) = N\left(x_n - x + x - x_m, \frac{t}{2} + \frac{t}{2}\right) \ge N\left(x_n - x, \frac{t}{2}\right) * N\left(x - x_m, \frac{t}{2}\right) \ge \alpha * \beta > 1 - r.$$

Definition 18 Let $(X, N, *)$ be a fuzzy normed linear space. $(X, N, *)$ is said to be complete if any Cauchy sequence in $X$ is convergent to a point in $X$. A complete fuzzy normed linear space will be called a fuzzy Banach space.

Definition 19 Let $(X, N, *)$ be a fuzzy normed linear space, $\alpha \in (0, 1)$ and $(x_n)$ be a sequence in $X$. The sequence $(x_n)$ is said to be α-convergent if there exists $x \in X$ such that
$$(\forall) t > 0, (\exists) n_0 \in \mathbb{N} : N(x_n - x, t) > \alpha, (\forall) n \ge n_0.$$
In this case, $x$ is called the α-limit of the sequence $(x_n)$ and we denote $x_n \xrightarrow{\alpha} x$.

Remark 8 It is obvious that any convergent sequence $(x_n)$ is α-convergent, for all $\alpha \in (0, 1)$. Conversely, if a sequence $(x_n)$ is α-convergent for all $\alpha \in (0, 1)$, then $(x_n)$ is convergent.

Proposition 4 Let $(X, N, *)$ be a fuzzy normed linear space. A sequence $(x_n)$ is α-convergent to a point $x$ if and only if $p_\alpha(x_n - x) \to 0$, as $n \to \infty$.

Proof $p_\alpha(x_n - x) \to 0$
$\Leftrightarrow (\forall) t > 0, (\exists) n_0 \in \mathbb{N} : p_\alpha(x_n - x) < t, (\forall) n \ge n_0$
$\Leftrightarrow (\forall) t > 0, (\exists) n_0 \in \mathbb{N} : N(x_n - x, t) > \alpha, (\forall) n \ge n_0$
$\Leftrightarrow x_n \xrightarrow{\alpha} x$.
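The equivalence used in this proof can be observed on a concrete sequence. The sketch below is a hedged illustration under assumptions of our own: it takes the candidate fuzzy norm $N(x,t) = t/(t+|x|)$ on $\mathbb{R}$, for which $p_\alpha(x) = \alpha|x|/(1-\alpha)$, and the sequence $x_n = 1/n$ with α-limit 0. The helper `first_index` is hypothetical and returns the first index at which the condition holds; because $|x_n|$ is decreasing here, the condition then holds for all later indices as well.

```python
def N(x: float, t: float) -> float:
    """Illustrative fuzzy norm on R (an assumption): t / (t + |x|), N(x, 0) = 0."""
    return 0.0 if t == 0 else t / (t + abs(x))

def p_alpha(alpha: float, x: float) -> float:
    """Closed form of inf{t > 0 : N(x, t) > alpha} for the N above."""
    return alpha * abs(x) / (1.0 - alpha)

def first_index(seq, limit, t, alpha, n_max=100_000):
    """First n with N(seq(n) - limit, t) > alpha, or None if none is found.
    This is a valid n0 for alpha-convergence only when |seq(n) - limit|
    is nonincreasing in n, as for seq(n) = 1/n."""
    for n in range(1, n_max + 1):
        if N(seq(n) - limit, t) > alpha:
            return n
    return None

seq = lambda n: 1.0 / n
t, alpha = 0.015, 0.5
n0 = first_index(seq, 0.0, t, alpha)

# The two conditions in the proof agree term by term on this example.
agree = all((N(seq(n), t) > alpha) == (p_alpha(alpha, seq(n)) < t)
            for n in range(1, 200))
```

With these values the condition $N(x_n, t) > \alpha$ reduces to $1/n < 0.015$, so it first holds at $n = 67$.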


Definition 20 Let $(X, N, *)$ be a fuzzy normed linear space, $\alpha \in (0, 1)$ and $(x_n)$ be a sequence in $X$. The sequence $(x_n)$ is said to be α-Cauchy if
$$(\forall) t > 0, (\exists) n_0 \in \mathbb{N} : N(x_n - x_m, t) > \alpha, (\forall) n, m \ge n_0.$$

Proposition 5 Let $(X, N, \wedge)$ be a fuzzy normed linear space and $\alpha \in (0, 1)$. Then every α-convergent sequence is α-Cauchy.

Proof Let $(x_n)$ be an α-convergent sequence, converging to $x$. Then $(\forall) t > 0$, $(\exists) n_0 \in \mathbb{N}$ : $N\left(x_n - x, \frac{t}{2}\right) > \alpha$, $(\forall) n \ge n_0$. Hence, for $t > 0$, we have that
$$N(x_n - x_m, t) = N\left(x_n - x + x - x_m, \frac{t}{2} + \frac{t}{2}\right) \ge \min\left\{N\left(x_n - x, \frac{t}{2}\right), N\left(x - x_m, \frac{t}{2}\right)\right\} > \alpha,$$
$(\forall) n, m \ge n_0$. Thus $(x_n)$ is α-Cauchy.

7 Fuzzy Continuous Linear Operators in Fuzzy Normed Linear Spaces

In this section $(X, N_1, *_1)$, $(Y, N_2, *_2)$ will be fuzzy normed linear spaces with the continuous t-norms $*_1$, $*_2$.

Definition 21 [3] A mapping $T : X \to Y$ is said to be fuzzy continuous at $x_0 \in X$, if $(\forall) \varepsilon > 0$, $(\forall) \alpha \in (0, 1)$, $(\exists) \delta = \delta(\varepsilon, \alpha, x_0) > 0$, $(\exists) \beta = \beta(\varepsilon, \alpha, x_0) \in (0, 1)$ such that $(\forall) x \in X$ with $N_1(x - x_0, \delta) > \beta$ we have that $N_2(T(x) - T(x_0), \varepsilon) > \alpha$. If $T$ is fuzzy continuous at each point of $X$, then $T$ is called fuzzy continuous on $X$.

Theorem 11 [29] Let $T : X \to Y$ be a linear operator. Then $T$ is fuzzy continuous on $X$, if and only if $T$ is fuzzy continuous at a point $x_0 \in X$.

Proof “⇒” It is obvious.
“⇐” Let $y \in X$ be arbitrary. We will show that $T$ is fuzzy continuous at $y$. Let $\varepsilon > 0$, $\alpha \in (0, 1)$. Since $T$ is fuzzy continuous at $x_0 \in X$, there exist $\delta > 0$, $\beta \in (0, 1)$ such that


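$$(\forall) x \in X : N_1(x - x_0, \delta) > \beta \Rightarrow N_2(T(x) - T(x_0), \varepsilon) > \alpha.$$
Replacing $x$ by $x + x_0 - y$, we obtain that
$$(\forall) x \in X : N_1(x + x_0 - y - x_0, \delta) > \beta \Rightarrow N_2(T(x + x_0 - y) - T(x_0), \varepsilon) > \alpha,$$
namely, by the linearity of $T$,
$$(\forall) x \in X : N_1(x - y, \delta) > \beta \Rightarrow N_2(T(x) - T(y), \varepsilon) > \alpha.$$
Thus $T$ is fuzzy continuous at $y \in X$. As $y$ is arbitrary, it follows that $T$ is fuzzy continuous on $X$.

Corollary 2 [29] Let $T : X \to Y$ be a linear operator. Then $T$ is fuzzy continuous on $X$, if and only if
$$(\forall) \varepsilon > 0, (\forall) \alpha \in (0, 1), (\exists) \delta = \delta(\varepsilon, \alpha) > 0, (\exists) \beta = \beta(\varepsilon, \alpha) \in (0, 1)$$
such that $(\forall) x \in X$ with $N_1(x, \delta) > \beta$ we have that $N_2(T(x), \varepsilon) > \alpha$.

Theorem 12 [29] A linear operator $T : X \to Y$ is fuzzy continuous on $X$, if and only if
$$(\forall) \alpha \in (0, 1), (\exists) \beta = \beta(\alpha) \in (0, 1), (\exists) M = M(\alpha) > 0$$
such that
$$(\forall) t > 0, (\forall) x \in X : N_1(x, t) > \beta \Rightarrow N_2(T(x), Mt) > \alpha.$$

Proof “⇐” Let $\varepsilon > 0$, $\alpha \in (0, 1)$ be arbitrary. Then there exist $\beta = \beta(\alpha) \in (0, 1)$, $M = M(\alpha) > 0$ such that
$$(\forall) t > 0, (\forall) x \in X : N_1(x, t) > \beta \Rightarrow N_2(T(x), Mt) > \alpha.$$
In particular, for $t = \frac{\varepsilon}{M}$, we obtain $N_1\left(x, \frac{\varepsilon}{M}\right) > \beta \Rightarrow N_2(T(x), \varepsilon) > \alpha$. Then, for $\delta = \frac{\varepsilon}{M} > 0$, we obtain that $T$ is fuzzy continuous on $X$.
“⇒” We suppose that $(\exists) \alpha_0 \in (0, 1)$ such that $(\forall) \beta \in (0, 1)$, $(\forall) M > 0$, $(\exists) t_0 = t_0(\beta, M) > 0$, $(\exists) x_0 = x_0(\beta, M) \in X$ with $N_1(x_0, t_0) > \beta$ and $N_2(T(x_0), Mt_0) \le \alpha_0$. The set $V_0 = \{y \in Y : N_2(y, t_0) > \alpha_0\}$ is an open neighborhood of $0_Y$. We will prove that, for every neighborhood $U$ of $0_X$, we have $T(U) \not\subseteq V_0$, which contradicts the fuzzy continuity of $T$ at $0_X$. As $\{B(0, \beta, s)\}_{\beta \in (0,1), s > 0}$ is a fundamental system of neighborhoods of $0_X$, it is enough to show that for all $\beta \in (0, 1)$, $s > 0$ we have $T(B(0, \beta, s)) \not\subseteq V_0$. As $M > 0$ is arbitrary, we can choose $s = \frac{t_0}{M}$. We note that, for $z_0 = \frac{1}{M} x_0 \in X$, we have
$$N_1\left(z_0, \frac{t_0}{M}\right) = N_1\left(\frac{1}{M} x_0, \frac{t_0}{M}\right) = N_1(x_0, t_0) > \beta.$$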


Hence $z_0 \in B\left(0, \beta, \frac{t_0}{M}\right)$. We will prove that $T(z_0) \notin V_0$, namely $N_2(T(z_0), t_0) \le \alpha_0$. Indeed,
$$N_2(T(z_0), t_0) = N_2\left(\frac{1}{M} T(x_0), t_0\right) = N_2(T(x_0), Mt_0) \le \alpha_0.$$

Corollary 3 [29] A linear functional $f : (X, N_1, *) \to (\mathbb{C}, N, \wedge)$ is fuzzy continuous, if and only if $(\exists) \beta \in (0, 1)$, $(\exists) M > 0$ such that
$$(\forall) t > 0, (\forall) x \in X, N_1(x, t) > \beta \Rightarrow |f(x)| < Mt.$$

Proof According to the previous theorem $f$ is fuzzy continuous if and only if $(\forall) \alpha \in (0, 1)$, $(\exists) \beta \in (0, 1)$, $(\exists) M > 0$ such that
$$(\forall) t > 0, (\forall) x \in X : N_1(x, t) > \beta \Rightarrow N(f(x), Mt) > \alpha.$$
But $N(f(x), Mt) > \alpha \Leftrightarrow N(f(x), Mt) = 1 \Leftrightarrow |f(x)| < Mt$. Hence $(\exists) \beta \in (0, 1)$, $(\exists) M > 0$ such that
$$(\forall) t > 0, (\forall) x \in X, N_1(x, t) > \beta \Rightarrow |f(x)| < Mt.$$

Corollary 4 [29] Let $(X, N_1, *_1)$, $(Y, N_2, *_2)$ be FNLSs and
$$p_\alpha(x) := \inf\{t > 0 : N_1(x, t) > \alpha\}, \alpha \in (0, 1), \qquad q_\alpha(x) := \inf\{t > 0 : N_2(x, t) > \alpha\}, \alpha \in (0, 1).$$
A linear operator $T : X \to Y$ is fuzzy continuous on $X$ if and only if
$$(\forall) \alpha \in (0, 1), (\exists) \beta = \beta(\alpha) \in (0, 1), (\exists) M = M(\alpha) > 0$$
such that $q_\alpha(Tx) \le M p_\beta(x)$, $(\forall) x \in X$.

Proof According to the previous theorem, $(\forall) \alpha \in (0, 1)$, $(\exists) \beta = \beta(\alpha) \in (0, 1)$, $(\exists) M = M(\alpha) > 0$ such that
$$(\forall) t > 0, (\forall) x \in X : N_1(x, t) > \beta \Rightarrow N_2(T(x), Mt) > \alpha.$$
Thus, for $x \in X$, we have
$$\{t > 0 : N_1(x, t) > \beta\} \subseteq \{t > 0 : N_2(Tx, Mt) > \alpha\}.$$


Hence
$$\inf\{t > 0 : N_1(x, t) > \beta\} \ge \inf\{t > 0 : N_2(Tx, Mt) > \alpha\},$$
namely
$$\inf\{t > 0 : N_1(x, t) > \beta\} \ge \inf\left\{\frac{t}{M} > 0 : N_2(Tx, t) > \alpha\right\}.$$
Therefore
$$p_\beta(x) \ge \frac{1}{M} q_\alpha(Tx), \quad (\forall) x \in X.$$
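The criteria of Theorem 12 and Corollary 4 can be checked on a very simple operator. The following sketch is an illustration under assumptions of our own: $X = Y = \mathbb{R}$ with the candidate fuzzy norm $N(x,t) = t/(t+|x|)$ on both spaces (so that $p_\alpha = q_\alpha$ and $p_\alpha(x) = \alpha|x|/(1-\alpha)$), and $T(x) = cx$ with $c \neq 0$. For this choice one may take $\beta = \alpha$ and $M = |c|$; the function names and grids are hypothetical.

```python
def N(x: float, t: float) -> float:
    """Illustrative fuzzy norm on R (an assumption): t / (t + |x|), N(x, 0) = 0."""
    return 0.0 if t == 0 else t / (t + abs(x))

def p(alpha: float, x: float) -> float:
    """Closed form of inf{t > 0 : N(x, t) > alpha} for the N above."""
    return alpha * abs(x) / (1.0 - alpha)

def check_seminorm_bound(c, alphas, xs, tol=1e-12):
    """Corollary 4 style check: q_alpha(Tx) <= M * p_beta(x) with beta = alpha, M = |c|."""
    M = abs(c)
    return all(p(a, c * x) <= M * p(a, x) + tol for a in alphas for x in xs)

def check_implication(c, alphas, xs, ts):
    """Theorem 12 style check: N1(x, t) > beta implies N2(Tx, M t) > alpha,
    again with beta = alpha and M = |c| (c must be non-zero)."""
    M = abs(c)
    for a in alphas:
        for x in xs:
            for t in ts:
                if N(x, t) > a and not (N(c * x, M * t) > a):
                    return False
    return True

alphas = [0.1, 0.5, 0.9]
xs = [-2.0, 0.0, 0.3, 5.0]
ts = [0.1, 1.0, 7.0]
bound_ok = check_seminorm_bound(-3.0, alphas, xs)
implication_ok = check_implication(-3.0, alphas, xs, ts)
```

Here $N(cx, |c|t) = N(x,t)$, so both conditions hold with equality up to rounding; the grids are chosen so that no value of $N$ sits exactly on a threshold.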

Corollary 5 [29] A linear functional $f : (X, N_1, *) \to (\mathbb{C}, N, \wedge)$ is fuzzy continuous, if and only if $(\exists) \beta \in (0, 1)$, $(\exists) M > 0$ such that $|f(x)| \le M p_\beta(x)$, $(\forall) x \in X$.

Remark 9 We note that a subset $A$ of a topological linear space $X$ is said to be bounded if for every neighbourhood $V$ of $0_X$, there exists a positive number $k$ such that $A \subset kV$. A linear operator $T : X \to Y$ is said to be bounded if $T$ maps bounded sets into bounded sets. Based on this remark the following definitions are natural.

Definition 22 [37] A subset $A$ of $X$ is called fuzzy bounded, if $(\forall) \alpha \in (0, 1)$, $(\exists) t_\alpha > 0$ such that $A \subset B(0, \alpha, t_\alpha)$.

Definition 23 [37] A linear operator $T : X \to Y$ is said to be fuzzy bounded if $T$ maps fuzzy bounded sets of $X$ into fuzzy bounded sets of $Y$.

We must note that the following result was established by Sadeqi and Kia [37] for fuzzy normed linear spaces of type $(X, N, \wedge)$ which satisfy (N7) and extended by Nădăban [29] to a more general fuzzy setting.

Theorem 13 Let $T : X \to Y$ be a linear operator. The following statements are equivalent:
1. $T$ is fuzzy continuous;
2. $T$ is topologically continuous;
3. $T$ is fuzzy bounded.

Finally, we note that in paper [3], Bag and Samanta introduced different types of continuity and boundedness for linear operators and they established the principles of fuzzy functional analysis.

8 Conclusions

The development of the theory of fuzzy normed linear spaces has continued. After the definition of fuzzy normed linear spaces, fixed point theory on fuzzy normed linear spaces has attracted many mathematicians, who obtained fuzzy versions of some classical fixed point theorems in the context of fuzzy normed linear spaces. We mention only some important papers [4, 5, 31, 34, 40, 43].


From the concept of fuzzy normed linear space to the notion of fuzzy normed algebra there is only one step. Recently, based on the concept of fuzzy normed algebra, introduced by Mirmostafaee and Kamel [26] in 2012, Bînzar, Pater and Nădăban established in paper [7] a characterization of continuous product in a fuzzy normed algebra and they proved that any fuzzy normed algebra has a continuous product. In this context we also underline the paper [36] of Sadeqi and Amiripour. At the same time, various applications have begun to appear. We only mention the paper [37] of Sadeqi and Kia, where an application of fuzzy operator theory for image compression is presented, and the paper [33] of Nădăban and Dzitac, where the concept of atomic decomposition of fuzzy normed linear spaces is introduced, building, in this way, a fertile ground for the study, in further papers, of fuzzy wavelet theory. The results presented in this chapter suggest that the development of fuzzy operator theory on fuzzy normed linear spaces can prove to be a powerful tool for many applications.

References

1. C. Alegre, S. Romaguera, Characterizations of fuzzy metrizable topological vector spaces and their asymmetric generalization in terms of fuzzy (quasi-)norms. Fuzzy Sets Syst. 161, 2182–2192 (2010)
2. T. Bag, S.K. Samanta, Finite dimensional fuzzy normed linear spaces. J. Fuzzy Math. 11, 687–705 (2003)
3. T. Bag, S.K. Samanta, Fuzzy bounded linear operators. Fuzzy Sets Syst. 151, 513–547 (2005)
4. T. Bag, S.K. Samanta, Fixed point theorems on fuzzy normed linear spaces. Inform. Sci. 176, 2910–2931 (2006)
5. T. Bag, S.K. Samanta, Some fixed point theorems in fuzzy normed linear spaces. Inform. Sci. 177, 3217–3289 (2007)
6. T. Bag, S.K. Samanta, A comparative study of fuzzy norms on a linear space. Fuzzy Sets Syst. 159, 670–684 (2008)
7. T. Bînzar, F. Pater, S. Nădăban, On fuzzy normed algebras. J. Nonlinear Sci. Appl. 9, 5488–5496 (2016)
8. S.C. Cheng, J.N. Mordeson, Fuzzy linear operator and fuzzy normed linear spaces. Bull. Calcutta Math. Soc. 86, 429–436 (1994)
9. R.K. Das, B. Mandal, Fuzzy real line structure and metric space. Indian J. Pure Appl. Math. 33, 565–571 (2002)
10. D. Dubois, H. Prade, Operations on fuzzy numbers. Int. J. Syst. Sci. 9, 613–626 (1978)
11. D. Dubois, H. Prade, Fuzzy Sets and Systems: Theory and Applications (Academic Press, London, 1980)
12. I. Dzitac, The fuzzyfication of classical structures: A general view. Int. J. Comput. Commun. Control 10, 772–788 (2015)
13. I. Dzitac, F.G. Filip, M.J. Manolescu, Fuzzy logic is not fuzzy: World-renowned computer scientist Lotfi A. Zadeh. Int. J. Comput. Commun. Control 12, 748–789 (2017)
14. C. Felbin, Finite dimensional fuzzy normed linear space. Fuzzy Sets Syst. 48, 239–248 (1992)
15. A. George, P. Veeramani, On some results in fuzzy metric spaces. Fuzzy Sets Syst. 64, 395–399 (1994)


16. I. Goleţ, On generalized fuzzy normed spaces and coincidence point theorems. Fuzzy Sets Syst. 161, 1138–1144 (2010)
17. V. Gregori, S. Romaguera, Some properties of fuzzy metric space. Fuzzy Sets Syst. 115, 485–489 (2000)
18. O. Hadžić, E. Pap, Fixed Point Theory in Probabilistic Metric Spaces. Mathematics and Its Applications, vol. 536 (Kluwer, Dordrecht, 2001)
19. M. Janfada, H. Baghani, O. Baghani, On Felbin’s-type fuzzy normed linear spaces and fuzzy bounded operators. Iran. J. Fuzzy Syst. 8, 117–130 (2011)
20. O. Kaleva, S. Seikkala, On fuzzy metric spaces. Fuzzy Sets Syst. 12, 215–229 (1984)
21. A.K. Katsaras, D.B. Liu, Fuzzy vector spaces and fuzzy topological vector spaces. J. Math. Anal. Appl. 58, 135–146 (1977)
22. A.K. Katsaras, Fuzzy topological vector spaces I. Fuzzy Sets Syst. 6, 85–95 (1981)
23. A.K. Katsaras, Fuzzy topological vector spaces II. Fuzzy Sets Syst. 12, 143–154 (1984)
24. E.P. Klement, R. Mesiar, E. Pap, Triangular Norms (Kluwer, 2000)
25. I. Kramosil, J. Michálek, Fuzzy metric and statistical metric spaces. Kybernetika 11, 326–334 (1975)
26. A.K. Mirmostafaee, A. Kamel, Perturbation of generalized derivations in fuzzy Menger normed algebras. Fuzzy Sets Syst. 195, 109–117 (2012)
27. M. Mizumoto, J. Tanaka, Some properties of fuzzy numbers, in Advances in Fuzzy Set Theory and Applications, ed. by M.M. Gupta, et al. (North-Holland, New York, 1979), pp. 153–164
28. S. Nădăban, Fuzzy euclidean normed spaces for data mining applications. Int. J. Comput. Commun. Control 10(1), 70–77 (2015)
29. S. Nădăban, Fuzzy continuous mappings in fuzzy normed linear spaces. Int. J. Comput. Commun. Control 10, 834–842 (2015)
30. S. Nădăban, Fuzzy pseudo-norms and fuzzy F-spaces. Fuzzy Sets Syst. 282, 99–114 (2016)
31. S. Nădăban, T. Bînzar, F. Pater, Some fixed point theorems for ϕ-contractive mappings in fuzzy normed linear spaces. J. Nonlinear Sci. Appl. 10, 5668–5676 (2017)
32. S. Nădăban, T. Bînzar, F. Pater, C. Ţerei, S. Hoară, Katsaras’s type fuzzy norm under triangular norms. Theory Appl. Math. Comput. Sci. 5, 148–157 (2015)
33. S. Nădăban, I. Dzitac, Atomic decompositions of fuzzy normed linear spaces for wavelet applications. Informatica 25, 643–662 (2014)
34. R. Saadati, P. Kumam, S.Y. Jang, On the tripled fixed point and tripled coincidence point theorems in fuzzy normed spaces. Fixed Point Theory Appl. (2014). https://doi.org/10.1186/1687-1812-2014-136
35. R. Saadati, S.M. Vaezpour, Some results on fuzzy Banach spaces. J. Appl. Math. Comput. 17, 475–484 (2005)
36. I. Sadeqi, A. Amiripour, Fuzzy Banach algebra, in First Joint Congress on Fuzzy and Intelligent Systems, Ferdowsi University of Mashhad, Iran, 29–31 Aug 2007
37. I. Sadeqi, F.S. Kia, Fuzzy normed linear space and its topological structure. Chaos Solitons Fractals 40, 2576–2589 (2009)
38. B. Schweizer, A. Sklar, Statistical metric spaces. Pac. J. Math. 10, 314–334 (1960)
39. J.Z. Xiao, X.H. Zhu, On linearly topological structures and property of fuzzy normed linear space. Fuzzy Sets Syst. 125, 153–161 (2002)
40. J.Z. Xiao, X.H. Zhu, Topological degree theory and fixed point theorems in fuzzy normed spaces. Fuzzy Sets Syst. 147, 437–452 (2004)
41. L.A. Zadeh, Fuzzy Sets. Inform. Control 8, 338–353 (1965)
42. L.A. Zadeh, Is there a need for fuzzy logic? Inform. Sci. 178, 2751–2779 (2008)
43. J. Zhu, Y. Wang, C.C. Zhu, Fixed point theorems for contractions in fuzzy normed spaces and intuitionistic fuzzy normed spaces. Fixed Point Theory Appl. (2013). https://doi.org/10.1186/1687-1812-2013-79

A Calculation Model of Hierarchical Multiplex Structure with Fuzzy Inference for VRSD Problem

Kewei Chen, Fangyan Dong and Kaoru Hirota

Abstract A concept of the VRSD problem (Vehicle Routing, Scheduling and Dispatching Problem) and its formulation are proposed in order to bridge the gap between conventional methods and complex situations in the real world. A HIMS model (a calculation model with HIerarchical Multiplex Structure) is also proposed for the VRSD problem. It contains three levels: the Atom level (system energies can be controlled by a heuristic method), the Molecular level (system properties can be adjusted by a heuristic method and an optimum calculation), and the Individual level (system architecture can be modified by fuzzy inference). The experimental results and the evaluations by expert practitioners show that the HIMS model provides a feasible, fast, and efficient tool for application to the real-world VRSD problem.

Keywords Vehicle routing · Scheduling · Heuristics search · Fuzzy inference · Object-oriented programming

K. Chen, Ningbo University, No. 818 Fenghua Road, Ningbo, Zhejiang, China
F. Dong, Ningbo Intelligent Manufacturing Industry Research Institute, Ningbo City, China
K. Hirota, Beijing Institute of Technology, No. 5 Zhongguancun South St., Haidian Dist., Beijing, China
© Springer Nature Switzerland AG 2020. S. N. Shahbazova et al. (eds.), Recent Developments in Fuzzy Logic and Fuzzy Sets, Studies in Fuzziness and Soft Computing 391, https://doi.org/10.1007/978-3-030-38893-5_9

1 Introduction

Vehicle routing, scheduling, and dispatching problems have grown to be an important research field in the past 20 years. Many researchers [1, 2] have been trying to answer fundamental research questions, utilizing simulated annealing (SA) [3, 4], tabu search [5, 6], and genetic algorithms (GA) [7, 8]. Although applications [9, 10] have also been developed, they are not sufficient because there are many gaps between conventional methods and complex situations in the real world, such as loading capacity, working balance, and time service. Furthermore, requests for shortening the lead-time in physical distribution are increasingly common, and the growth of personal computers and geographic information systems (GIS) makes it possible to solve such practical problems (large-scale, multi-objective planning) in flexible and feasible ways.

In this article, the VRSD problem and its formulation in terms of fuzzy set theory are proposed in Sect. 2. The structure of the HIMS model and its operation criteria are described in Sect. 3. Experimental results and their evaluation are elaborated in Sect. 4.

2 VRSDP/SD Problem and Its Formulation

In this section, the VRSDP/SD problem is introduced for daily delivery activities in the real world. It is a comprehensive problem consisting of three sub-problems: routing, scheduling, and dispatching. There are many practical applications of the VRSDP/SD problem, such as transporting food and drink products by truck and delivering oil products by tank lorry.

2.1 Description of the VRSDP/SD Problem

In this subsection, a real-world daily delivery problem based on the VRSD problem is described. A depot D (a delivery center which holds all goods in stock) has L vehicles $\{V_l\}$ for delivery, and there are several types of vehicles with different capacities. Suppose that M orders $\{O_m\}$ (requests for goods within a certain time window) are received from N users $\{U_n\}$ (the consumers of the orders), and that each user's conditions, such as the parking space for a delivery vehicle and the business hours, differ from those of the others. A delivery schedule must then be made for all M orders, using R vehicles of various types, by the next day. The problem is how to make a plan with optimum routing and efficient scheduling for all delivery jobs, and with suitable dispatching of vehicles of various types, while satisfying some constraints (Fig. 1).

In practice, human experts usually make Q trips $\{X_q\}$ (jobs which transport some orders in a round route from the depot) and construct R tours $\{Y_r\}$ (sets of trips which are carried out by a certain vehicle in a day) from the trips. Then, they try to dispatch these tours to vehicles by considering their types (capacity). By repeating these operations, referring to the initial working information of the next day and using a knowledge base (experience, users' and depot's conditions, geographic information, and so on), they manage to make a better (not necessarily optimum) plan for the next day's delivery jobs.

Fig. 1 General description for daily delivery jobs (diagram relating the constants Depot D, User $U_n$, Order $O_m$, and Vehicle $V_l$ to the variables Trip $X_q$ and Tour $Y_r$ under the conditions)

2.2 Constants and Variables in VRSD Problem

Table 1 shows the constant set in the VRSD problem. A detailed definition of each constant is as follows.

$$D = ([Bt_D, Et_D]) \tag{1}$$

Here, $Bt_D$ (the Beginning Time of Depot) and $Et_D$ (the End Time of Depot) are integer values in {0, 1, …, 1440 (= 24*60)}, in minutes. The integer interval $[Bt_D, Et_D]$ is the time window during which loading work can be done by vehicles.

$$U_n = ([Bt_n, Et_n], SU_n) \tag{2}$$

The integer interval $[Bt_n, Et_n]$ is the time window of the business hours, and $SU_n$ is the maximum size of the vehicles that can enter $U_n$.

$$O_m = (C_m, [Bt_m, Et_m], U_n^m), \quad C_m \le SU_n^m \tag{3}$$

$C_m$ indicates the capacity of an order $O_m$ placed by a user $U_n^m$, and the integer interval $[Bt_m, Et_m]$ is the desired time window for delivery.

$$V_l = (SV_l, RWt_l, EWt_l) \tag{4}$$

Equation (4) shows the vehicle information used in the delivery plan, where $SV_l$ is the maximum capacity that the vehicle can load, $RWt_l$ is the restriction time until which $V_l$ must work in full, and $EWt_l$ is the maximum extended time until which $V_l$ can work further. $SU_n$, $C_m$, and $SV_l$ have positive integer values of the same unit (a weight of solid or a volume of liquid). $C_m$ can be an arbitrary integer value, while $SU_n$ and $SV_l$ take fixed integer values corresponding to the types of vehicles.

Table 1 Constant set in VRSD problem
Item | Symbol | Number | Universal set
Depot | $D$ | 1 | $D$
User | $U_n$ | N | $U = \{U_1, \dots, U_n, \dots, U_N\}$
Order | $O_m$ | M | $O = \{O_1, \dots, O_m, \dots, O_M\}$
Vehicle | $V_l$ | L | $V = \{V_1, \dots, V_l, \dots, V_L\}$

Table 2 shows the variables for trips and tours in the VRSD problem. They represent a solution to the VRSD problem. The detailed definition of each variable is as follows.

Table 2 Variables used in VRSD problem
Item | Symbol | Number | Universal set
Trip | $X_q$ | Q | $X = \{X_1, \dots, X_q, \dots, X_Q\}$
Tour | $Y_r$ | R | $Y = \{Y_1, \dots, Y_r, \dots, Y_R\}$

$$J_m = (O_m, U_n^m) \tag{5}$$

$$X_q = ([D, J^{q}_{m_1}, J^{q}_{m_2}, \dots, J^{q}_{m_{K_q}}, D], Y^{q}_{r}, k_q); \quad K_q \ge 1,\ 1 \le k_q \le K^{q}_{r} \tag{6}$$

$$Y_r = ([X^{r}_{q_1}, X^{r}_{q_2}, \dots, X^{r}_{q_{K_r}}], V^{r}_{l}); \quad K_r \ge 1 \tag{7}$$

Here, $J_m$ is a sub-job consisting of an order/user pair (their total number in a trip is $K_q$), which represents the order information and the related user's constraints; $Y^{q}_{r}$ is the tour to which $X_q$ belongs; $k_q$ is the index number of $X_q$ in the tour $Y^{q}_{r}$; and $K^{q}_{r}$ is the total number of trips in the tour $Y^{q}_{r}$. Equation (7) shows a tour carried out by vehicle $V_l$ in a day, where $X^{r}_{q_1}, X^{r}_{q_2}, \dots, X^{r}_{q_{K_r}}$ is the sequence of trips in $Y_r$. All the constant information described above is given in advance, so a solution represented by {X} and {Y} for certain initial information can be calculated based on both this constant information and some constraint conditions, which are discussed in Sect. 2.3.

2.3 Constraints for VRSD Problem

There are some constraints that must be satisfied in the VRSD problem.

(1) Vehicle Capacity

$$C^{q}_{cnst} = \begin{cases} 0 & \text{if } \sum_{j=1}^{K_q} C^{q}_{m_j} \le SV^{q}_{l} \\ 1 & \text{otherwise} \end{cases} \tag{8}$$

$$C_{cnst} = \sum_{q=1}^{Q} C^{q}_{cnst} = 0 \tag{9}$$

Equations (8) and (9) represent the constraint on vehicles that the total capacity of the orders in a trip cannot exceed the capacity of the corresponding vehicle.

(2) User Condition

$$U^{q}_{cnst} = \begin{cases} 0 & \text{if } SV^{q}_{l} \le SU^{m_j}_{n} \ \text{for } \forall j = 1, \dots, K_q \\ 1 & \text{otherwise} \end{cases} \tag{10}$$

$$U_{cnst} = \sum_{q=1}^{Q} U^{q}_{cnst} = 0 \tag{11}$$

Equations (10) and (11) represent a constraint on the relations between vehicles and users. That is, the vehicle size must be equal to or less than the maximum size that can enter the parking spaces of all users in a trip.

(3) Depot Condition

$$D^{q}_{cnst} = \begin{cases} 0 & \text{if } Bt_D \le At^{q}_{D} \le Et_D \\ 1 & \text{otherwise} \end{cases} \tag{12}$$

$$D_{cnst} = \sum_{q=1}^{Q} D^{q}_{cnst} = 0 \tag{13}$$

Equations (12) and (13) indicate that all loading work must be done while the depot is open, where $At^{q}_{D}$ is the time when the vehicle of a trip $X_q$ arrives at the depot.

(4) Dispatching Condition

$$N \le M; \quad R \le L; \quad R \le Q \tag{14}$$

The constraints in Eq. (14) indicate: (I) each user must have at least one order in a day; (II) the number of tours (R) must be equal to or less than the total number of vehicles (L); (III) each tour must have at least one trip.
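As a rough illustration of constraints (8)–(14), the following Python sketch checks a single trip against the capacity, user, and depot conditions. The class, field, and function names (`Order`, `Vehicle`, `trip_violations`, `dispatching_ok`) are hypothetical and introduced only for this example; the chapter does not prescribe an implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Order:
    capacity: int        # C_m, same unit as the vehicle capacity
    user_max_size: int   # SU_n of the user who placed the order


@dataclass
class Vehicle:
    capacity: int        # SV_l


def trip_violations(orders: List[Order], vehicle: Vehicle,
                    depot_window: Tuple[int, int],
                    depot_arrival: int) -> Tuple[int, int, int]:
    """Return (C_cnst^q, U_cnst^q, D_cnst^q) for one trip; 0 means satisfied."""
    # Eq. (8): total order capacity must not exceed the vehicle capacity.
    c = 0 if sum(o.capacity for o in orders) <= vehicle.capacity else 1
    # Eq. (10): the vehicle must fit every user's parking space in the trip.
    u = 0 if all(vehicle.capacity <= o.user_max_size for o in orders) else 1
    # Eq. (12): the vehicle must arrive while the depot is open (minutes).
    bt_d, et_d = depot_window
    d = 0 if bt_d <= depot_arrival <= et_d else 1
    return c, u, d


def dispatching_ok(n_users: int, m_orders: int, r_tours: int,
                   l_vehicles: int, q_trips: int) -> bool:
    """Eq. (14): N <= M, R <= L, R <= Q."""
    return n_users <= m_orders and r_tours <= l_vehicles and r_tours <= q_trips
```

A plan is feasible in the sense of Eqs. (9), (11), and (13) when every trip returns `(0, 0, 0)`, so the three sums over all trips are zero.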

2.4 Evaluation for VRSDP/SD Problem

(1) Total Working Time

$$T_{cost} = \sum_{q=1}^{Q} T^{q}_{cost} \Rightarrow \min \tag{15}$$

$$T^{q}_{cost} = T^{q}_{Running} + T^{q}_{Loading} + T^{q}_{Unloading} \tag{16}$$

$$T^{q}_{Running} = t_{D, J^{q}_{m_1}} + \sum_{i=1}^{K_q - 1} t_{J^{q}_{m_i}, J^{q}_{m_{i+1}}} + t_{J^{q}_{m_{K_q}}, D} \tag{17}$$

$$T^{q}_{Loading} = t^{V_l}_{D,ready} + T_D * \sum_{i=1}^{K_q} C_{m_i} \tag{18}$$

$$T^{q}_{Unloading} = \sum_{i=1}^{K_q} \left( t^{J_{m_i}}_{wait} + T_U * C_{m_i} \right) \tag{19}$$

Equation (15) defines the total working cost. Equation (16) is the working cost of $X_q$, which has three parts: the running cost of Eq. (17), the loading cost of Eq. (18), and the unloading cost of Eq. (19). $T_D$ and $T_U$ are the coefficients of the loading and unloading cost per unit of order. $t_{wait}$ is the cost including the additional cost of running (from the user entrance to the unloading spot) and the unloading cost of order $O_{m_i}$.

(2) Average Loading Capacity

$$C_{cost} = \left( \sum_{q=1}^{Q} C^{q}_{cost} \right) / Q \Rightarrow \max \tag{20}$$

$$C^{q}_{cost} = \left( \sum_{j=1}^{K_q} C^{q}_{m_j} \right) / SV^{q}_{l} \quad (\in [0, 1]) \tag{21}$$

Equations (20) and (21) represent the average loading capacity rate over all trips. A higher value is always preferable for transport efficiency. Here, $C^{q}_{m_j}$ is the capacity of an order $O^{q}_{m_j}$ in $X_q$, and $SV^{q}_{l}$ is the capacity of the vehicle carrying out the tour $Y^{q}_{r}$.

(3) Working Balance

$$B_{cost} = \sum_{r=1}^{R} \left| B^{r}_{cost} - B^{mean}_{cost} \right| / R \Rightarrow \min \tag{22}$$

$$B^{r}_{cost} = \sum_{i=1}^{K_r} T^{q_i}_{cost} \tag{23}$$

$$B^{mean}_{cost} = \sum_{r=1}^{R} B^{r}_{cost} / R \tag{24}$$

The working balance of each vehicle is also considered, which is important for equal labor conditions among all drivers. $B^{r}_{cost}$ is the working time of a tour $Y_r$, and $T^{q_i}_{cost}$ is the working time of a trip $X^{r}_{q_i}$, defined by Eq. (16).

(4) Working Capacity

$$V_{cost} = L - R \Rightarrow \max \tag{25}$$

To decrease the transport cost, a suitable (smaller) number of vehicles should be used according to the number of orders. Equation (25) is called the working capacity.

(5) Service 1 (User Time Service)

$$S_T = \sum_{q=1}^{Q} S^{q}_{T} / Q \Rightarrow \max \tag{26}$$

$$S^{q}_{T} = \sum_{j=1}^{K_q} \mu^{m_j}_{O} / K_q \tag{27}$$

An order from a user usually has an appointed time for delivery, which must be satisfied as much as possible. $\mu^{m}_{O}$ is taken as the satisfaction degree of the user $U^{m}_{n}$ with the delivery time of $O_m$, which can be calculated by the fuzzy membership function shown in Fig. 2. Although Eq. (26) need not always be fully satisfied, it is preferable to improve this value as much as possible.

Fig. 2 Fuzzy membership function for service 1 (membership $\mu^{m}_{O}$ with levels 1.0 and 0.3 plotted against time t; marks on the time axis: $Bt_n$, $Bt_m$, $Et_m$, $Et_n$)

(6) Service 2 (Driver Working Service)

In a modern society, the driver's labor condition also has to be considered. This is called service 2, a constraint on vehicle working, which is formulated by Eq. (28). A driver's labor condition for a vehicle $V_l$ is represented as a fuzzy membership function $\mu^{l}_{V}$ (Fig. 3). This constraint is not as important as service 1, but it can be improved by considering a trade-off between the number of vehicles used and the setting of the restriction working time.

$$S_V = \sum_{l=1}^{L} \mu^{l}_{V} / L \Rightarrow \max \tag{28}$$

Fig. 3 Fuzzy membership function for service 2 (membership $\mu_V$ with levels 1.0 and 0.2 plotted against time t; marks on the time axis: RWt (restriction) and EWt (extension))
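The two service measures can be sketched in Python as follows. The exact shapes of the membership functions are not recoverable from the text, so the shapes used here are assumptions: for service 1, a step function that is 1.0 inside the order's desired window, 0.3 inside the user's business hours but outside that window, and 0 otherwise; for service 2, full satisfaction up to the restriction time RWt, then a linear decrease to 0.2 at the end of the extension EWt (treated as a duration), and 0 beyond. The function names are hypothetical.

```python
def service1_membership(t, bt_m, et_m, bt_n, et_n, outside_level=0.3):
    """Assumed shape of mu_O^m: satisfaction with delivery time t (minutes)."""
    if bt_m <= t <= et_m:        # inside the order's desired time window
        return 1.0
    if bt_n <= t <= et_n:        # inside business hours only
        return outside_level
    return 0.0


def service2_membership(work_time, rwt, ewt, floor_level=0.2):
    """Assumed shape of mu_V^l: satisfaction with a vehicle's working time."""
    if work_time <= rwt:         # within the restriction time
        return 1.0
    if work_time <= rwt + ewt:   # within the allowed extension
        frac = (work_time - rwt) / ewt
        return 1.0 - (1.0 - floor_level) * frac
    return 0.0


def mean_satisfaction(values):
    """Average of membership values, as in Eqs. (26)-(28)."""
    return sum(values) / len(values)
```

For example, with a restriction time of 480 min and an extension of 240 min, a vehicle working 600 min would receive a satisfaction of about 0.6 under these assumed shapes.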


2.5 Objectives Solving VRSDP/SD Problem

The solution of the VRSDP/SD problem has to be examined based on the following criteria:

(1) Total working time ⇒ Min
(2) Average loading capacity ⇒ Max
(3) Working balance ⇒ Min
(4) Working capacity ⇒ Max
(5) Service 1 ⇒ Improved
(6) Service 2 ⇒ Improved
(7) Constraints ⇒ Satisfied.
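The crisp criteria (1)–(4) follow directly from Eqs. (15)–(25). The sketch below shows one way to compute them in Python; the function and parameter names are hypothetical, and the running time of a trip is assumed to be already summed from the network travel times.

```python
def trip_time(running, ready, t_d, t_u, loads, waits):
    """T_cost^q of Eq. (16): running + loading (18) + unloading (19)."""
    loading = ready + t_d * sum(loads)
    unloading = sum(w + t_u * c for w, c in zip(waits, loads))
    return running + loading + unloading


def loading_rate(loads, vehicle_capacity):
    """C_cost^q of Eq. (21): carried capacity over vehicle capacity."""
    return sum(loads) / vehicle_capacity


def working_balance(tour_times):
    """B_cost of Eqs. (22)-(24): mean absolute deviation of tour times."""
    mean = sum(tour_times) / len(tour_times)
    return sum(abs(b - mean) for b in tour_times) / len(tour_times)


def working_capacity(total_vehicles, tours):
    """V_cost of Eq. (25): number of vehicles left unused."""
    return total_vehicles - tours
```

For instance, two tours of 100 and 140 minutes have a mean of 120 minutes and therefore a working balance of 20 minutes.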

3 HIMS Calculation Model

A study of the VRSD problem shows that there are several levels, in each of which one of the main elements (orders, trips, tours) is very active. This fact is utilized to build a calculation model with HIerarchical Multiplex Structure for the VRSD problem, called the HIMS model, which is described in Fig. 4. There are three levels in the HIMS model: the Atomic level is an active area of system energy (e.g. running cost); the Molecular level is a reflection area of system properties (e.g. the base jobs); and the Individual level is a forming area of the system architecture (a state with best planning), in which the objective character can be adjusted exactly by means of inference from a knowledge base.

Fig. 4 A concept of hierarchical multiplex structure (Atomic, Molecular, and Individual levels linking orders, users, trips from the depot D, and vehicles $V_1, \dots, V_L$)

Many experts who handle the VRSD problem in the real world also make plans by hierarchical procedures. For example, they first make trips from orders and construct tours from trips, and then make a qualitative analysis of the relations among vehicle-tour, tour-trip, and so on. They repeat these processes until an ideal objective value is reached. The HIMS model imitates this hierarchical calculation and the reasoning of human experts, and solves a VRSD problem by dividing it into sub-problems, each of which is solved separately in a different level. It is a flexible and efficient calculation model. In this section, the operating strategy in each level of the HIMS model is described, and the HIMS model is constructed by object-oriented programming. The formulas for the evaluation and selection of all objectives are also defined, and an optimization algorithm based on meta-programming is proposed.

3.1 Strategy in Atomic Level of HIMS

The main purpose in the Atomic level is to make trips. The criteria for operations are: (1) minimizing the working time; (2) improving the loading capacity; (3) improving the satisfaction degree of Service 2.

$$TSP(X_q) = \exists\, (D, [J^{q}_{m_1}, J^{q}_{m_2}, \dots, J^{q}_{m_{K_q}}]^{*}, D) \Rightarrow \min T^{q}_{cost} \tag{29}$$

At first, M trips (one order in one trip) are made as an initial state, and the order Move/Exchange operations are performed between trips. After each operation, the shortest route calculation [TSP: Traveling Salesman Problem, Eq. (29)] is done for the updated trips, and a feasible solution that decreases the working time is searched for. The operations and criteria in the Atomic level are described in Fig. 5.

Fig. 5 Operations and criteria in atomic level

Fig. 6 Operating examples in atomic level (job charts of trips $X_{q1}$, $X_{q2}$ and the updated trips $X'_{q1}$, $X'_{q2}$ after order Move/Exchange operations)

A heuristic method such as Tabu search [5, 6, 13] is used for the order Move/Exchange operations, and SA (Simulated Annealing) [11, 12] is used for shortest routing as the optimization algorithm in the Atomic level. Tabu search serves as a feasible method for trip construction, and SA as a fast and efficient method for searching for the shortest route in a trip. Figure 6 shows operating examples that decrease the working time and raise the loading capacity in the VRSD problem.
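To make the shortest-route step concrete, here is a minimal simulated annealing sketch for ordering the stops of one trip. It is not the authors' implementation: the segment-reversal move, the geometric cooling schedule, and all parameter values are assumptions chosen for illustration, and `dist` is a hypothetical travel-time matrix in which index 0 is the depot.

```python
import math
import random


def route_cost(route, dist):
    """Running time of depot -> stops in order -> depot (index 0 = depot)."""
    path = [0] + list(route) + [0]
    return sum(dist[a][b] for a, b in zip(path, path[1:]))


def anneal_route(stops, dist, t0=10.0, cooling=0.95, steps=500, seed=0):
    """Search for a short visiting order of `stops` by simulated annealing."""
    rng = random.Random(seed)
    current = list(stops)
    best = list(current)
    if len(current) < 2:
        return best
    temp = t0
    for _ in range(steps):
        # Propose a neighbor by reversing a random segment of the route.
        i, j = sorted(rng.sample(range(len(current)), 2))
        candidate = current[:i] + current[i:j + 1][::-1] + current[j + 1:]
        delta = route_cost(candidate, dist) - route_cost(current, dist)
        # Always accept improvements; accept worse routes with a
        # probability that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
            if route_cost(current, dist) < route_cost(best, dist):
                best = list(current)
        temp = max(temp * cooling, 1e-9)
    return best
```

Because the best route seen so far is tracked separately, the returned route is never worse than the initial ordering, even when the search accepts temporarily worse candidates.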

3.2 Strategy in Molecular Level

In the Molecular level, the aim is to construct tours. Because there are different types of vehicles, the trip Move/Exchange operations have to be performed for efficient job scheduling and vehicle working balance. The criteria of operations are: (1) improving loading capacity; (2) improving working balance; (3) improving the satisfaction degree of Service 1.

As an initial state, L tours (the same as the number of working vehicles) are made from the trips constructed in the Atomic level. Then, the trip Move/Exchange operations between tours can be done. After each operation, the updated tour is readjusted in order to rationalize the job schedule of the trips in it. The operations and criteria are described in Fig. 7. Tabu search is used for the trip Move/Exchange operations, as in the Atomic level. Furthermore, a global search is used for improving Service 1, because the number of trips in a tour is small. As noted in the previous subsection, Tabu search is also a feasible method for tour construction, and recursive programming for a global search is useful for adjusting the time schedule of each tour toward a rational state. Figure 8 shows examples in the Molecular level.

Fig. 7 Operations and criteria in molecular level

Fig. 8 Operating examples in molecular level (job charts of tours $Y_a$, $Y_b$ and the updated tours $Y'_a$, $Y'_b$ over the AM and PM periods)

3.3 Fuzzy Inference in Individual Level

The aims of operations in this level are as follows: (1) improving working balance; (2) assigning an appropriate number of vehicles. The trip balance and the vehicle balance are important elements for dispatching vehicles of different types. The trip balance (the assignment of trips to vehicles of different types) affects delivery efficiency, and the vehicle balance (the allotment of vehicles within the same type) influences delivery cost. In the Individual level, the experience of experts is absorbed and fuzzy inference is used to improve these two operations.

Figure 9 is a fuzzy membership function for the trip balance operation between different vehicle types. The datum lines of trip balance, μL (lower boundary) and μU (upper boundary), are given. μT (the current value of trip balance) can be calculated and is used to decide whether to adjust the trip balance or not. Figure 10 is a fuzzy membership function for the vehicle balance operation within the same type. The datum lines of vehicle balance, λL (lower boundary) and λU (upper boundary), are also given. λV (the current value of vehicle balance) can be obtained and is used to decide whether to adjust the vehicle balance or not.

Figure 11 represents the inference criteria and operations in the Individual level. By trips ± and vehicles ± operations, not only can the relations among trips and tours be adjusted, but the vehicles of different types can also be dispatched appropriately. The role of the Individual level is like a central control system, which can adjust the balance between {X} and {Y} and invoke the operations of the lower levels to move toward a better state. Figure 12 shows examples of inference operations that adjust the trip working balance and the vehicle balance.

Fig. 9 Trip balance operation (membership μT plotted against the ratio $T^{A}_{cost} / T^{B}_{cost}$, with datum lines μL and μU around 1.0; no operation is performed between them)

Fig. 10 Vehicle balance operation (membership λV with datum lines λL and λU around 1.0; no operation is performed between them)

Fig. 11 Fuzzy inference operations
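The decision made with the datum lines can be illustrated by a crisp simplification in Python. The sketch assumes that the balance value is a ratio of the working times of two groups A and B (as the axis label of Fig. 9 suggests), that no operation is needed while the ratio stays between the lower and upper datum lines, and that work is shifted toward the less loaded group otherwise. The direction of the adjustment and the function name are assumptions; the default datum values 0.9 and 1.1 are those listed in Table 3.

```python
def balance_action(ratio, lower=0.9, upper=1.1):
    """Decide a balance operation from the ratio T_A / T_B (assumed rule)."""
    if ratio < lower:
        # Group A works noticeably less than group B.
        return "shift work toward A"
    if ratio > upper:
        # Group A works noticeably more than group B.
        return "shift work toward B"
    return "no operation"
```

A ratio of 1.0 therefore leaves the plan untouched, while 0.8 or 1.25 would trigger an adjustment.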

Fig. 12 Inference examples in individual level

3.4 Evaluation and Selection

In this subsection, the evaluation of the temporary state in the HIMS model is introduced. The selection criteria for the local optimum state of {X} and {Y} are also described.

3.4.1 Evaluation of Objective State

(1) Evaluation of Total Working Cost

$$g_1(X^k) = 1 - T^{[k]}_{cost} / T^{[0]}_{cost} \in [0, 1] \tag{30}$$

(2) Evaluation of Average Loading Capacity

$$g_2(X^k) = C^{[k]}_{cost} \in [0, 1] \tag{31}$$

(3) Evaluation of Working Balance

$$g_3(Y^k) = 1 - B^{[k]}_{cost} / B^{[0]}_{cost} \in [0, 1] \tag{32}$$

(4) Evaluation of Working Capacity

$$g_4(Y^k) = 1 - 1 / (1 + V^{[k]}_{cost}) \in [0, 1] \tag{33}$$

(5) Evaluation of Service 1

$$g_5(X^k) = S^{[k]}_{T} \in [0, 1] \tag{34}$$

(6) Evaluation of Service 2

$$g_6(Y^k) = S^{[k]}_{V} \in [0, 1] \tag{35}$$

Equations (30)–(35) describe the evaluation criteria of each objective state, where the superscript [0] indicates the initial value of each objective state and the superscript [k] represents the state value of the kth generation (a temporary result). The higher the value of $g_x$, the better the objective state.

3.4.2 Total Quality Evaluation

$$g(X^k, Y^k) = \sum_{i=1}^{6} \rho_i \, g_i(X^k | Y^k); \quad \rho_i \in R^{+} \tag{36}$$

Equation (36) describes a general evaluation formula for all objective items, where $\rho_i$ is a weighting coefficient that represents the importance of each objective item.
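As a worked check of Eqs. (30)–(36), the short Python sketch below computes the six objective scores and their weighted sum. The function names are hypothetical, and percentages are assumed to enter as fractions in [0, 1]. With the initial values of Table 4 and the result of Table 7, this reading reproduces the reported total quality of about 957.14.

```python
def objective_scores(t, c, b, v, s_t, s_v, t0, b0):
    """Return [g1, ..., g6] following Eqs. (30)-(35)."""
    return [
        1.0 - t / t0,            # g1: total working cost vs. initial value
        c,                       # g2: average loading capacity rate
        1.0 - b / b0,            # g3: working balance vs. initial value
        1.0 - 1.0 / (1.0 + v),   # g4: working capacity (unused vehicles)
        s_t,                     # g5: service 1
        s_v,                     # g6: service 2
    ]


def total_quality(scores, weights):
    """Weighted sum of Eq. (36)."""
    return sum(w * g for w, g in zip(weights, scores))


# Initial values from Table 4 and the result of experiment (2), Table 7.
scores = objective_scores(t=5095, c=0.9391, b=14.69, v=3,
                          s_t=0.9688, s_v=0.9325, t0=7937, b0=107.4)
g_star = total_quality(scores, [800, 300, 100, 150, 100, 100])
```

The dominant contributions come from the working time and the loading rate, which carry the largest weights in that experiment.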

3.4.3 Selection of Local Optimum State

$$eval(k) = \frac{g(X^k, Y^k) - g(X^0, Y^0)}{g(X^*, Y^*) - g(X^0, Y^0)} \tag{37}$$

$$g(X^*, Y^*) = \begin{cases} g(X^k, Y^k) & \text{if } eval(k) > 1 \\ g(X^*, Y^*) & \text{otherwise} \end{cases} \quad \text{s.t. } g_1 \ge a_1,\ g_2 \ge a_2,\ g_3 \ge a_3,\ g_4 \ge a_4,\ g_5 \ge a_5,\ g_6 \ge a_6 \tag{38}$$

Equation (37) defines the fitness of the HIMS model, which can be used to capture a local optimum state of the VRSDP/SD problem. Equation (38) indicates how to decide whether to record a better state of X and Y, where $a_i$ (i = 1, …, 6) is the lowest goal value for each objective item.
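The selection rule of Eqs. (37) and (38) can be sketched in Python as below. The function name and the return convention are hypothetical. The handling of the case in which no improvement over the initial state has been recorded yet (where the denominator of Eq. (37) is not positive) is an assumption added to keep the sketch well defined.

```python
def select_best(g_k, g_best, g_0, scores_k, goals):
    """Return (new best value, whether generation k was recorded)."""
    # Constraint part of Eq. (38): every objective must reach its goal a_i.
    if any(g < a for g, a in zip(scores_k, goals)):
        return g_best, False
    denom = g_best - g_0
    if denom <= 0:
        # Assumption: before any improvement exists, compare values directly.
        improved = g_k > g_best
    else:
        improved = (g_k - g_0) / denom > 1.0   # eval(k) > 1, Eq. (37)
    return (g_k, True) if improved else (g_best, False)
```

With the lowest goal values of Table 3, `goals` would be `[0.2, 0.7, 0.2, 0, 0.8, 0.6]`.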

3.5 The Implementation of HIMS Model

A data structure for the HIMS model, which is suitable for object-oriented programming, is proposed. An optimization algorithm with a multiplex heuristic method is also described; its efficiency for the VRSDP/SD problem is examined in Sect. 4.

3.5.1 Data Structure of HIMS Model

Figure 13 shows a data structure for the HIMS model. The object-oriented programming technique is adopted to implement the hierarchical calculations of the three levels. Through object classes and doubly linked pointers, the main elements in the HIMS model have hidden link relations with each other, so that information can be obtained directly. The operations in each level (Atomic, Molecular, and Individual) are performed by meta-programming and fuzzy inference. The elements operated on in each level are sequences of data objects, i.e., an order sequence in the Atomic level, a trip sequence in the Molecular level, and a tour sequence in the Individual level, respectively. These elements have hierarchical link relations through doubly linked pointers.

The data structure for the HIMS model serves to resolve a synthetic problem into sub-problems through hierarchical levels and multiplex data structures. Consequently, the NP problems can be solved in the lower levels by meta-programming, and the relations between elements in the upper levels can be adjusted by fuzzy inference. It is a fast, efficient, and flexible solver for the VRSDP/SD problem.

Fig. 13 The data structure of HIMS model (Individual level: vehicle and tour objects handled by fuzzy inference; Molecular level: tour and trip objects handled by Tabu search / global search; Atomic level: trip and order/user objects handled by Tabu search / simulated annealing; all initialized from a base object)
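The doubly linked object hierarchy can be pictured with a small Python sketch, in which each container keeps a list of its children and each child keeps a back-reference to its container. The class and method names are hypothetical and much simpler than a real implementation would be; the sketch only shows how an order Move operation keeps both directions of the links consistent.

```python
class Order:
    def __init__(self, name):
        self.name = name
        self.trip = None      # back-link to the owning trip


class Trip:
    def __init__(self, name):
        self.name = name
        self.orders = []      # forward links to orders (Atomic level)
        self.tour = None      # back-link to the owning tour

    def add(self, order):
        """Move an order into this trip, detaching it from its old trip."""
        if order.trip is not None:
            order.trip.orders.remove(order)
        order.trip = self
        self.orders.append(order)


class Tour:
    def __init__(self, name, vehicle):
        self.name = name
        self.vehicle = vehicle
        self.trips = []       # forward links to trips (Molecular level)

    def add(self, trip):
        """Move a trip into this tour, detaching it from its old tour."""
        if trip.tour is not None:
            trip.tour.trips.remove(trip)
        trip.tour = self
        self.trips.append(trip)
```

With such links, the vehicle serving a given order is reachable directly as `order.trip.tour.vehicle`, without searching the whole plan.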

3.5.2 Optimization Algorithm of HIMS

The optimization algorithm for the data structure described in the previous subsection is proposed. Figure 14 shows its flow chart. A heuristic method such as Tabu search is used to construct {X} in the Atomic level (routing problem) from the information of {O}, {U}, and D. From the information of {X} and {V}, {Y} can also be made by Tabu search in the Molecular level (scheduling problem). Based on an analysis of the relations among {X}, {Y}, and {V}, a synthetic adjustment is done by fuzzy inference in the Individual level (dispatching problem). As the result of these operations, the best state for the VRSDP/SD problem can be obtained.

The calculation evolves step by step through the repetition of the four processes shown in Fig. 14. One set of these four processes is considered as one generation of the best state. In the first process, a search is made in the Atomic level NA(k) times to get a local optimum set of trips. In the second process, a set of tours is constructed and adjusted with NM(k) searches from the resulting trip set. In the third process, when a new local optimum state of X and Y has been obtained, it is evaluated by Eq. (36), and whether the local optimum state is selected or not is decided by Eqs. (37) and (38). In the fourth process, fuzzy inference is performed to readjust the balance between {X} and {Y}, and the calculation turns to the next generation. These processes are repeated until there is no change in the direction of the best state (X*, Y*).

Fig. 14 Algorithm of HIMS model (flow chart: orders O, users U, and depot D feed the trip construction X (control); trips and vehicles V feed the tour construction Y (adjust); g(X) and g(Y) are measured and the best state Opt* is selected; fuzzy inference with a knowledge base modifies the state)
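The four processes of one generation can be arranged as the loop sketched below. This is only a skeleton under stated assumptions: the four callables stand in for the Atomic-level search, the Molecular-level search, the evaluation of Eq. (36), and the fuzzy inference step; the selection is simplified to "keep the state with the highest value"; and a fixed number of generations replaces the stopping rule. All names are hypothetical.

```python
def run_hims(search_trips, search_tours, evaluate, infer, state,
             max_generations=200):
    """Repeat the four processes and return (best state, best value)."""
    best_state, best_value = state, evaluate(state)
    for _ in range(max_generations):
        state = search_trips(state)    # process 1: Atomic-level search
        state = search_tours(state)    # process 2: Molecular-level search
        value = evaluate(state)        # process 3: evaluation and selection
        if value > best_value:
            best_state, best_value = state, value
        state = infer(state)           # process 4: Individual-level inference
    return best_state, best_value
```

Keeping the best state separate from the current state lets the inference step move the plan away from a local optimum without losing the best plan found so far.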

4 Experiments and Discussion

The HIMS model is applied to a test data set used in an actual oil company, where 39 gasoline stations in the Izu peninsula area are serviced from an oil center by a transport company with 10 tank lorries. A database is constructed to store the static information such as order/user, vehicle, depot, and base/initial information. The intermediate states of all objects are stored in tabular form in the database. The transportation network in the Izu peninsula of Japan is also known from a digital map (GIS), from which the cost information can be calculated by the Dijkstra algorithm.

In this section, two experiments are introduced. Then, the efficiency and flexibility of the HIMS model for the VRSDP/SD problem are evaluated from several viewpoints. Experiments are performed as follows. First, the vehicle working condition and some initial information are given as input, and then the calculation is run for 200 generations. Table 3 shows the parameter settings in the experiments. The {ai (i = 1~6)} are the desired lowest values for the objectives defined in Eqs. (30)–(35). Table 4 shows the initial values of the six objective functions defined by Eqs. (15)–(28). The visual job chart (Fig. 15) corresponds to the initial state of trips and tours in the case of Table 4.

Table 3 Parameters used in HIMS model
Lowest value | a1 | a2 | a3 | a4 | a5 | a6
 | 0.2 | 0.7 | 0.2 | 0 | 0.8 | 0.6
Datum value | μL | μU | λL | λU
 | 0.9 | 1.1 | 0.9 | 1.1

Table 4 Initial values of objective functions
$T^{[0]}_{cost}$ (min) | $C^{[0]}_{cost}$ (%) | $B^{[0]}_{cost}$ (min) | $V^{[0]}_{cost}$ (unit) | $S^{[0]}_{T}$ (%) | $S^{[0]}_{V}$ (%)
7937 | 50.65 | 107.4 | 0 | 87.4 | 13.26 (#1), 57.03 (#2)
#1: value for exp. (1); #2: value for exp. (2)

Fig. 15 Job chart of initial state for trips and tours

4.1 Experiment (1): Working with All Vehicles

In this experiment, the restriction time (RWt) is set to 8 h and the max extension time (EWt) to 4 h. The aim of this experiment is to make a plan in which all vehicles work. Table 5 shows the experimental result when the weighted coefficients {ρi} are different from each other. The job chart in this case is shown in Fig. 16. The best state is recorded in the 161st generation. Table 6 shows the experimental result when uniform weighted coefficients {ρi} are used. The job chart in this case is shown in Fig. 17. The best state is recorded in the 169th generation.

Table 5 Result of experiment (1) with different ρi's
$T^{[*]}_{cost}$ (min) | $C^{[*]}_{cost}$ (%) | $B^{[*]}_{cost}$ (min) | $V^{[*]}_{cost}$ (unit) | $S^{[*]}_{T}$ (%) | $S^{[*]}_{V}$ (%)
5250 | 95.57 | 28.40 | 0 | 98.74 | 86.26
ρ1 | ρ2 | ρ3 | ρ4 | ρ5 | ρ6 | g*
800 | 300 | 100 | 0 | 100 | 100 | 816.12

Fig. 16 Job chart of exp. (1) with different ρi's

Table 6 Result of experiment (1) with uniform ρi's
$T^{[*]}_{cost}$ (min) | $C^{[*]}_{cost}$ (%) | $B^{[*]}_{cost}$ (min) | $V^{[*]}_{cost}$ (unit) | $S^{[*]}_{T}$ (%) | $S^{[*]}_{V}$ (%)
5371 | 91.64 | 24.54 | 0 | 98.8 | 82.94
ρ1 = ρ2 = ρ3 = ρ5 = ρ6 = 100, ρ4 = 0; g* = 382.15

Fig. 17 Job chart of exp. (1) with uniform ρi's

4.2 Experiment (2): Working with just Number

In this experiment, the restriction time (RWt) is set to 12 h and the max extension time (EWt) to 2 h for all vehicles. A plan is made with fewer vehicles while keeping all criteria at a high level.

Table 7 shows the experimental result when the weighted coefficients {ρi} are different from each other. The corresponding job chart is shown in Fig. 18. The best state is recorded in the 153rd generation. Table 8 shows the experimental result when the uniform weighted coefficients {ρi} are used. The corresponding job chart is shown in Fig. 19. The best state is recorded in the 145th generation.

Table 7 Result of experiment (2) with different ρi's
$T^{[*]}_{cost}$ (min) | $C^{[*]}_{cost}$ (%) | $B^{[*]}_{cost}$ (min) | $V^{[*]}_{cost}$ (unit) | $S^{[*]}_{T}$ (%) | $S^{[*]}_{V}$ (%)
5095 | 93.91 | 14.69 | 3 | 96.88 | 93.25
ρ1 | ρ2 | ρ3 | ρ4 | ρ5 | ρ6 | g*
800 | 300 | 100 | 150 | 100 | 100 | 957.14

Fig. 18 Job chart of exp. (2) with different ρi's

Table 8 Result of experiment (2) with uniform ρi's
$T^{[*]}_{cost}$ (min) | $C^{[*]}_{cost}$ (%) | $B^{[*]}_{cost}$ (min) | $V^{[*]}_{cost}$ (unit) | $S^{[*]}_{T}$ (%) | $S^{[*]}_{V}$ (%)
5143 | 95.92 | 9.67 | 3 | 96.96 | 91.25
ρ1 = ρ2 = ρ3 = ρ4 = ρ5 = ρ6 = 100; g* = 485.33

Fig. 19 Job chart of exp. (2) with uniform ρi's

4.3 Analysis of Experimental Results

Two types of experiments have been done to evaluate the HIMS model for the VRSDP/SD problem. In this subsection, the calculating processes recorded through job charts (this paper shows only the first and final charts because of the limitation of paper size) are analyzed in detail, and then the advantages of the HIMS model from the viewpoints of experimental results, algorithms, and system applications are discussed.

Table 9 Comparison between HIMS and experts
Exp. | Operator | Working cost | Loading rate | Working balance | Working capacity | Service one | Service two
1 | HIMS | 5250 | 95.57 | 28.40 | 0 | 98.74 | 86.26
1 | Experts | 5381 | 87.32 | 43.20 | 0 | 83.26 | 81.63
2 | HIMS | 5095 | 93.91 | 14.69 | 3 | 96.88 | 93.25
2 | Experts | 5273 | 85.62 | 58.34 | 1 | 85.68 | 77.26

4.3.1 Discussion About Experimental Results

The two experiments address common cases for transport companies (full working when demand is high, or partial working when periodical inspections, repairs, etc. are considered). Table 9 shows the results obtained by the HIMS model and by experts in the two experiments. In the six evaluation items, the results of the HIMS model, obtained under the conditions of Tables 5 and 7, are at least as good as the results by experts. In particular, HIMS is stronger in dispatching ability (working capacity), working balance, and loading capacity.

4.3.2 Discussion About Algorithms

Table 10 shows the differences in algorithms among the HIMS model, the conventional method, and the expert technique. The HIMS model is constructed by object-oriented programming, in which all objectives and constraints are represented by fuzzy set relations. The main elements are constructed as object classes so that all information of a certain object can be found directly. Therefore, it is convenient for meta-programming and fuzzy inference in the HIMS model.

Table 10 Comparison with algorithms
Algorithm | Process | Data type | Data structure | Optimal means
HIMS | Synthesis | Fuzzy [0,1] | Object-oriented | Meta + Fuzzy programming
Conventional | Individual | Crisp {0,1} | Linear | Meta or inter programming
Expert | Individual | Crisp {0,1} | Experience | -

The HIMS model uses fuzzy inference based on the experts' experience, so that the search for a new solution and the adjustment of the current balance can be done through intellectual and rational operations.

Table 11 Comparison with applications
Application | Input parameters | Objective functions | Objective balance | Process time | Flexibility
HIMS | Few | Many | Good | Fast | Good
Conventional | Many | Few | Bad | Slow | Bad

4.3.3 Discussion About System Application

Another point of interest in the experiments is how the HIMS model can be used as a software component in an application system. Table 11 shows the advantages of the HIMS model in practical system applications. The HIMS model can solve the VRSD problem faster than human experts. For example, veteran experts generally spend over half a day planning a problem of 300 orders with 30 vehicles, while the HIMS model can find a better plan for the same problem in 30 min on a personal computer.

5 Conclusion

A concept of the VRSD problem in the real world, including both the Combinatorial Optimization Problem (COP) and the Constraint Satisfaction Problem (CSP), has been introduced and is formalized in terms of fuzzy set theory. A calculation model with hierarchical multiplex structure, called the HIMS model, and its operation strategies based on heuristics, optimization, and fuzzy inference, have also been proposed. The HIMS model is constructed as a software component using object-oriented programming, and the corresponding optimization algorithm is represented through meta-programming and fuzzy inference.

Four experiments with two cases (full or partial working) using two measures (different or uniform weighted coefficients) are performed, where 39 gasoline stations are serviced by 10 or 7 tank lorries with daily deliveries for an oil company in the Izu peninsula area of Japan. Following that, a detailed analysis from the viewpoints of theory and practice is done. The results and the evaluations by experts working in the field confirm that the HIMS model is feasible, fast, and efficient and can be applied to the planning support system for the real-world VRSD problem.

Although the convergence of the HIMS model has not been demonstrated by mathematical analysis because of the complex relations between objectives and constraints, the operating and selecting criteria for the HIMS model guarantee that the value of the total quality evaluation continuously increases (as the running cost decreases) with an upper bound. The stability of the HIMS model in practice is also verified by the process information recorded in the experiments. The HIMS model gives a foundation for other problems, such as problems involving both COP and CSP. It will be able to cover similar real-world transportation problems by land (daily), air cargo (weekly), or shipping (monthly).

References

1. M.O. Ball, T.L. Magnanti, C.L. Monma, G.L. Nemhauser, Network routing, vol. 8 (Elsevier, Amsterdam, 1995)
2. G. Laporte, I.H. Osman, Routing problems: A bibliography. Ann. Oper. Res. 61, 227–262 (1995)
3. A.V. Breedam, Improvement heuristics for the vehicle routing problem based on simulated annealing. Eur. J. Oper. Res. 86, 480–490 (1995)
4. A.S. Alfa, S.S. Heragu, M. Chen, A 3-opt based simulated annealing algorithm for the vehicle routing problem. Comp. Ind. Eng. 21, 635 (1991)
5. E. Taillard, P. Badeau, A tabu search heuristic for the vehicle routing problem with soft time windows. Transp. Sci. 31, 170–186 (1997)
6. C. Duhamel, J.Y. Potvin, J.M. Rousseau, A tabu search heuristic for the vehicle routing problem with backhauls and time windows. Transp. Sci. 31, 49–59 (1997)
7. F. Leclerc, J.Y. Potvin, Genetic algorithms for vehicle dispatching. Int. Trans. Opl. Res. 4(5/6), 391–400 (1997)
8. R. Cheng, M. Gen, Vehicle routing problem with fuzzy due-time using genetic algorithms. J. Fuzzy Theory Syst. 7(5), 1050–1061 (1995)
9. L.D. Bodin, Twenty years of routing and scheduling. Oper. Res. 38, 571–579 (1990)
10. P.K. Bagchi, B.N. Nag, Dynamic vehicle scheduling: An expert systems approach. Int. J. Physical Dist. Log. Manag. 21(2), 10–18 (1991)
11. G. Laporte, The traveling salesman problem: An overview of exact and approximate algorithms. Eur. J. Oper. Res. 59, 231–247 (1992)
12. L. Ingber, Simulated annealing: Practice versus theory. Mathl. Comput. Modelling 18(11), 29–57 (1993)
13. F. Glover, E. Taillard, D.D. Werra, A user’s guide to tabu search. Ann. Oper. Res. 41, 3–28 (1993)

Designing the Researchers’ Management Decision Support System Based on Fuzzy Logic

Masuma Mammadova and Zarifa Jabrayilova

Abstract The article outlines the principles of designing a management decision support system for employees engaged in mental activity. The most important point in making these decisions is the evaluation of the employees’ performance. This problem is described as a multi-criteria ranking problem formed in an uncertain environment, and its fuzzy relation model is presented. A solution technique based on the additive aggregation method is proposed. The functional blocks of the researchers’ management decision support system, which relies on the evaluation results, are described together with their functioning principles and decision mechanisms.

Keywords Researcher · Activity assessment · Uncertainty · Fuzzy relation model · Additive aggregation · Staff evaluation

1 Introduction

The objectives of human resource management (HRM) are the basis of personnel policy. Solving these problems correctly and making objective and transparent HRM decisions allows the organization to achieve its global goals [1, 2]. In general, human resource management today becomes part of the strategy of the company or firm. In this case, the funds invested in the development of human resources turn into an investment rather than an expenditure [1]. The changes that have occurred in the labor market require major changes in the relationship with employees and in the policy of their recruitment, retention and motivation. In this regard, human resource management at the professional level has become a strong modern means used in HR. A fundamentally new attitude towards the personnel as a valuable resource of the organization highlights the importance of developing new conceptual approaches and technologies for HRM [2, 3]. Therefore, in recent years, computer technology has been increasingly used to solve HRM problems.

M. Mammadova · Z. Jabrayilova (corresponding author), Institute of Information Technology of Azerbaijan National Academy of Science, Baku, Azerbaijan
© Springer Nature Switzerland AG 2020. S. N. Shahbazova et al. (eds.), Recent Developments in Fuzzy Logic and Fuzzy Sets, Studies in Fuzziness and Soft Computing 391, https://doi.org/10.1007/978-3-030-38893-5_10


The success of the twenty-first century, the age of the information society and the knowledge-based economy, depends on the increasing productivity of mental activity and of the employees engaged in it. The valuable asset of any commercial or non-profit organization is its mental workforce and its productivity. The productivity of the employees engaged in mental activity is determined by the following six factors [4, 5]:

1. A clear answer to the question “What is the essence of the production task?”, i.e., how is the “result” of the activity of the relevant organization (firm, enterprise, field) defined?
2. Responsibility for productivity rests with each employee himself/herself, that is, employees are their own managers;
3. Uninterrupted innovative activity should be an integral part of the mental activity and must be included in the production task of any employee engaged in mental activity;
4. Employees engaged in mental activity always have to learn from and teach one another;
5. The productivity of employees engaged in mental activity is not measured only quantitatively, i.e., its quality covers a broader scope and is defined by many parameters;
6. Finally, to increase the productivity of employees involved in mental activity, they need to be regarded as “capital” rather than “expense” and should be treated accordingly. In this regard, employees should be encouraged to work in that organization and should regard it as the best option.

These factors represent the multi-dimensional, multi-criteria, and quantitative and qualitative character of the activity of the employees involved in mental activity. Depending on the field of the organization, the productivity of its employees engaged in mental activity is characterized by different parameters. An increase in productivity primarily requires the appropriate selection of these parameters and a precise assessment of activity.
Assessment of mental activity with the consideration of these aspects and requirements entails the objectiveness and transparency of the decisions made for the solution of the issues such as employees’ satisfaction, promotion, rewarding, stimulation, deployment and redeployment, etc. All this requires the development of more innovative approaches to the evaluation of the activity of the employees involved in mental activity, and the development of decision mechanisms that meet the requirements set forth. The article defines the specific features of the assessment of researchers’ activity and proposes a fuzzy relation model as a multi-criteria ranking problem formed in an uncertain environment. The solution technique of the problem is developed by taking into account the hierarchic nature of criteria based on the additive aggregation method. Referring to the evaluation results, the architectural principles and functional structure of the employees’ management decision support system are developed and the realization stages of the proposed approach are presented.


2 Modeling of Researchers’ Activity Assessment

Scientific employees, in particular researchers who are professionally engaged in scientific and scientific-technical activities, are regarded as resources of special importance for the industrialization of the country, the intellectual revival of the nation, interstate competitiveness in the economic, political and technological fields, and the enhancement of the innovation capabilities of the state. The establishment and strengthening of the knowledge society and the knowledge-based economy, its sustainable, dynamic and competitive development, and the realization of the prioritized role of science are determined by scientific achievements [6]. Thus, the study of the problems related to the management of employees engaged in scientific activity, the identification of the features characterizing their activities, and the objectivity and transparency of the employees’ management decisions require the development of mechanisms for activity assessment. An objective and truthful assessment of the productivity of mental activity, which supports the professional progress of scientific employees and the achievement of their goals in accordance with the prospective strategy, constitutes the basis of the personnel policy in this segment [5, 7, 8]. As mentioned above, when the researchers’ performance is evaluated, the direct subject of assessment is activity, and its quantitative assessment is problematic. The activity of scientific employees differs according to scientific areas (fundamental, applied and experimental research, etc.) and types (humanities, natural, medical, social and technical sciences). This difference is particularly evident in the analysis of the objective and subjective conditions of scientific and technical activity, and it complicates the assessment of the performance of scientific employees.

The most important feature characterizing the activity of scientific employees is that their achievements are difficult, and sometimes impossible, to measure with definite quantities. Sometimes the results of the activity are not revealed at once, but after some time, or even much later [4]. Therefore, the parameters selected to evaluate the activity of employees should ensure the democracy, transparency and objectivity of the assessment system, and provide the same fair attitude to all employees, enabling managerial decisions based on the evaluation results [9, 10]. The diversity of the activity of employees is reflected in the use of quantitative and qualitative criteria in the assessment, in their usually hierarchical character, and in their differing relevance. Analysis and evaluation of the activity based on such criteria requires a certain amount of time. The changes and external influences during this period do not allow for an exact depiction of the problem. The problem is thus posed in uncertain circumstances and becomes a problem of decision-making in a fuzzy environment [11–17]. Consequently, the evaluation of the activity of scientific employees requires the use of the fuzzy apparatus, taking into account the hierarchical and fuzzy nature of the criteria and the linguistic uncertainties in the formalization of expert knowledge. Such a formulation allows the problem to be reduced to a problem of fuzzy multicriteria optimization and ranking. Here, the optimization is not a matter of mathematical optimization, but a selection among the alternative options [13, 14].

Fuzzy multicriteria optimization techniques are based on the aggregation of an affiliation (membership) function by referring to the fuzzy relation model. According to the fuzzy relation model, if $X = \{x_1, x_2, \ldots, x_n\} = \{x_i,\ i = \overline{1, n}\}$ is a set of the alternatives, out of which the best one should be selected, and $K = \{k_1, k_2, \ldots, k_m\} = \{k_j,\ j = \overline{1, m}\}$ is a set of criteria characteristic to the alternatives ($K$ is a summarizing criterion), then the correspondence of these alternatives with the criteria can be presented by a two-dimensional matrix. Each element of this matrix is determined by the affiliation function $\varphi_{k_j}(x_i): X \times K \to [0, 1]$, which represents the extent of correspondence of the alternative $x_i$ with the criterion $k_j$ [15].
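As an illustration, the fuzzy relation matrix can be sketched as a nested mapping from alternatives to criteria. This is only a minimal sketch: the alternative and criterion labels, the numeric degrees, and the helper names `is_fuzzy_relation`, `phi` and `best_for_criterion` are hypothetical and do not come from the chapter.

```python
from typing import Dict

# Hypothetical data: three alternatives (employees) and two criteria.
# relation[x][k] stores the degree phi_k(x) in [0, 1].
relation: Dict[str, Dict[str, float]] = {
    "x1": {"k1": 0.98, "k2": 0.40},
    "x2": {"k1": 0.70, "k2": 0.70},
    "x3": {"k1": 0.40, "k2": 0.98},
}


def is_fuzzy_relation(rel: Dict[str, Dict[str, float]]) -> bool:
    """Check that all rows cover the same criteria and all degrees lie in [0, 1]."""
    rows = list(rel.values())
    if not rows:
        return False
    criteria = set(rows[0])
    return all(
        set(row) == criteria and all(0.0 <= v <= 1.0 for v in row.values())
        for row in rows
    )


def phi(rel: Dict[str, Dict[str, float]], x: str, k: str) -> float:
    """Degree to which alternative x corresponds to criterion k."""
    return rel[x][k]


def best_for_criterion(rel: Dict[str, Dict[str, float]], k: str) -> str:
    """Alternative with the highest degree for a single criterion k."""
    return max(rel, key=lambda x: rel[x][k])
```

Storing the relation row by row keeps each employee's degrees together, which matches the layout of a matrix with alternatives as rows and criteria as columns.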

3 The Technique for the Assessment of Researchers’ Activity Based on the Fuzzy Relation Model

The key stages of the solution of the problem of activity assessment of scientific employees are as follows:

(1) Formation of the structural scheme of the evaluation system, namely, the alternatives (the list of employees whose activity is evaluated), the evaluation criteria system, and the imposed restrictions and objectives;
(2) Selection of methods for the acquisition (selection of experts, expert evaluation, selection of quantitative and qualitative levels of criteria) and processing (mathematical representation of criteria, determination of the relative importance coefficients of the criteria) of initial information;
(3) Selection of a method that enables an integral evaluation of the results for the criteria out of a certain evaluation set.

Stage 1 identifies the following:

(1) $X = \{x_i,\ i = \overline{1, n}\}$ is a set of alternatives, more precisely, the employees engaged in scientific-theoretical, scientific-practical, practical and educational activities in a research institution;
(2) $K = \{K_j,\ j = \overline{1, m}\}$ is a set of criteria with different weights characteristic to the alternatives ($K$ is the summarizing criterion);
(3) Each criterion $K_j$, $j = \overline{1, m}$, is determined based on differently weighted sub-criteria that can be evaluated, i.e., $K_j = \{k_{jt},\ t = \overline{1, T}\}$.

Stage 2. A single quality measurement scale (SQMS) approach can be applied to the expert evaluation and the mathematical description of the sub-criteria characterizing the activity. This approach:

a. selects an SQMS representing the transition from linguistic values (for example, 3-level: “good”, “normal”, “weak”), which correspond to the intensity levels (3, 5, 7, 9 levels are available) of the quality indicator in natural language and are defined within the interval [0, 1], to fuzzy numbers;
b. treats sub-criteria as linguistic variables, divides them into intensity levels in accordance with the SQMS, and adopts appropriate linguistic values and fuzzy numbers per level.

To determine the importance coefficients of the criteria, either the expert evaluation method with a 10-score system or the pairwise comparison method is used [18]. This stage defines the following:

(1) the affiliation functions of the alternatives to the assessable sub-criteria $k_{jt}$, $t = \overline{1, T}$, $j = \overline{1, m}$:

$$\{\varphi_{k_{j1}}(x_i), \varphi_{k_{j2}}(x_i), \ldots, \varphi_{k_{jT}}(x_i)\} = \{\varphi_{k_{jt}}(x_i),\ t = \overline{1, T}\},\quad j = \overline{1, m} \tag{1}$$

(2) the importance coefficients of the criteria, i.e.

$$\{w_1, w_2, \ldots, w_m\} = \{w_j,\ j = \overline{1, m}\} \tag{2}$$

and the importance coefficients of the sub-criteria included in the same group:

$$\{w_{j1}, w_{j2}, \ldots, w_{jT}\} = \{w_{jt},\ t = \overline{1, T}\},\quad j = \overline{1, m} \tag{3}$$

where the condition $\sum_{t=1}^{T} w_{jt} = 1$ holds for the sub-criteria that characterize the same criterion.

The goal is to obtain a ranked list of employees based on the affiliation of each scientific employee to the summarizing criterion $K$, i.e., the determination of $\varphi_K(x_i) \to [0, 1]$. That is, $X : K \to X^*$, where $X^*$ is the ranked list of employees.

Stage 3. The method proposed for the evaluation of the activity of scientific employees requires an assessment of the alternatives, taking into account the hierarchical structure of the criteria and their differing importance. In this regard, the additive aggregation method is used and the problem is solved in the following sequence [19, 20]:

1. According to (1) and (3), the affiliation function of the alternative $x_i$ to $K_j$ is defined as:

$$\varphi_{K_j}(x_i) = \sum_{t=1}^{T} w_{jt}\, \varphi_{k_{jt}}(x_i). \tag{4}$$

2. According to $\varphi_{K_j}(x_i)$, $j = \overline{1, m}$, and (2), the affiliation function of all alternatives $x_i$, $i = \overline{1, n}$, to the summarizing criterion $K$ is defined as:

$$\varphi_K(x_i) = \sum_{j=1}^{m} w_j\, \varphi_{K_j}(x_i). \tag{5}$$

3. The alternative whose affiliation function to the summarizing criterion $K$ attains the maximum value is selected:

$$\varphi_K(x^*) = \max\{\varphi_K(x_i),\ i = \overline{1, n}\}.$$

The selected alternative is the “best” of the $n$ alternatives and ranks first in the list of alternatives ordered by the value of their affiliation function to the summarizing criterion $K$.
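The three steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the employees, degrees and weights are invented for illustration, the function names `weighted_sum` and `rank_alternatives` are hypothetical, and the criterion weights $w_j$ are assumed to sum to 1 (the chapter states this condition only for the sub-criteria weights $w_{jt}$).

```python
from typing import Dict, List, Tuple


def weighted_sum(weights: List[float], degrees: List[float]) -> float:
    """Additive aggregation: sum of w * phi over one group of (sub-)criteria."""
    if len(weights) != len(degrees):
        raise ValueError("weights and degrees must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("importance coefficients must sum to 1")
    return sum(w * d for w, d in zip(weights, degrees))


def rank_alternatives(
    sub_degrees: Dict[str, List[List[float]]],
    sub_weights: List[List[float]],
    criterion_weights: List[float],
) -> List[Tuple[str, float]]:
    """Apply formula (4) per criterion, then formula (5), then sort in decreasing order."""
    scores = {}
    for name, groups in sub_degrees.items():
        # Formula (4): one aggregated degree per criterion K_j.
        per_criterion = [weighted_sum(w, g) for w, g in zip(sub_weights, groups)]
        # Formula (5): degree with respect to the summarizing criterion K.
        scores[name] = weighted_sum(criterion_weights, per_criterion)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical example: two criteria with two sub-criteria each.
sub_weights = [[0.6, 0.4], [0.5, 0.5]]   # w_jt, each group sums to 1
criterion_weights = [0.7, 0.3]           # w_j, assumed to sum to 1
sub_degrees = {
    "x1": [[0.98, 0.70], [0.40, 0.70]],
    "x2": [[0.70, 0.70], [0.98, 0.98]],
    "x3": [[0.40, 0.40], [0.70, 0.40]],
}

ranking = rank_alternatives(sub_degrees, sub_weights, criterion_weights)
best_name, best_score = ranking[0]       # step 3: the alternative with the maximum degree
```

With these invented numbers, x2 ranks first (about 0.784), ahead of x1 (about 0.773) and x3 (about 0.445): a strong second criterion outweighs x1's advantage on the first one.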

4 Functional Blocks of Researchers’ Performance Assessment System

As noted above, the precise assessment of the activity of employees ensures the objectivity and transparency of their management decisions [10]. Based on the proposed technique, the functional scheme of the decision support system for the assessment of the activity of scientific employees and for their rewarding, promotion and re-deployment based on the evaluation results is illustrated in Fig. 1.

Interface ensures the communication between the system and the user. A user can select the following operating modes in the system via the interface:

– submitting the initial information on activity assessment;
– evaluating the activity and obtaining results;
– submitting the rules that shape the knowledge base for appropriate decision support;
– making appropriate management decisions for each employee.

Initial information processing defines the importance coefficients of criteria and sub-criteria, and generates their mathematical formulation based on the correspondence of employees’ activity to sub-criteria. The mathematical representation of sub-criteria in this block is illustrated in Table 1 in accordance with the proposed technique. There, using the 3-level SQMS, the mathematical description is given for the sub-criterion “Participation in the implementation of research” of the criterion “Scientific-theoretical activity”, which characterizes the activity of scientific employees.

The database (DB) includes the essential data required for the assessment of employees’ activity: the affiliation functions of each employee’s activity to sub-criteria, the importance coefficients of criteria and sub-criteria, the final estimations of employees’ activity (including the results of the periodic assessments), and the decisions related to rewarding, promotion or re-deployment of each employee.

Evaluation of Activity implements a mechanism for evaluating employees’ activity based on the proposed technique. The sequence of the evaluation process in this block is described in the following tables in accordance with the proposed technique.
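The mathematical description that the initial information processing block produces for a sub-criterion (Table 1) can be sketched as a lookup table. The intervals and fuzzy numbers below are the 3-level values of Table 1; the names `SQMS_3`, `to_fuzzy_number` and `to_linguistic_value` are hypothetical, and the inverse lookup is an added convenience that the chapter does not describe.

```python
from typing import Dict, Tuple

# 3-level SQMS from Table 1: linguistic value -> ((low, high) interval, fuzzy number).
SQMS_3: Dict[str, Tuple[Tuple[float, float], float]] = {
    "good": ((0.90, 1.00), 0.98),
    "normal": ((0.66, 0.89), 0.70),
    "weak": ((0.40, 0.65), 0.40),
}


def to_fuzzy_number(linguistic_value: str) -> float:
    """Map an expert's linguistic value to the fuzzy number of its level."""
    key = linguistic_value.strip().lower()
    if key not in SQMS_3:
        raise ValueError(f"unknown linguistic value: {linguistic_value!r}")
    return SQMS_3[key][1]


def to_linguistic_value(degree: float) -> str:
    """Hypothetical inverse lookup: find the level whose interval contains the degree."""
    for label, ((low, high), _) in SQMS_3.items():
        if low <= degree <= high:
            return label
    # The intervals of Table 1 leave gaps (for example below 0.40), so fail loudly.
    raise ValueError(f"degree {degree} is outside every interval of the scale")
```

For example, `to_fuzzy_number("Good")` yields 0.98, the value that would be stored in DB as the employee's degree for this sub-criterion.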

[Figure 1: block diagram. Blocks: User; Interface; Initial data processing block (results of an expert survey associated with the relative importance of criteria and sub-criteria, results of the assessment of employees, mathematical description of sub-criteria); Expert knowledge processing; Database (importance coefficients of criteria and sub-criteria, affiliation of employees’ activity to sub-criteria, evaluation results, decisions made); Evaluation of activity; Data analytics; Knowledge base; Decision-making.]

Fig. 1 Functional scheme of decision support system for the assessment of the activity of scientific employees

Table 2 builds a two-dimensional fuzzy relation matrix, which represents the degree of correspondence of each employee to the sub-criteria. The next step is to calculate the affiliation function of the employees’ activity to each criterion based on formula (4), using the relative importance coefficients of the sub-criteria stored in DB (Table 3). Referring to the results obtained and the relative importance coefficients of the criteria in DB, the final value of the employees’ activity is found based on formula (5), and the results are then forwarded to DB. An employee $x^*$ satisfying the condition $\varphi_K(x^*) = \max\{\varphi_K(x_i),\ i = \overline{1, n}\}$ ($n$ is the number of employees) is the most progressive employee of the institute according to the value of his/her performance with respect to the summarizing criterion $K$, and a list of employees ranked in decreasing order is obtained similarly. Data analytics (logical outcome) detects the facts established by the results obtained from the evaluation method (including the previous periodic evaluations), and


Table 1 Mathematical description of the sub-criterion “participation in the implementation of research” characterizing the criterion “scientific-theoretical activity”

| Linguistic variable (granulation of the sub-criterion) | Linguistic value | Fuzzy subset within the interval [0, 1] | Fuzzy number |
| (a) Takes an active part in scientific research | Good | [0.9–1] | 0.98 |
| (b) Takes a part in scientific research | Normal | [0.66–0.89] | 0.70 |
| (c) Takes part in scientific research partially | Weak | [0.40–0.65] | 0.40 |

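Table 2 The membership function of employees’ activity to sub-criteria (all columns belong to the summarizing criterion $K$)

| Alternatives | $K_1$: $k_{11}$ | … | $K_1$: $k_{1L}$ | … | $K_m$: $k_{m1}$ | … | $K_m$: $k_{mT}$ |
| $x_1$ | $\varphi_{k_{11}}(x_1)$ | … | $\varphi_{k_{1L}}(x_1)$ | … | $\varphi_{k_{m1}}(x_1)$ | … | $\varphi_{k_{mT}}(x_1)$ |
| … | … | … | … | … | … | … | … |
| $x_i$ | $\varphi_{k_{11}}(x_i)$ | … | $\varphi_{k_{1L}}(x_i)$ | … | $\varphi_{k_{m1}}(x_i)$ | … | $\varphi_{k_{mT}}(x_i)$ |
| … | … | … | … | … | … | … | … |
| $x_n$ | $\varphi_{k_{11}}(x_n)$ | … | $\varphi_{k_{1L}}(x_n)$ | … | $\varphi_{k_{m1}}(x_n)$ | … | $\varphi_{k_{mT}}(x_n)$ |

The last row of the table gives, for each group of columns, the aggregated value for the corresponding criterion: $\varphi_{K_1}(x_i)$, $i = \overline{1, n}$, …, $\varphi_{K_m}(x_i)$, $i = \overline{1, n}$.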

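Table 3 The membership function of alternatives to the generalized criterion $K$

| Alternatives | $K_1$ | … | $K_j$ | … | $K_m$ |
| $x_1$ | $\varphi_{K_1}(x_1)$ | … | $\varphi_{K_j}(x_1)$ | … | $\varphi_{K_m}(x_1)$ |
| … | … | … | … | … | … |
| $x_i$ | $\varphi_{K_1}(x_i)$ | … | $\varphi_{K_j}(x_i)$ | … | $\varphi_{K_m}(x_i)$ |
| … | … | … | … | … | … |
| $x_n$ | $\varphi_{K_1}(x_n)$ | … | $\varphi_{K_j}(x_n)$ | … | $\varphi_{K_m}(x_n)$ |

The last row of the table gives the value for the generalized criterion: $\varphi_K(x_i)$, $i = \overline{1, n}$.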


providing the employees’ management decisions by referring to other data stored in DB.

Knowledge base consists of the rules representing the managerial decisions in accordance with the affiliation of the employees’ activity to the criteria (or sub-criteria) and the summarizing criterion. The first part of the production rules, which are based on expert knowledge and described as “if …, then …”, corresponds to a specific fact based on the values of the criterion (or summarizing criterion, sub-criteria) that characterizes the activity for a particular decision, whereas the “result” part represents the management decision appropriate to the same fact.

The system capabilities. The employees’ activity evaluation system ensures:

(1) evaluating and ranking the results of each employee’s activity by each sub-criterion, each criterion, and finally, by the summarizing criterion (value of the activity);
(2) identifying the most progressed (or least progressed) employees of each department (laboratory, group, etc.);
(3) identifying the most progressed (or least progressed) department (laboratory, group, etc.) of the institute (research center, organization, etc.);
(4) identifying the most progressed (or least progressed) employee of the institute.

In accordance with these capabilities, intellectual decision support can be provided on the following issues:

1. Staff rewarding;
2. Staff re-positioning;
3. Staff discipline, etc.

The rules for staff rewarding are based on the proposed limits related to the amount of the award. It should be noted that the amount of the award to be presented to the employees corresponds to linguistic values on 4 levels: “very high”, “high”, “medium”, “low”. In this case, the rules for awarding can be described as follows:

Rule 1. If $\varphi(x_i) \in [0.90, 1]$, then the employee may be awarded a “very high” award;
Rule 2. If $\varphi(x_i) \in [0.75, 0.90)$, then the employee may be awarded a “high” award;
Rule 3. If $\varphi(x_i) \in [0.60, 0.75)$, then the employee may be awarded a “medium” award;
Rule 4. If $\varphi(x_i) \in [0.50, 0.60)$, then the employee may be awarded a “low” award;
Rule 5. If $\varphi(x_i) \in [0.30, 0.50)$, then the employee is not awarded;
Rule 6. If $\varphi(x_i) \in [0.00, 0.30)$, then the employee must be reviewed.

To support decisions on staff re-positioning, the profile of the department where the employee works (a department dealing with scientific research, a department dealing with scientific-practical activities, a serving department such as a library, multi-media or consulting service, or an education department) should be determined when the rule base is formed. The research department of the institute is conditionally denoted by $S_s$, the department dealing with scientific-practical activities by $S_p$, the serving department by $S_x$, and the education department by $S_t$. In this case, the following rules may be used to support the re-positioning of employees:

Rule 1. If $\varphi_{K_1}(x_i) \in [0.00, 0.30]$ and $\varphi_{K_3}(x_i) \ge 0.5$ and $x_i \in S_s$, then the re-positioning of this employee to the scientific-practical department can be reviewed;
Rule 2. If $\varphi_{K_1}(x_i) \in [0.00, 0.30]$ and $\varphi_{k_{33}}(x_i) \ge 0.75$ and $x_i \in S_s$, then the re-positioning of this employee to the education department can be reviewed, and so forth.

Here, $\varphi_{K_1}(x_i)$ is the affiliation function of the employee to the scientific-theoretical activity criterion, $\varphi_{K_3}(x_i)$ to the scientific-practical activity criterion, and $\varphi_{k_{33}}(x_i)$ to the pedagogical activity criterion. The formation and refinement of the knowledge base of the system based on the relevant rules is carried out within the framework of the relevant organizational norms and human resource management standards, and under the supervision of the Trade Union, in order to protect the rights and reputations of the employees in the research institution [21, 22].
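The two rule sets above can be sketched as plain threshold checks. This is a sketch only: the function names, the decision strings and the department label `"S_s"` are hypothetical, and the interval of award Rule 3 is read as [0.60, 0.75).

```python
from typing import List, Tuple

# Award rules 1-6 as (inclusive lower bound, decision), checked from the top down.
AWARD_RULES: List[Tuple[float, str]] = [
    (0.90, "very high award"),
    (0.75, "high award"),
    (0.60, "medium award"),   # assumes Rule 3 covers [0.60, 0.75)
    (0.50, "low award"),
    (0.30, "no award"),
    (0.00, "review"),
]


def award_decision(phi: float) -> str:
    """Return the decision of the first award rule whose lower bound phi reaches."""
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    for lower, decision in AWARD_RULES:
        if phi >= lower:
            return decision
    raise AssertionError("unreachable: the last rule starts at 0.0")


def repositioning_suggestions(
    phi_K1: float, phi_K3: float, phi_k33: float, department: str
) -> List[str]:
    """Re-positioning rules 1 and 2 for an employee of the research department S_s."""
    suggestions: List[str] = []
    if department == "S_s" and 0.0 <= phi_K1 <= 0.30:
        if phi_K3 >= 0.5:       # Rule 1
            suggestions.append("review re-positioning to the scientific-practical department")
        if phi_k33 >= 0.75:     # Rule 2
            suggestions.append("review re-positioning to the education department")
    return suggestions
```

Keeping the thresholds in a data table rather than in nested conditions makes it easier to revise the limits when the knowledge base is updated, which the chapter expects to happen under organizational supervision.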

5 Conclusion

The article outlined the scientific and methodological foundations of the system for the activity assessment of scientific employees. The proposed methodological approach allows evaluating all the indicators characterizing the activity of employees engaged in scientific activity, together with the importance of these indicators, and thereby evaluating their performance. This methodological approach can also be applied to assess the activity of employees working in government and commercial organizations, enterprises and offices, and to increase the efficiency of their management. This requires an in-depth and systematic study of the personnel’s potential and the consideration of all parameters, criteria and sub-criteria that characterize its activity. This methodological approach guarantees the objectivity and transparency of the decision-making process and the satisfaction of the employees by assessing their activity. The ranked list of employees obtained as the result of the evaluation may support enterprise managers in management decisions such as promotion, re-positioning (to another department), and the provision of qualification courses.

References

1. R.L. Mathis, J.H. Jackson, S.R. Valentine, Human Resource Management (Cengage Learning, 2014), 696p
2. M.H. Mammadova, Z.Q. Jabrayilova, F.R. Mammadzada, Fuzzy multi-scenario approach to decision-making support in human resource management, in Studies in Fuzziness and Soft Computing, vol. 342 (Springer International Publishing Switzerland, 2016), pp. 19–36
3. M.H. Mammadova, Z.G. Jabrayilova, Fuzzy multi-criteria method to support group decision making in human resource management, in Studies in Fuzziness and Soft Computing, vol. 361 (Springer International Publishing Switzerland, 2018), pp. 223–234
4. P.F. Drucker, Knowledge-worker productivity: the biggest challenge. Calif. Manage. Rev. 41(2), 79–94 (1999)
5. F.W. Taylor, The Principles of Scientific Management (Moscow, 1991). Electronic Publication: Center for Humanitarian Technologies. 03 June 2010. http://gtmarket.ru/laboratory/basis/3631
6. Law of the Republic of Azerbaijan on Science, 09 Aug 2016. https://president.az/articles/20785
7. V.E. Zlotnitsky, Factors of effective management of human resources of organization. Thesis in social sciences, specialty code HAC 22.00.08, Sociology of Management, 2008, p 190. www.dissercat.com/content/faktory-effektivnogo-upravleniya-chelovecheskimiresursami-organizatsii
8. M.H. Mammadova, Z.G. Jabrayilova, Application of fuzzy optimization method in decision-making for personnel selection. Intell. Control Autom. 5(4), 190–204 (2014)
9. Y.G. Odegov, K.K. Abdurakhmanov, L.R. Kotova, Evaluation of the Effectiveness of Personnel Work: A Methodological Approach (Publishing House AlfaPress, Moscow, 2011), 752p
10. I.F. Zaynetdinova, Evaluation of the Employees’ Activities of the Organization: Textbook (Yekaterinburg University Publishing House, 2016), 120p
11. L.A. Zadeh, Fuzzy sets. Inf. Control 8(3), 338–353 (1965)
12. L.A. Zadeh, The Concept of a Linguistic Variable and Its Application to Approximate Decision-Making (Mir, Moscow, 1976), 168p
13. O.I. Larichev, Theory and Methods of Decision-Making, Including the Chronicle of Events in the Magic Countries: Textbook, 2nd edn (Logos, Moscow, 2002), 392p
14. S.V. Miconi, Multi-criteria Selection Based on a Finite Set of Alternatives (SPb Publishing House Lan, 2009), 272p
15. M.G. Mammadova, Decision-Making Based on Knowledge Bases with a Fuzzy Relational Structure (Elm, Baku, 1997), p. 296
16. R. Bellman, L.A. Zadeh, Decision-making in fuzzy environment. Manage. Sci. 17, 141–164 (1970)
17. M.H. Mammadova, Z.G. Jabrayilova, Decision-making support in human resource management based on multi-objective optimization. TWMS J. Pure Appl. Math. 8(1), 53–73 (2018)
18. Z.G. Jabrayilova, S.N. Nobari, Processing methods of information about the importance of the criteria in the solution of personnel management problems and contradiction detection. Probl. Inf. Technol. Baku 1, 57–66 (2011)
19. J. von Neumann, O. Morgenstern, Theory of Games and Economic Behavior (Princeton University Press, Notable Centenary Titles, 2007), 776p
20. M.G. Mammadova, Z.G. Jabrayilova, Fuzzy logic in the assessment of human resources potential, in Management in Russia and Abroad, vol. 5, pp. 111–117 (2004)
21. HR management standards, Public Sector Standards in Human Resource Management 2001, Grievance Resolution and Performance Management Standards, Department of Education. http://hrcouncil.ca/resource-centre/hr-standards/documents/HRC-HR_Standards_Web.pdf
22. Public Sector Standards in Human Resource Management. Effective on and from 21 February 2011. Government of Western Australia. https://publicsector.wa.gov.au/sites/default/files/documents/hrm_standards_3.pdf