1. Introduction
The notions of entropy and mutual information are fundamental concepts in information theory [1]; they are used as measures of the information obtained from a realization of the considered experiments. The standard approach in information theory is based on the Shannon entropy [2]. Consider a finite measurable partition $\mathcal{A}$ of a probability space $(\Omega, S, P)$ with probabilities $p_1, \dots, p_n$ of the corresponding elements of $\mathcal{A}$. We recall that the Shannon entropy of $\mathcal{A}$ is the number $H(\mathcal{A}) = \sum_{i=1}^{n} F(p_i)$, where the function $F$ is defined by $F(x) = -x \log x$ if $x > 0$ and $F(0) = 0$. Perhaps the crucial step in applying the Shannon entropy in other scientific fields was the discovery of Kolmogorov and Sinai [3] (see also [4,5]). They showed the existence of non-isomorphic Bernoulli shifts, which describe independent repetitions of random experiments with finitely many outcomes. Isomorphic dynamical systems have the same Kolmogorov–Sinai entropy, so by constructing two Bernoulli shifts with different entropies Kolmogorov and Sinai showed that these shifts are non-isomorphic. It is natural that this modification of entropy has been used in many mathematical structures. In [6], we generalized the notion of Kolmogorov–Sinai entropy to the case where the considered probability space is a fuzzy probability space defined by Piasecki [7]. This structure can serve as an alternative mathematical model of probability theory for situations where the observed events are described vaguely (so-called fuzzy events). Other fuzzy generalizations of Shannon’s and Kolmogorov–Sinai’s entropy can be found, e.g., in [8,9,10,11,12,13,14,15,16,17]. It is known that there are many possibilities for defining operations with fuzzy sets; an overview can be found in [18]. It should be noted that while the model presented in [6] was based on the Zadeh connectives [19], in our recently published paper [14] the Lukasiewicz connectives were used to define the fuzzy set operations. In [20], the mutual information of fuzzy partitions of a given fuzzy probability space was defined. It was shown that the entropy of fuzzy partitions introduced and studied in [6] can be considered as a special case of their mutual information.
In classical information theory the mutual information is a special case of a more general quantity called Kullback–Leibler divergence (K–L divergence for short), which was originally introduced by Kullback and Leibler in 1951 [21] (see also [22]) as the divergence between two probability distributions. It plays an important role, as a mathematical tool, in the stability analysis of master equations [23] and Fokker–Planck equations [24], and in isothermal equilibrium fluctuations and transient nonequilibrium deviations [25] (see also [24,26]). In [27], we have introduced the concept of K–L divergence for the case of fuzzy probability spaces.
A natural generalization of some families of fuzzy sets is the notion of an MV algebra introduced by Chang [28]. An MV algebra is an algebraic structure which models the Lukasiewicz multivalued logic, and the fragment of that calculus which deals with the basic logical connectives “and”, “or”, and “not”, but in a multivalued context. MV algebras play a similar role in multivalued logic as Boolean algebras do in classical two-valued logic. Recall also that families of fuzzy sets can be embedded into suitable MV algebras. MV algebras have been studied by many authors (see, e.g., [29,30,31,32,33]) and, of course, there are also many results about the entropy on this structure (cf. [34,35]). The theory of fuzzy sets is a rapidly developing area of theoretical and applied mathematical research. In addition to MV algebras, generalizations of MV algebras such as D-posets (cf. [36,37,38]), effect algebras (cf. [39]), or A-posets (cf. [40,41]) are currently the subject of intensive research. Some results about the entropy on these structures can be found, e.g., in [42,43,44].
A special class of MV algebras is the class of product MV algebras. They have been introduced independently in [45] from the point of view of probability theory, and in [46] from the point of view of mathematical logic. Product MV algebras have been studied, e.g., in [47,48]. A suitable theory of entropy of Kolmogorov type for the case of product MV algebras has been constructed in [35,49,50].
The purpose of this contribution is to define, using the results concerning the entropy in product MV algebras, the concepts of mutual information and Kullback–Leibler divergence for the case of product MV algebras and to study properties of the suggested measures. The main results of the contribution are presented in Section 3 and Section 4. In Section 3 the notions of mutual information and conditional mutual information in product MV algebras are introduced and basic properties of the suggested measures are proved, inter alia, the data processing inequality for conditionally independent partitions. In Section 4 we define the Kullback–Leibler divergence in product MV algebras and its conditional version and examine the algebraic properties of the proposed measures. Our results are summarized in the final section.
2. Basic Definitions, Notations and Facts
In this section, we recall some definitions and basic facts which will be used in the following sections. An MV algebra [30] is a system $(M, \oplus, \odot, *, 0, 1)$, where $M$ is a non-empty set, $\oplus$, $\odot$ are binary operations on $M$, $*$ is a unary operation on $M$ and 0, 1 are fixed elements of $M$, such that the following conditions are satisfied:
- (i)
;
- (ii)
- (iii)
- (iv)
- (v)
- (vi)
- (vii)
- (viii)
- (ix)
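An example of an MV algebra is the real interval $[0,1]$ equipped with the operations $a \oplus b = \min(a+b, 1)$, $a \odot b = \max(a+b-1, 0)$, $a^* = 1-a$. It is interesting that any MV algebra has a similar structure. In fact, by the Mundici theorem [33] any MV algebra can be represented by a lattice-ordered Abelian group (shortly, Abelian l-group). Recall that an Abelian l-group is an algebraic system $(G, +, \le)$, where $(G, +)$ is an Abelian group, $(G, \le)$ is a partially ordered set being a lattice and $a \le b$ implies $a + c \le b + c$.
Let $(G, +, \le)$ be an Abelian l-group, 0 be the neutral element of $(G, +)$ and $u \in G$, $u > 0$. On the interval $[0, u] = \{a \in G;\ 0 \le a \le u\}$ we define the following operations: $a \oplus b = (a+b) \wedge u$, $a \odot b = (a+b-u) \vee 0$; $a^* = u - a$. Then the system $([0,u], \oplus, \odot, *, 0, u)$ becomes an MV algebra. The Mundici theorem states that to any MV algebra $M$ there exists an Abelian l-group $G$ with a strong unit $u$ (i.e., to every $a \in G$ there exists a positive integer $n$ with the property $a \le nu$) such that $M$ is isomorphic to $([0,u], \oplus, \odot, *, 0, u)$.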
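As a small numerical illustration of the standard example on the unit interval, the following Python sketch implements the truncated sum, the Lukasiewicz product and the complement, and checks several familiar MV-algebra identities on a rational grid. The function names `oplus`, `odot`, `neg` and the grid are our own illustrative choices; the identities tested are the usual ones for this example and are not meant to reproduce the axiom list verbatim.

```python
from fractions import Fraction
from itertools import product

def oplus(a, b):
    """Truncated sum: min(a + b, 1)."""
    return min(a + b, 1)

def odot(a, b):
    """Lukasiewicz product: max(a + b - 1, 0)."""
    return max(a + b - 1, 0)

def neg(a):
    """Complement: 1 - a."""
    return 1 - a

# Exact rational grid in [0, 1], so identities can be compared with ==.
grid = [Fraction(k, 8) for k in range(9)]

def check_identities():
    """Check a few standard identities for every triple on the grid."""
    for a, b, c in product(grid, repeat=3):
        assert oplus(a, b) == oplus(b, a)                      # commutativity
        assert oplus(oplus(a, b), c) == oplus(a, oplus(b, c))  # associativity
        assert oplus(a, 0) == a and oplus(a, 1) == 1           # neutral and absorbing element
        assert neg(neg(a)) == a and oplus(a, neg(a)) == 1      # involution, a + a* = 1
        assert odot(a, b) == neg(oplus(neg(a), neg(b)))        # odot expressed via oplus
    return True
```

Exact fractions are used instead of floats so that equalities such as associativity are not disturbed by rounding.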
In this contribution we shall consider MV algebras with a product. We recall that the definition of a product MV algebra is based on Mundici’s categorical representation of an MV algebra by an Abelian l-group, i.e., the sum in the following definition of product MV algebra, and subsequently in the text that follows, means the sum in the Abelian l-group associated with the given MV algebra. Similarly, the element u is a strong unit of this group. More details can be found in [45,46].
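Definition 1. A product MV algebra is a couple $(M, \cdot)$, where $M$ is an MV algebra and $\cdot$ is a commutative and associative operation on $M$ satisfying the following conditions:
- (i) for any $a \in M$, $u \cdot a = a$;
- (ii) if $a, b, c \in M$, $a + b \le u$, then $c \cdot a + c \cdot b \le u$ and $c \cdot (a + b) = c \cdot a + c \cdot b$.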
In addition, we shall consider a finitely additive state defined on a product MV algebra.
Definition 2 [30]. Let $(M, \cdot)$ be a product MV algebra. A map $m: M \to [0,1]$ is said to be a state if the following properties are satisfied:
- (i) $m(u) = 1$;
- (ii) if $a, b \in M$, $a + b \le u$, then $m(a + b) = m(a) + m(b)$.
In product MV algebras a suitable entropy theory has been provided in [35,49,50]. In the following we present the main idea and some results of this theory which will be used in the contribution.
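Definition 3. By a partition in a product MV algebra $(M, \cdot)$ we mean a finite collection $A = \{a_1, \dots, a_n\} \subset M$ such that $a_1 + \dots + a_n = u$.
Let $m$ be a state on a product MV algebra $(M, \cdot)$. In the set of all partitions of $(M, \cdot)$ the relation $\prec$ is defined in the following way: Let $A = \{a_1, \dots, a_n\}$ and $B = \{b_1, \dots, b_k\}$ be two partitions of $(M, \cdot)$. We say that $B$ is a refinement of $A$ (with respect to $m$), and write $A \prec B$, if there exists a partition $\{I(1), \dots, I(n)\}$ of the set $\{1, \dots, k\}$ such that $m(a_i) = \sum_{j \in I(i)} m(b_j)$ for every $i = 1, \dots, n$. Given two partitions $A$ and $B$ of $(M, \cdot)$, their join $A \vee B$ is defined as the system $A \vee B = \{a_i \cdot b_j;\ i = 1, \dots, n,\ j = 1, \dots, k\}$ if $A \ne B$, and $A \vee A = A$. Since $\sum_{i=1}^{n} \sum_{j=1}^{k} a_i \cdot b_j = \left(\sum_{i=1}^{n} a_i\right) \cdot \left(\sum_{j=1}^{k} b_j\right) = u \cdot u = u$, the system $A \vee B$ is a partition of $(M, \cdot)$ too. If $A_1, \dots, A_n$ are partitions in a product MV algebra $(M, \cdot)$, then we put $\vee_{i=1}^{n} A_i = A_1 \vee \dots \vee A_n$.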
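To make the notions of partition, join and refinement concrete, here is a hedged Python sketch in one particular model, chosen by us for illustration: fuzzy sets on a four-point set, represented as tuples with values in [0, 1], with the pointwise product as the product and the average value as the state. The names `times`, `state`, `is_partition` and `join`, as well as the sample partitions `A` and `B`, are assumptions of this sketch and do not come from the cited sources. The special case of joining a partition with itself is not handled.

```python
from fractions import Fraction as Fr

N = 4                 # size of the underlying finite set (assumption of this sketch)
U = (Fr(1),) * N      # strong unit u: the constant function 1

def times(f, g):
    """Pointwise product of two fuzzy sets."""
    return tuple(x * y for x, y in zip(f, g))

def state(f):
    """Illustrative state: the average value of f, so state(U) == 1."""
    return sum(f) / N

def is_partition(P):
    """A finite collection is a partition if its elements sum to u."""
    return tuple(sum(col) for col in zip(*P)) == U

def join(A, B):
    """Join of two different partitions: the system of all products a_i . b_j."""
    return [times(a, b) for a in A for b in B]

# A is a crisp partition, B is a genuinely fuzzy one.
A = [(Fr(1), Fr(1), Fr(0), Fr(0)), (Fr(0), Fr(0), Fr(1), Fr(1))]
B = [(Fr(1, 4), Fr(1, 2), Fr(3, 4), Fr(1)), (Fr(3, 4), Fr(1, 2), Fr(1, 4), Fr(0))]
AB = join(A, B)

# Refinement check: m(a_i) equals the sum of the state over the block {a_i . b_j : j}.
blocks_ok = all(state(a) == sum(state(times(a, b)) for b in B) for a in A)
```

In this model the join is again a partition, and the last line confirms that the join refines `A` with the index blocks given by the first index.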
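Let $A = \{a_1, \dots, a_n\}$ be a partition in a product MV algebra $(M, \cdot)$ and $m$ be a state on $(M, \cdot)$. Then the entropy of $A$ with respect to $m$ is defined by Shannon’s formula:
$$H_m(A) = \sum_{i=1}^{n} F(m(a_i)),$$
where:
$$F(x) = -x \log x \ \text{ if } x > 0, \qquad F(0) = 0.$$
If $A = \{a_1, \dots, a_n\}$ and $B = \{b_1, \dots, b_k\}$ are two partitions of $(M, \cdot)$, then the conditional entropy of $A$ given $B$ is defined by:
$$H_m(A/B) = -\sum_{i=1}^{n} \sum_{j=1}^{k} m(a_i \cdot b_j) \log \frac{m(a_i \cdot b_j)}{m(b_j)}.$$
In accordance with the classical theory, the log is to the base 2 and the entropy is expressed in bits. Note that we use the convention (based on continuity arguments) that $x \log \frac{x}{0} = \infty$ if $x > 0$ and $0 \log \frac{0}{x} = 0$ if $x \ge 0$.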
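The two definitions can be evaluated numerically. The sketch below is only an illustration under our own assumptions: elements are tuples with values in [0, 1] on a four-point set, the product is pointwise and the state is the average value. It computes the entropy $-\sum_i m(a_i)\log_2 m(a_i)$ and the conditional entropy $-\sum_{i,j} m(a_i \cdot b_j)\log_2\frac{m(a_i \cdot b_j)}{m(b_j)}$, and compares the entropy of the join with the sum of the entropy of `B` and the conditional entropy of `A` given `B`. All names are hypothetical.

```python
from math import log2, isclose

def times(f, g):
    """Pointwise product of two fuzzy sets."""
    return tuple(x * y for x, y in zip(f, g))

def state(f):
    """Illustrative state: the average value of f."""
    return sum(f) / len(f)

def entropy(P):
    """Shannon-type entropy of a partition, with the convention 0 log 0 = 0."""
    vals = [state(p) for p in P]
    return -sum(v * log2(v) for v in vals if v > 0)

def cond_entropy(A, B):
    """Conditional entropy of A given B; zero terms are skipped by convention."""
    total = 0.0
    for a in A:
        for b in B:
            mab = state(times(a, b))
            if mab > 0:  # then state(b) > 0 as well, since a <= 1 pointwise
                total -= mab * log2(mab / state(b))
    return total

A = [(1.0, 1.0, 0.0, 0.0), (0.0, 0.0, 1.0, 1.0)]
B = [(0.25, 0.5, 0.75, 1.0), (0.75, 0.5, 0.25, 0.0)]
joined = [times(a, b) for a in A for b in B]
```

With these sample partitions, `entropy(joined)` agrees with `entropy(B) + cond_entropy(A, B)` up to rounding, which is the additivity behaviour one expects from the classical theory.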
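Example 1. Consider any product MV algebra $(M, \cdot)$ and a state $m$ defined on M. Then the set $E = \{u\}$ is a partition of $(M, \cdot)$ such that $E \prec A$ for any partition $A$ of $(M, \cdot)$. Its entropy is $H_m(E) = 0$. Let $a \in M$ be such that $m(a) = p$, where $p \in (0, 1)$. Evidently, $m(u - a) = 1 - p$ and the set $A = \{a, u - a\}$ is a partition of $(M, \cdot)$. The entropy is $H_m(A) = -p \log p - (1-p) \log (1-p)$. In particular, if $p = \frac{1}{2}$, then $H_m(A) = \log 2 = 1$ bit.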
The entropy and the conditional entropy of partitions in a product MV algebra satisfy all properties analogous to the properties of Shannon’s entropy of measurable partitions in the classical case; the proofs can be found in [35,49,50]. We present those that will be further exploited. Let
be any partitions of a product MV algebra
Then the following properties hold: (E1)
(E2)
implies
; (E3)
; (E4)
=
; (E5)
3. Mutual Information of Partitions in Product MV Algebras
In this section the results concerning the entropy in product MV algebras are used in developing information theory for the case of product MV algebras. We define the notions of mutual information and conditional mutual information of partitions in a product MV algebra and prove basic properties of the proposed measures.
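Definition 4. Let $A, B$ be partitions in a given product MV algebra $(M, \cdot)$. Then we define the mutual information of $A$ and $B$ by the formula:
$$I_m(A, B) = H_m(A) - H_m(A/B). \qquad (2)$$
Remark 1. As a simple consequence of (E4) we get:
$$I_m(A, B) = H_m(A) + H_m(B) - H_m(A \vee B). \qquad (3)$$
Subsequently we see that $I_m(A, A) = H_m(A)$, i.e., the entropy of partitions in product MV algebras can be considered as a special case of their mutual information. Moreover, we see that $I_m(A, B) = I_m(B, A)$, and hence we can also write:
$$I_m(A, B) = H_m(B) - H_m(B/A). \qquad (4)$$
Example 2. Consider the measurable space $([0,1], \mathcal{B})$, where $[0,1]$ is the unit interval and $\mathcal{B}$ is the $\sigma$-algebra of all Borel subsets of $[0,1]$. Let F be the family of all $\mathcal{B}$-measurable functions $f: [0,1] \to [0,1]$.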
F is the so called full tribe of fuzzy sets [30] (see also [14,29]); it is closed also under the natural product of fuzzy sets and represents a special case of product MV algebras. On the product MV algebra F we define a state m by the formula for every F. Evidently, the sets and are two partitions of F with the m-state values and of the corresponding elements of and ,
respectively. By simple calculations we obtain the entropy 1
bit, and the entropy bit. The join of and is the system with the m-state values of the corresponding elements. The entropy of is the number: Since:the mutual information of and
is the number: We can also see that Equation (3) is fulfilled: In the following we will use the assertions of Propositions 1 and 2.
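The three expressions for the mutual information can also be compared numerically. The following Python sketch is a hedged illustration and does not use the data of Example 2: it assumes that all that is known about two partitions is the table `J[i][j]` of state values of the products $a_i \cdot b_j$, whose row and column sums give the state values of the elements of the two partitions. The function names and the sample table are our own.

```python
from math import log2, isclose

def H(values):
    """Entropy of a list of state values, with the convention 0 log 0 = 0."""
    return -sum(v * log2(v) for v in values if v > 0)

def mutual_information(J):
    """Entropy of the row partition minus its conditional entropy given the columns."""
    rows = [sum(r) for r in J]          # state values of the first partition
    cols = [sum(c) for c in zip(*J)]    # state values of the second partition
    h_cond = -sum(x * log2(x / cols[j])
                  for r in J for j, x in enumerate(r) if x > 0)
    return H(rows) - h_cond

# Hypothetical table of state values of the join (entries sum to 1).
J = [[3 / 16, 5 / 16], [7 / 16, 1 / 16]]
rows = [sum(r) for r in J]
cols = [sum(c) for c in zip(*J)]
flat = [x for r in J for x in r]
Jt = [list(c) for c in zip(*J)]         # the same table with the two partitions swapped
```

The value `mutual_information(J)` coincides with `H(rows) + H(cols) - H(flat)` and with `mutual_information(Jt)` up to rounding, matching the symmetric form of the definition.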
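Proposition 1. If $A = \{a_1, \dots, a_n\}$ and $B = \{b_1, \dots, b_k\}$ are two partitions of $(M, \cdot)$, then we have:
- (i) $m(a_i) = \sum_{j=1}^{k} m(a_i \cdot b_j)$ for $i = 1, \dots, n$;
- (ii) $m(b_j) = \sum_{i=1}^{n} m(a_i \cdot b_j)$ for $j = 1, \dots, k$.
Proof. By the assumption $\sum_{j=1}^{k} b_j = u$; therefore, according to Definitions 1 and 2, we get:
$$m(a_i) = m(u \cdot a_i) = m\left(\left(\sum_{j=1}^{k} b_j\right) \cdot a_i\right) = m\left(\sum_{j=1}^{k} a_i \cdot b_j\right) = \sum_{j=1}^{k} m(a_i \cdot b_j).$$
The equality (ii) can be obtained in the same way.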
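From the following proposition it follows that, for any partitions $A, B$ of $(M, \cdot)$, the set $A \vee B$ is a common refinement of $A$ and $B$.
Proposition 2. $A \prec A \vee B$ for any partitions $A, B$ of $(M, \cdot)$.
Proof. Assume that $A = \{a_1, \dots, a_n\}$ and $B = \{b_1, \dots, b_k\}$. Since the set $A \vee B$ is indexed by $\{(i, j);\ i = 1, \dots, n,\ j = 1, \dots, k\}$, we put $I(i) = \{(i, 1), \dots, (i, k)\}$, $i = 1, \dots, n$. In view of Proposition 1, we have:
$$m(a_i) = \sum_{j=1}^{k} m(a_i \cdot b_j) = \sum_{(l, j) \in I(i)} m(a_l \cdot b_j), \quad i = 1, \dots, n.$$
However, this indicates that $A \prec A \vee B$.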
Theorem 1. For any partitions and in a product MV algebra we have: Proof. By Equation (2) and the properties (E3) and (E4), we get:
According to Proposition 2
, and therefore by (E2)
It follows the inequality:
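Proposition 3. If $A = \{a_1, \dots, a_n\}$ and $B = \{b_1, \dots, b_k\}$ are two partitions of $(M, \cdot)$, then:
$$I_m(A, B) = \sum_{i=1}^{n} \sum_{j=1}^{k} m(a_i \cdot b_j) \log \frac{m(a_i \cdot b_j)}{m(a_i)\, m(b_j)}. \qquad (5)$$
Proof. Since by Proposition 1 it holds:
$$H_m(A) = -\sum_{i=1}^{n} m(a_i) \log m(a_i) = -\sum_{i=1}^{n} \sum_{j=1}^{k} m(a_i \cdot b_j) \log m(a_i),$$
we get:
$$I_m(A, B) = H_m(A) - H_m(A/B) = -\sum_{i=1}^{n} \sum_{j=1}^{k} m(a_i \cdot b_j) \log m(a_i) + \sum_{i=1}^{n} \sum_{j=1}^{k} m(a_i \cdot b_j) \log \frac{m(a_i \cdot b_j)}{m(b_j)} = \sum_{i=1}^{n} \sum_{j=1}^{k} m(a_i \cdot b_j) \log \frac{m(a_i \cdot b_j)}{m(a_i)\, m(b_j)}.$$
Definition 5. Two partitions $A = \{a_1, \dots, a_n\}$ and $B = \{b_1, \dots, b_k\}$ of $(M, \cdot)$ are called statistically independent if $m(a_i \cdot b_j) = m(a_i) \cdot m(b_j)$ for $i = 1, \dots, n$, $j = 1, \dots, k$.
Theorem 2. Let $A, B$ be partitions in a product MV algebra $(M, \cdot)$. Then $I_m(A, B) \ge 0$, with equality if and only if the partitions $A, B$ are statistically independent.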
Proof. Assume that
and
Then using the inequality
which is valid for all real numbers
with the equality if and only if
we get:
The equality holds if and only if
i.e., when
Therefore using Equation (5) and Proposition 1 we have:
It follows that with the equality if and only if for i.e., when the partitions are statistically independent.
Theorem 2 implies the subadditivity and additivity of entropy in a product MV algebra, as shown by the following theorem.
Theorem 3 (Subadditivity and additivity of entropy). For arbitrary partitions $A, B$ in a product MV algebra $(M, \cdot)$ it holds that $H_m(A \vee B) \le H_m(A) + H_m(B)$, with equality if and only if the partitions $A, B$ are statistically independent.
Proof. It follows from Equation (3) and Theorem 2.
Theorem 4. For arbitrary partitions $A, B$ in a product MV algebra $(M, \cdot)$ it holds that $H_m(A/B) \le H_m(A)$, with equality if and only if the partitions $A, B$ are statistically independent.
Proof. The assertion is a simple consequence of Equation (2) and Theorem 2.
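The role of statistical independence in the subadditivity of entropy can be seen on a small numerical example. The sketch below is an illustration under our own assumptions: two partitions are represented only by the table of state values of their join, an independent pair is built as an outer product of the marginal values, and a dependent pair with the same marginals is given by hand. The quantity `gap` is the sum of the two entropies minus the entropy of the join, i.e., the mutual information. All names and numbers are hypothetical.

```python
from math import log2, isclose

def H(values):
    """Entropy of a list of state values, with the convention 0 log 0 = 0."""
    return -sum(v * log2(v) for v in values if v > 0)

def joint_entropy(J):
    """Entropy of the join, computed from the table of state values of the products."""
    return H([x for row in J for x in row])

def outer(p, q):
    """Table of an independent pair: every entry is a product of marginal values."""
    return [[x * y for y in q] for x in p]

def gap(p, q, J):
    """Sum of the entropies minus the entropy of the join."""
    return H(p) + H(q) - joint_entropy(J)

p, q = [0.5, 0.5], [0.625, 0.375]
J_ind = outer(p, q)                           # statistically independent pair
J_dep = [[3 / 16, 5 / 16], [7 / 16, 1 / 16]]  # same marginals, not independent

gap_ind = gap(p, q, J_ind)
gap_dep = gap(p, q, J_dep)
```

Here `gap_ind` vanishes up to rounding, while `gap_dep` is strictly positive, in line with the equality condition of Theorem 3.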
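Definition 6. Let $A, B$ and $C$ be partitions in a given product MV algebra $(M, \cdot)$. Then the conditional mutual information of $A$ and $B$ given $C$ is defined by the formula:
$$I_m(A, B/C) = H_m(A/C) - H_m(A/B \vee C). \qquad (6)$$
Remark 2. Notice that the conditional mutual information is nonnegative, because by the property (E2) $H_m(A/B \vee C) \le H_m(A/C)$.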
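Theorem 5. For any partitions $A, B$ and $C$ in a product MV algebra $(M, \cdot)$ we have:
$$I_m(A, B \vee C) = I_m(A, B) + I_m(A, C/B) = I_m(A, C) + I_m(A, B/C).$$
Proof. By Equations (2) and (6) we get:
$$I_m(A, B) + I_m(A, C/B) = H_m(A) - H_m(A/B) + H_m(A/B) - H_m(A/B \vee C) = H_m(A) - H_m(A/B \vee C) = I_m(A, B \vee C).$$
In a similar way we obtain also the second equality.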
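Theorem 6 (Chain rules). Let $A_1, A_2, \dots, A_n$ and $C$ be partitions in a product MV algebra $(M, \cdot)$. Then, for $n = 2, 3, \dots$, the following equalities hold:
- (i) $H_m(A_1 \vee \dots \vee A_n) = H_m(A_1) + \sum_{i=2}^{n} H_m(A_i / A_1 \vee \dots \vee A_{i-1})$;
- (ii) $H_m(A_1 \vee \dots \vee A_n / C) = H_m(A_1/C) + \sum_{i=2}^{n} H_m(A_i / A_1 \vee \dots \vee A_{i-1} \vee C)$;
- (iii) $I_m(A_1 \vee \dots \vee A_n, C) = I_m(A_1, C) + \sum_{i=2}^{n} I_m(A_i, C / A_1 \vee \dots \vee A_{i-1})$.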
Proof. (i) By the property (E4) we have:
Now let us suppose that the result is true for a given
Then:
(ii) For
using (E3) we obtain:
Suppose that the result is true for a given
Then:
(iii) By Equation (2), the equalities (i) and (ii) of this theorem, and Equation (6), we obtain:
Definition 7. Let and be partitions in a product MV algebra We say that is conditionally independent to given (and write ) if
Theorem 7. For partitions and in a product MV algebra if and only if .
Proof. Let
Then
Therefore by (E4) we get:
The results means that The reverse implication is evident.
Remark 3. According to the above theorem, we may say that and are conditionally independent given and write instead of .
Theorem 8. Let and be partitions in a given product MV algebra such that Then we have:
- (i)
- (ii)
- (iii)
- (iv)
(data processing inequality).
Proof. (i) By the assumption we have
. Hence using the chain rule for the mutual information (Theorem 6 (iii)), we obtain:
(ii) By the equality (i) of this theorem and Theorem 5, we can write:
(iii) From (ii) it follows the inequality
Interchanging
and
(we can do it based on Theorem 7) we obtain:
(iv) By the assumption we have
. Therefore by Theorem 5 we get:
Thus by the same theorem we can write:
Since it holds
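In the following, the concavity of entropy and the concavity of mutual information as functions of m are studied. We recall, for the convenience of the reader, the definitions of convex and concave functions:
A real-valued function $f$ is said to be convex over an interval $[a, b]$ if for every $x_1, x_2 \in [a, b]$ and for any real number $\alpha \in [0, 1]$:
$$f(\alpha x_1 + (1 - \alpha) x_2) \le \alpha f(x_1) + (1 - \alpha) f(x_2).$$
A real-valued function $f$ is said to be concave over an interval $[a, b]$ if for every $x_1, x_2 \in [a, b]$ and for any real number $\alpha \in [0, 1]$:
$$f(\alpha x_1 + (1 - \alpha) x_2) \ge \alpha f(x_1) + (1 - \alpha) f(x_2).$$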
In the following, we will use the symbol to denote the family of all states on a given product MV algebra It is easy to prove the following proposition:
Proposition 4. If then, for every real number
Theorem 9 (Concavity of entropy). Let be a partition in a given product MV algebra Then, for every and every real number the following inequality holds: Proof. Assume that
Since the function $F(x) = -x \log x$ is concave, we get:
which proves that the entropy
is a concave function on the family
.
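The concavity of entropy in the state can be checked numerically on a fixed partition. The sketch below is an illustration under our own assumptions: a state is represented only by the list of its values on the elements of the partition, a convex combination of two states is formed value by value, and the difference between the entropy of the mixture and the mixture of the entropies is evaluated for several weights. The names and sample values are hypothetical.

```python
from math import log2, isclose

def H(values):
    """Entropy of a list of state values, with the convention 0 log 0 = 0."""
    return -sum(v * log2(v) for v in values if v > 0)

def mix(p, q, alpha):
    """Values of the convex combination alpha * p + (1 - alpha) * q."""
    return [alpha * x + (1 - alpha) * y for x, y in zip(p, q)]

def concavity_gap(p, q, alpha):
    """Entropy of the mixture minus the mixture of the entropies."""
    return H(mix(p, q, alpha)) - (alpha * H(p) + (1 - alpha) * H(q))

# Values of two hypothetical states on a three-element partition.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
gaps = [concavity_gap(p, q, k / 10) for k in range(11)]
```

All entries of `gaps` are nonnegative up to rounding, and the two endpoint values are zero, as expected for a concave function.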
In the proof of concavity of mutual information
we will need the assertion of Proposition 5. First, we introduce the following notation. Let
be a state on a product MV algebra
Then we denote:
Proposition 5. If and are two partitions of then Proof. In the last step, we used the implication which follows from the equality shown in Proposition 1.
Remark 4. By Proposition 5 there exists such that Definition 8. Let be two partitions of Put Theorem 10 (Concavity of mutual information). The mutual information is a concave function on the family .
Proof. By Equation (4) we can write:
In view of Theorem 9 and Remark 4, the function is the sum of two concave functions on the family : and Since the sum of two concave functions is itself concave, we have the statement.
4. Kullback–Leibler Divergence in Product MV Algebras
In this section we introduce the concept of Kullback–Leibler divergence in product MV algebras. We prove basic properties of this measure; in particular, Gibbs’ inequality. Finally, using the notion of conditional Kullback–Leibler divergence, we establish a chain rule for Kullback–Leibler divergence with respect to additive states defined on a given product MV algebra. In the proofs we use the following known log-sum inequality: for non-negative real numbers $x_1, \dots, x_n$, $y_1, \dots, y_n$, it holds:
$$\sum_{i=1}^{n} x_i \log \frac{x_i}{y_i} \ge \left(\sum_{i=1}^{n} x_i\right) \log \frac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} y_i},$$
with equality if and only if $\frac{x_i}{y_i}$ is constant. Recall that we use the convention that $x \log \frac{x}{0} = \infty$ if $x > 0$ and $0 \log \frac{0}{x} = 0$ if $x \ge 0$.
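A quick numerical check of the log-sum inequality may be helpful. The following Python sketch, with names of our own choosing, evaluates both sides, $\sum_i x_i \log_2 \frac{x_i}{y_i}$ and $\left(\sum_i x_i\right) \log_2 \frac{\sum_i x_i}{\sum_i y_i}$, for strictly positive $y_i$ and a positive total of the $x_i$ (an assumption made here to avoid the infinite cases of the convention).

```python
from math import log2, isclose

def log_sum_sides(xs, ys):
    """Return (lhs, rhs) of the log-sum inequality for base-2 logarithms.

    Assumes every y is positive and sum(xs) > 0; terms with x == 0 contribute 0.
    """
    lhs = sum(x * log2(x / y) for x, y in zip(xs, ys) if x > 0)
    sx, sy = sum(xs), sum(ys)
    rhs = sx * log2(sx / sy)
    return lhs, rhs

# A generic case: the ratios x_i / y_i are not constant.
lhs1, rhs1 = log_sum_sides([0.2, 0.5, 0.3], [0.4, 0.4, 0.2])

# The equality case: x_i / y_i = 1/2 for every i.
lhs2, rhs2 = log_sum_sides([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In the first case the left-hand side is strictly larger; in the second case both sides coincide.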
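Definition 9. Let $m, \mu$ be states defined on a given product MV algebra $(M, \cdot)$ and $A = \{a_1, \dots, a_n\}$ be a partition of $(M, \cdot)$. Then we define the Kullback–Leibler divergence $D_A(m \| \mu)$ by:
$$D_A(m \| \mu) = \sum_{i=1}^{n} m(a_i) \log \frac{m(a_i)}{\mu(a_i)}.$$
Remark 5. It is obvious that $D_A(m \| m) = 0$. The Kullback–Leibler divergence is not a metric in the true sense, since it is not symmetric, i.e., the equality $D_A(m \| \mu) = D_A(\mu \| m)$ is not necessarily true (as shown in the following example), and it does not satisfy the triangle inequality.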
Example 3. Consider any product MV algebra and two states defined on M. Let such that and where Evidently, and the set is a partition of Let us calculate:If then ‖ ‖
. If then we have:and:The result means that ‖ ‖ in general.
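The lack of symmetry is easy to reproduce numerically. The sketch below is a hedged illustration with values chosen by us, not the values of Example 3: two states are represented by their values on a two-element partition, and the divergence $\sum_i x_i \log_2 \frac{x_i}{y_i}$ is computed in both directions, following the stated conventions for zero values.

```python
from math import log2, inf

def kl(ms, mus):
    """Kullback-Leibler divergence of two lists of state values on one partition.

    Conventions: terms with x == 0 contribute 0; x > 0 with y == 0 gives infinity.
    """
    total = 0.0
    for x, y in zip(ms, mus):
        if x > 0:
            if y == 0:
                return inf
            total += x * log2(x / y)
    return total

def binary(p):
    """State values on a two-element partition {a, u - a} with value p on a."""
    return [p, 1 - p]

d_pq = kl(binary(0.5), binary(0.25))   # hypothetical values p = 1/2, q = 1/4
d_qp = kl(binary(0.25), binary(0.5))
```

Both values are positive, but they differ, so swapping the two states changes the divergence.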
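Theorem 11. Let $m, \mu$ be states defined on a product MV algebra $(M, \cdot)$ and $A = \{a_1, \dots, a_n\}$ be a partition of $(M, \cdot)$. Then $D_A(m \| \mu) \ge 0$ (Gibbs’ inequality), with equality if and only if $m(a_i) = \mu(a_i)$ for $i = 1, \dots, n$.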
Proof. If we put
and
for
then
are non-negative real numbers such that
and
. Indeed,
analogously we obtain
. Thus, using the log-sum inequality we can write:
with the equality if and only if
for
where
is constant. Taking the sum for all
we obtain
which implies that
. This means that
‖
if and only if
for
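Theorem 12. Let $A = \{a_1, \dots, a_n\}$ be a partition of $(M, \cdot)$ and $\nu$ be a state on $(M, \cdot)$ uniform over $A$, i.e., $\nu(a_i) = \frac{1}{n}$ for $i = 1, \dots, n$. Then, for the entropy of $A$ with respect to any state $m$ on $(M, \cdot)$ we have:
$$H_m(A) = \log n - D_A(m \| \nu).$$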
Proof. Assume that
Then
for
Let us calculate:
As a consequence we obtain the following property of entropy of partitions in product MV algebras.
Corollary 1. For any partition $A = \{a_1, \dots, a_n\}$ of $(M, \cdot)$ it holds that $H_m(A) \le \log n$, with equality if and only if m is uniform over the partition $A$.
Proof. Assume that
and consider a state
on
uniform over
i.e., it holds
for
Then, by Theorem 12 we get:
Since by Theorem 11
‖
it holds the inequality:
Further, by Theorem 11 ‖ if and only if for This means that the equality holds if and only if for
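The relationship between entropy and the divergence from the uniform state can be verified on a small example. The sketch below rests on our own assumptions: a state is given only by its values on an n-element partition, the uniform state takes the value 1/n on each element, and the identity checked is that the entropy equals $\log_2 n$ minus the divergence from the uniform state. The names and sample values are hypothetical.

```python
from math import log2, isclose

def H(values):
    """Entropy of a list of state values, with the convention 0 log 0 = 0."""
    return -sum(v * log2(v) for v in values if v > 0)

def kl(ms, mus):
    """Kullback-Leibler divergence; assumes every value in mus is positive."""
    return sum(x * log2(x / y) for x, y in zip(ms, mus) if x > 0)

ms = [0.5, 0.25, 0.125, 0.125]      # values of a hypothetical state
n = len(ms)
uniform = [1 / n] * n               # values of the state uniform over the partition

entropy_value = H(ms)
bound = log2(n)
divergence = kl(ms, uniform)
```

For these values `entropy_value` equals `bound - divergence` up to rounding, and the entropy of the uniform state attains the bound.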
Theorem 13 (Convexity of K–L divergence). Let be a partition in a product MV algebra The K–L divergence ‖ is convex in the pair i.e., if are pairs of states from , then, for any real number the following inequality holds: Proof. Assume that
and fix
Putting
in the log-sum inequality, we obtain:
Summing these inequalities over we obtain the inequality (9).
The result of Theorem 13 is illustrated in the following example.
Example 4. Consider the product MV algebra F from Example 2 and the real functions defined by for every On the product MV algebra F we define the states by the following formulas: In addition, we will consider the partition of F. It is easy to calculate that it has the -state values the -state values the -state values and the -state values of the corresponding elements. In the previous theorem we put . We will show that:
Since the inequality (10) holds.
In the final part, we define the conditional Kullback–Leibler divergence and, using this notion, we establish the chain rule for Kullback–Leibler divergence.
Definition 10. Let be states on a given product MV algebra and be two partitions of Then we define the conditional Kullback–Leibler divergence ‖ by:
Theorem 14 (Chain rule for K–L divergence). Let be states on a given product MV algebra If , are two partitions of then: Proof. Assume that
and
We will consider the following two cases: (i) there exists
such that
(ii)
for
In the first case, both sides of Equation (11) are equal to
thus the equality holds. Let us now assume that
for
We get:
In the last step, analogously as in the proof of Proposition 5, we used the implication which follows from the equality shown in Proposition 1.
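A chain rule of this kind can be checked numerically. The following Python sketch is an illustration under assumptions of our own: each of the two states is represented by the table of its values on the products $a_i \cdot b_j$, all entries are strictly positive, and the conditional divergence is taken in the classical form $\sum_{i,j} m(a_i \cdot b_j) \log_2 \frac{m(a_i \cdot b_j)/m(a_i)}{\mu(a_i \cdot b_j)/\mu(a_i)}$, which may differ in notation from the definition used in the text. The names and sample tables are hypothetical.

```python
from math import log2, isclose

def kl(ms, mus):
    """Kullback-Leibler divergence of two lists of positive state values."""
    return sum(x * log2(x / y) for x, y in zip(ms, mus))

def cond_kl(Jm, Jmu):
    """Classical conditional divergence of the column partition given the row partition."""
    total = 0.0
    for row_m, row_mu in zip(Jm, Jmu):
        sm, smu = sum(row_m), sum(row_mu)   # state values of the row element
        for x, y in zip(row_m, row_mu):
            total += x * log2((x / sm) / (y / smu))
    return total

def flatten(J):
    return [x for row in J for x in row]

# Tables of values of two hypothetical states on the join (each sums to 1).
Jm = [[3 / 16, 5 / 16], [7 / 16, 1 / 16]]
Jmu = [[0.1, 0.2], [0.3, 0.4]]

d_join = kl(flatten(Jm), flatten(Jmu))
d_rows = kl([sum(r) for r in Jm], [sum(r) for r in Jmu])
d_cond = cond_kl(Jm, Jmu)
```

The divergence over the join, `d_join`, equals `d_rows + d_cond` up to rounding.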
In the following example, we illustrate the result of the previous theorem.
Example 5. Consider the product MV algebra F and the partitions of the product MV algebra F from Example 2. In addition, let be the states on F, defined in Example 4. Then the partitions and have the -state values and of the corresponding elements, respectively, and the -state values and of the corresponding elements, respectively. The join of partitions and is the system it has the -state values and the -state values of the corresponding elements. By simple calculations we obtain: It is possible to verify that ‖ ‖ ‖ .
5. Discussion
In this paper, we have extended the study of entropy in product MV algebras. The main aim of the paper was to introduce, using known results concerning the entropy in product MV algebras, the concepts of mutual information and Kullback–Leibler divergence for the case of product MV algebras and to examine the algebraic properties of the proposed measures. Our results have been presented in Section 3 and Section 4.
In Section 3 we have introduced the notions of mutual information and conditional mutual information of partitions of product MV algebras and proved some basic properties of the suggested measures. It was shown that the entropy of partitions of product MV algebras can be considered as a special case of their mutual information. Specifically, it was proved that the subadditivity and additivity of entropy follow from the properties of mutual information (Theorem 3). Theorem 6 provides the chain rule for mutual information. In addition, the data processing inequality for conditionally independent partitions in product MV algebras has been proved. Moreover, the concavity of mutual information has been studied.
In Section 4 the notion of Kullback–Leibler divergence in product MV algebras was introduced and the basic properties of this measure were shown. In particular, the convexity of Kullback–Leibler divergence with respect to additive states defined on a given product MV algebra was proved. Theorem 11 admits an interpretation of Kullback–Leibler divergence as a measure of how different two states on a common product MV algebra (over the same partition) are. The relationship between K–L divergence and entropy is provided in Theorem 12: the more a state diverges from the state uniform over a given partition, the smaller the entropy of that partition with respect to this state is, and vice versa. Finally, a conditional version of the Kullback–Leibler divergence in product MV algebras has been defined and the chain rule for Kullback–Leibler divergence with respect to additive states defined on a given product MV algebra has been established.
Notice that in [14] (see also [29,30]) the entropy on a full tribe F of fuzzy sets has been studied. The tribe F is also closed under the natural product of fuzzy sets and it represents a special case of product MV algebras. Accordingly, the theory presented in this contribution can also be applied to the mentioned case of tribes of fuzzy sets.
In [51,52,53,54,55] a more general fuzzy theory, the theory of intuitionistic fuzzy sets (IF-sets for short), has been developed. While a fuzzy set is a mapping $f: \Omega \to [0,1]$ (where the considered fuzzy set is identified with its membership function $f$), the Atanassov IF-set is a pair $A = (\mu_A, \nu_A)$ of functions $\mu_A, \nu_A: \Omega \to [0,1]$ with $\mu_A + \nu_A \le 1$. The function $\mu_A$ is interpreted as the membership function of the IF-set $A$ and the function $\nu_A$ as the non-membership function of the IF-set $A$. Evidently, any fuzzy set $f: \Omega \to [0,1]$ can be considered as an IF-set $A = (f, 1 - f)$. Any result holding for IF-sets is applicable also to fuzzy sets. Of course, the opposite implication is not true; the theory of intuitionistic fuzzy sets is a non-trivial generalization of fuzzy set theory. So IF-sets make it possible to model a larger class of real situations. Note that some results about the entropy on IF-sets can be found, e.g., in [56,57,58,59]. These results could be used in developing information theory for the case of IF-sets.
To make it possible to apply MV algebra results to families of IF-experiments as well, one can use the Mundici characterization of MV algebras. In the family of IF-sets it is natural to define the partial ordering relation
in the following way: if
and
are two IF-sets, then
if and only if
and
Namely, in the fuzzy case
implies
Therefore we can consider the Abelian
l-group
putting
with the zero element
(In fact,
.) The partial ordering
in the
l-group
is defined by the prescription
if and only if
and
Then a suitable MV algebra is e.g., the system
. Moreover, this MV algebra is a product MV algebra with the product defined by
The presented MV algebra approach offers an elegant and practical way of obtaining new results in the intuitionistic fuzzy case as well. We note that this approach was used to construct the Kolmogorov-type entropy theory for IF systems in [58], drawing on entropy results for product MV algebras published in [35,49,50]. In this way it is also possible to develop the theory of information and K–L divergence for IF-sets.