Article

Information Geometry of the Exponential Family of Distributions with Progressive Type-II Censoring

Fode Zhang 1, Xiaolin Shi 2 and Hon Keung Tony Ng 3,*
1 Center of Statistical Research, School of Statistics, Southwestern University of Finance and Economics, Chengdu 611130, China
2 School of Electronics Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China
3 Department of Statistical Science, Southern Methodist University, Dallas, TX 75275-0332, USA
* Author to whom correspondence should be addressed.
Entropy 2021, 23(6), 687; https://doi.org/10.3390/e23060687
Submission received: 29 April 2021 / Revised: 21 May 2021 / Accepted: 25 May 2021 / Published: 28 May 2021
(This article belongs to the Special Issue Measures of Information)

Abstract

In geometry and topology, a family of probability distributions can be analyzed as the points on a manifold, known as a statistical manifold, with intrinsic coordinates corresponding to the parameters of the distribution. Considering the exponential family of distributions under progressive Type-II censoring as the manifold of a statistical model, we use information geometry methods to investigate geometric quantities such as the tangent space, the Fisher metric tensors, the affine connection and the $\alpha$-connection of the manifold. As an application of these geometric quantities, the asymptotic expansions of the posterior density function and the posterior Bayesian predictive density function on the manifold are discussed. The results show that the asymptotic expansions are related to the coefficients of the $\alpha$-connections and the metric tensors, and that the predictive density function coincides with the estimative density function in an asymptotic sense. The main results are illustrated by considering the Rayleigh distribution.

1. Introduction

From the geometrical viewpoint, a parametric statistical model can be considered a differentiable manifold, and the parameter space can be regarded as a coordinate system of the manifold [1,2]. Let $\mathcal{F} = \{ f(x;\theta),\ \theta \in \Theta \}$ be a parametric statistical model with respect to some $\sigma$-finite reference measure $\mu$, where $\theta$ is a real $k$-dimensional parameter vector belonging to some open subset $\Theta$ of the $k$-dimensional real space $\mathbb{R}^k$. For simplicity, a random variable $X$ and its observed value $x$ are both denoted by $x$ in this paper.
When the density function $f(x;\theta)$ is sufficiently smooth and differentiable in $\theta$, it is natural to introduce the structure of a $k$-dimensional manifold in the statistical model $\mathcal{F}$, where $\theta$ plays the role of a coordinate system. Geometrical quantities, such as connection, divergence, flatness, curvature and tangent space, play a fundamental role in statistical inference and asymptotic theory (see, for example, Komaki [3,4] and Harsha and Moosath [5]).
In reliability engineering, a life-testing experiment is one of the effective ways to obtain reliability information about a product. To save time and reduce the cost of a life-testing experiment, censoring methodologies are often applied so that the experiment is terminated before all the items on test fail. Commonly used censoring schemes include the Type-I and Type-II censoring schemes, in which the life-testing experiment is terminated at a prefixed time point or as soon as the $m$-th failure (with $m$ prefixed) is observed, respectively. In other words, the experimental time is prefixed for the Type-I censoring scheme and the number of observed failures is prefixed for the Type-II censoring scheme (see, for example, Ng [6]). The Type-I and Type-II censoring schemes have been generalized to more complicated and flexible schemes such as progressive censoring schemes [7,8,9] and hybrid censoring schemes [10,11]. For progressive Type-II censoring, the conventional Type-II censoring scheme is extended to situations wherein censoring occurs in multiple stages. A progressively Type-II censored life-testing experiment is carried out in the following manner. Suppose $n$ items are placed on a life-testing experiment and these $n$ items have lifetimes following a distribution with density function $f(x;\theta)$. It is planned that $m$ failures will be observed and $R_r$ items are randomly removed (i.e., censored) from the experiment at the time of the $r$-th failure. More specifically, at the time of the first failure (denoted by $X_{1:m:n}$), $R_1$ randomly selected items from the $n-1$ surviving items are removed from the experiment; the experiment then continues and, at the time of the second failure (denoted by $X_{2:m:n}$), $R_2$ randomly selected items from the $(n - R_1 - 2)$ surviving items are removed, and so on; finally, at the time of the $m$-th failure (denoted by $X_{m:m:n}$), the experiment terminates and all the remaining $R_m = n - m - \sum_{r=1}^{m-1} R_r$ surviving items are censored. Here, $R = (R_1, R_2, \ldots, R_m)$ with $\sum_{r=1}^{m} R_r = n - m$ is the progressive Type-II censoring scheme of the life-testing experiment. Note that, when $R_1 = R_2 = \cdots = R_{m-1} = 0$ and $R_m = n - m$, the progressive Type-II censoring scheme reduces to the conventional Type-II censoring scheme.
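The mechanics of this censoring scheme are easy to simulate directly. The following Python sketch is our own illustration (not code from the paper; the function and variable names are ours and NumPy is assumed): it records each failure among the surviving items and withdraws $R_r$ randomly chosen survivors at the $r$-th failure.

```python
import numpy as np

def progressive_type2_sample(lifetimes, R, rng):
    """Apply a progressive Type-II censoring scheme R = (R_1, ..., R_m) to n lifetimes.

    Returns the m observed ordered failure times x_{1:m:n}, ..., x_{m:m:n}.
    Requires sum(R) = n - m, where n = len(lifetimes) and m = len(R).
    """
    alive = np.sort(np.asarray(lifetimes, dtype=float))   # surviving items, sorted by lifetime
    observed = []
    for r_j in R:
        observed.append(alive[0])                         # next failure among the surviving items
        survivors = alive[1:]
        drop = rng.choice(len(survivors), size=r_j, replace=False)
        alive = np.delete(survivors, drop)                # withdraw r_j randomly chosen survivors
    return np.array(observed)

# Example: n = 19 items with Rayleigh(lambda = 2) lifetimes, i.e., X^2 ~ Exp(rate 2),
# censored with the scheme R = (0, 0, 3, 0, 3, 0, 0, 5), so m = 8 failures are observed.
rng = np.random.default_rng(0)
lifetimes = np.sqrt(rng.exponential(scale=0.5, size=19))
print(progressive_type2_sample(lifetimes, [0, 0, 3, 0, 3, 0, 0, 5], rng))
```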
Since the comprehensive study of information geometry by Amari [1], information geometry has been used productively in different research fields, including statistical learning, machine learning, neural networks, signal processing and information theory (see, for example, Amari et al. [2] and Amari [12]). Information geometry methods are also widely used in statistics and reliability engineering. For example, Zhang et al. [13] discussed the Amari-Chentsov structure on the accelerated life test model, with applications to optimal designs under different optimality criteria. The methods of information geometry have also been employed to investigate Bayesian prediction by taking α-divergences as loss functions [14]. In degradation modeling, a robust parameter estimation method was proposed in [15] by minimizing the f-divergence between the true model and the suggested models.
In this paper, we investigate the tangent space, affine connection, α -connection, torsion and Riemann-Christoffel curvature of the manifold of the exponential family of distributions with progressive Type-II censoring scheme. These geometric quantities can be applied to different areas in statistics such as Bayesian analysis. Note that one of the challenges of Bayesian analysis is to calculate the integrals for obtaining the posterior distribution, especially when the number of parameters is large. Instead of using numerical methods to approximate those integrals, the geometric quantities developed in this paper can provide an efficient theoretical method to approximate those integrals involved in Bayesian prediction. The main contributions and the organization of this paper are described as follows:
  • Asymptotic theory plays an important role in statistical inference, as it considers the properties of statistical procedures as the sample size increases. Geometrically, an approximation to a manifold is a local linearization given by the tangent space. Thus, the tangent space of the manifold of the exponential family of distributions with progressively Type-II censored data is discussed in Section 2.
  • The local linearization accounts only for local properties of a statistical model. It is necessary to investigate the Fisher metric tensors, affine connection, and α -connection of the manifold in order to study the global or large-scale properties of the model. Therefore, these important geometric quantities are studied in Section 3.
  • As an application of the geometric quantities, the asymptotic expansions of the posterior density and the posterior Bayesian predictive density of the model are provided in Section 4.
  • To illustrate the results presented in this paper, the Rayleigh distribution is considered as an example in Section 5. Moreover, Monte Carlo simulation results and a real data analysis are presented in Section 6 to illustrate the main results.

2. The Statistical Model and Tangent Space

In this paper, we adopt the Einstein summation convention; that is, if an index occurs both as a superscript and as a subscript in a single expression, then summation over that index is implied. For a density function $f(x;\theta) \in \mathcal{F}$, let $\ell(x;\theta) = \log f(x;\theta)$. We introduce the following definitions (see [1,2] for more details):
  • $g_{ij} \stackrel{\mathrm{def}}{=} E\left[ \partial_i \ell(x;\theta)\, \partial_j \ell(x;\theta) \right]$: the Fisher metric tensor, whose inverse is denoted by $g^{ij}$, where $\partial_i = \partial/\partial\theta^i$;
  • $T_{ijk} \stackrel{\mathrm{def}}{=} E\left[ \partial_i \ell(x;\theta)\, \partial_j \ell(x;\theta)\, \partial_k \ell(x;\theta) \right]$: the skewness tensor;
  • $\Gamma_{ijk} \stackrel{\mathrm{def}}{=} E\left[ \partial_i \partial_j \ell(x;\theta)\, \partial_k \ell(x;\theta) \right]$: the affine connection;
  • $\Gamma^{\alpha}_{ijk} \stackrel{\mathrm{def}}{=} \Gamma_{ijk} + \frac{1-\alpha}{2} T_{ijk}$: the $\alpha$-connection.
The $-1$-connection and the $1$-connection are called the m-connection and the e-connection, denoted by $\Gamma^{m}_{ijk}$ and $\Gamma^{e}_{ijk}$, respectively. We also abbreviate some geometric quantities by contraction with the metric tensor, i.e., $T_i = T_{ijk}\, g^{jk}$, $\Gamma^{l}_{ij} = \Gamma_{ijk}\, g^{kl}$ and $\Gamma^{\alpha,l}_{ij} = \Gamma^{\alpha}_{ijk}\, g^{kl}$.
Suppose that $\mathcal{F} = \{ f(x;\theta),\ \theta \in \Theta \}$ is an exponential family of distributions (see, for example, Barndorff-Nielsen [16]) with density function
$$f(x;\theta) = \exp\left\{ \sum_{i=1}^{\tau} \alpha_i(\theta) c_i(x) - \psi(\theta) \right\}$$  (1)
and reliability function
$$R(x;\theta) = 1 - F(x;\theta) = \exp\left\{ \sum_{i=1}^{\tau} \beta_i(\theta) d_i(x) - \phi(\theta) \right\},$$
where $\tau$ is the number of functions of the parameter vector $\theta$, $F(x;\theta)$ is the cumulative distribution function, and $\psi(\theta)$ is the cumulant generating function defined by
$$\exp\{\psi(\theta)\} = \int \exp\left\{ \sum_{i=1}^{\tau} \alpha_i(\theta) c_i(x) \right\} \mu(dx),$$
where $\alpha_i(\theta)$ and $\beta_i(\theta)$ are smooth functions of the parameter vector $\theta$, and $c_i$ and $d_i$ are smooth functions of the random variable $x$. Two members of the exponential family of distributions, the exponential and the Rayleigh distributions, are given below:
  • Exponential distribution with density function
    $$f(x;\lambda) = \exp\{ -\lambda x + \ln \lambda \}, \quad x > 0,\ \lambda > 0,$$
    and reliability function
    $$R(x;\lambda) = \exp\{ -\lambda x \}, \quad x > 0,\ \lambda > 0.$$
    Here, $\tau = 1$, $\alpha_1(\lambda) = \beta_1(\lambda) = -\lambda$, $c_1(x) = d_1(x) = x$, $\psi(\theta) = -\ln \lambda$ and $\phi(\theta) = 0$. The dimension of the parameter vector $\theta$ is $k = 1$.
  • Rayleigh distribution with density function
    $$f(x;\lambda) = \exp\{ -\lambda x^2 + \ln x + \ln 2\lambda \}, \quad x > 0,\ \lambda > 0,$$  (3)
    and reliability function
    $$R(x;\lambda) = \exp\{ -\lambda x^2 \}, \quad x > 0,\ \lambda > 0.$$
    Here, $\tau = 2$, $\alpha_1(\lambda) = \beta_1(\lambda) = -\lambda$, $\alpha_2(\lambda) = 1$, $\beta_2(\lambda) = 0$, $c_1(x) = d_1(x) = x^2$, $c_2(x) = \ln x$, $d_2(x) = 0$, $\psi(\theta) = -\ln 2\lambda$ and $\phi(\theta) = 0$. The dimension of the parameter vector $\theta$ is $k = 1$.
Consider the life-testing experiment with progressive Type-II censoring described in Section 1, with $n$ items placed on test and $m$ failures planned to be observed. Let the set of all admissible progressive Type-II censoring schemes (PCSs) be
$$PC(m,n) = \left\{ R = (R_1, \ldots, R_m) \in \mathbb{N}_0^m \,\middle|\, \sum_{i=1}^{m} R_i = n - m \right\},$$
where $\mathbb{N}_0$ is the set of non-negative integers. Under a given censoring scheme $R = (R_1, \ldots, R_m) \in PC(m,n)$, the set of progressively Type-II censored order statistics is denoted by $x^R_{m:n} = \{ x^R_{1:m:n}, \ldots, x^R_{m:m:n} \}$. The PCS $R = (R_1, \ldots, R_m)$ is fixed before the life-testing experiment starts.
Suppose that the lifetimes of the items on test follow a distribution in the exponential family with density function in Equation (1). Then the joint density function of the observed data $x^R_{m:n}$ can be expressed as [8,10]
$$L(x^R_{m:n};\theta) = c(R) \prod_{r=1}^{m} f(x^R_{r:m:n};\theta) \left[ 1 - F(x^R_{r:m:n};\theta) \right]^{R_r} = c(R) \prod_{r=1}^{m} \exp\left\{ \sum_{i=1}^{\tau} \alpha_i(\theta) c_i(x^R_{r:m:n}) + \sum_{i=1}^{\tau} R_r \beta_i(\theta) d_i(x^R_{r:m:n}) - \psi(\theta) - R_r \phi(\theta) \right\} \stackrel{\mathrm{def}}{=} c(R) \prod_{r=1}^{m} \exp\left\{ \sum_{i=1}^{\tau} \theta^i e_i(x^R_{r:m:n}) - \varphi(\theta) \right\} = c(R) \exp\left\{ \sum_{r=1}^{m} \sum_{i=1}^{\tau} \theta^i e_i(x^R_{r:m:n}) - m \varphi(\theta) \right\},$$  (4)
where $\theta^i e_i(x^R_{r:m:n}) \stackrel{\mathrm{def}}{=} \alpha_i(\theta) c_i(x^R_{r:m:n}) + R_r \beta_i(\theta) d_i(x^R_{r:m:n})$, $\varphi(\theta) \stackrel{\mathrm{def}}{=} \psi(\theta) + R_r \phi(\theta)$, and
$$c(R) = n (n - R_1 - 1)(n - R_1 - R_2 - 2) \cdots (n - R_1 - R_2 - \cdots - R_{m-1} - m + 1)$$
is the normalizing constant. By defining the new random variables
$$x^R_{i;r:m:n} = e_i(x^R_{r:m:n}),$$  (5)
the joint density function in Equation (4) can be expressed as
$$L(x^R_{m:n};\theta) = c(R) \exp\left\{ \sum_{r=1}^{m} \sum_{i=1}^{\tau} \theta^i x^R_{i;r:m:n} - m \varphi(\theta) \right\}.$$
The parameter θ of this form is called the natural parameter of the joint density function of the exponential family of distributions with progressive Type-II censoring.
The tangent space $T_\theta$ of the manifold of the model $L(x^R_{m:n};\theta)$ is spanned by the vectors $\partial_i = \partial/\partial\theta^i$, and the set $\{\partial_i\}$ is called the natural basis associated with the coordinate system $\theta$. Let
$$\ell(x^R_{m:n};\theta) = \log L(x^R_{m:n};\theta) = \log c(R) + \sum_{r=1}^{m} \sum_{i=1}^{\tau} \theta^i e_i(x^R_{r:m:n}) - m \varphi(\theta),$$
and let the set
$$T^{(1)}_\theta = \left\{ A(x^R_{m:n}) \;\middle|\; A(x^R_{m:n}) \in \operatorname{span}\left\{ \partial_i \ell(x^R_{m:n};\theta) \right\} \right\}$$
be the linear space of random variables spanned by $\partial_i \ell(x^R_{m:n};\theta)$. The space $T^{(1)}_\theta$ is called the 1-representation of the tangent space with progressively Type-II censored data. Here, the basis $\partial_i \ell(x^R_{m:n};\theta)$ of the 1-representation is given by
$$\partial_i \ell(x^R_{m:n};\theta) = \sum_{r=1}^{m} e_i(x^R_{r:m:n}) - m \partial_i \varphi(\theta),$$
and the second- and third-order derivatives of $\ell(x^R_{m:n};\theta)$ are given by
$$\partial_i \partial_j \ell(x^R_{m:n};\theta) = -m \partial_i \partial_j \varphi(\theta), \qquad \partial_i \partial_j \partial_k \ell(x^R_{m:n};\theta) = -m \partial_i \partial_j \partial_k \varphi(\theta).$$
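To make these formulas concrete, the short sketch below is our own illustration (not code from the paper; NumPy is assumed, and the data are hypothetical). It evaluates $\ell(x^R_{m:n};\theta)$ for the Rayleigh member used later in Section 5, where $e_1(x_r) = (1+R_r)x_r^2$, $e_2(x_r) = \ln x_r$ and $\varphi(\theta) = -\ln\lambda$, and checks the derivative of $\ell$ with respect to $\lambda$ against its closed form by finite differences.

```python
import numpy as np

def loglik(lam, x, R):
    """Log-likelihood of a progressively censored Rayleigh sample, up to the constant log c(R):
    l = -lambda * sum_r (1+R_r) x_r^2 + sum_r ln(x_r) + m * ln(lambda),
    i.e., theta^1 sum_r e_1 + theta^2 sum_r e_2 - m*phi with theta^1 = -lambda, phi = -ln(lambda)."""
    x, R = np.asarray(x, float), np.asarray(R, float)
    m = len(x)
    return -lam * np.sum((1 + R) * x**2) + np.sum(np.log(x)) + m * np.log(lam)

# hypothetical progressively censored sample and scheme (for illustration only)
x = np.array([0.21, 0.35, 0.52, 0.63, 0.78, 0.94, 1.10, 1.32])
R = np.array([1, 1, 1, 1, 1, 1, 1, 5])          # n = 20, m = 8
lam, h = 1.5, 1e-6

numeric = (loglik(lam + h, x, R) - loglik(lam - h, x, R)) / (2 * h)
closed = -np.sum((1 + R) * x**2) + len(x) / lam   # closed-form derivative with respect to lambda
print(numeric, closed)                            # the two values should agree closely
```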

3. The α-Connections of the Manifold Model

In this section, we investigate the α -connection of the manifold of the statistical model for the exponential family of distributions with progressively Type-II censored data. From Equation (4), the normalization factor φ ( θ ) can be defined as
$$\varphi(\theta) = \frac{1}{m} \log \int c(R) \exp\left\{ \sum_{r=1}^{m} \sum_{i=1}^{\tau} \theta^i e_i(x^R_{r:m:n}) \right\} \mu(d x^R_{m:n}).$$
Since the integrand is assumed to be continuously differentiable, the order of integration and differentiation can be interchanged; hence, the first three derivatives of the function $\varphi(\theta)$ with respect to the natural parameter $\theta^i$ are given by
$$\partial_i \varphi(\theta) = \frac{1}{m} \sum_{r=1}^{m} E_L\left[ e_i(x^R_{r:m:n}) \right],$$  (7)
$$\partial_i \partial_j \varphi(\theta) = \frac{1}{m} E_L\left[ \left( \sum_{r=1}^{m} e_i(x^R_{r:m:n}) - m \partial_i \varphi(\theta) \right) \left( \sum_{r=1}^{m} e_j(x^R_{r:m:n}) - m \partial_j \varphi(\theta) \right) \right] = \frac{1}{m} E_L\left[ \partial_i \ell(x^R_{m:n};\theta)\, \partial_j \ell(x^R_{m:n};\theta) \right],$$  (8)
$$\partial_i \partial_j \partial_k \varphi(\theta) = \frac{1}{m} E_L\left[ \left( \sum_{r=1}^{m} e_i(x^R_{r:m:n}) - m \partial_i \varphi(\theta) \right) \left( \sum_{r=1}^{m} e_j(x^R_{r:m:n}) - m \partial_j \varphi(\theta) \right) \left( \sum_{r=1}^{m} e_k(x^R_{r:m:n}) - m \partial_k \varphi(\theta) \right) \right] = \frac{1}{m} E_L\left[ \partial_i \ell(x^R_{m:n};\theta)\, \partial_j \ell(x^R_{m:n};\theta)\, \partial_k \ell(x^R_{m:n};\theta) \right],$$  (9)
where the expectations $E_L[\cdot]$ are taken with respect to the joint density function in Equation (4). The derivatives in Equations (7)–(9) can be regarded as the expected value, the covariance and the third-order central moment of $\sum_{r=1}^{m} e_i(x^R_{r:m:n})$, each scaled by $1/m$, respectively. The derivative in Equation (7) can also be obtained from the condition
$$E_L\left[ \partial_i \ell(x^R_{m:n};\theta) \right] = 0,$$
that is,
$$0 = \partial_i \int L(x^R_{m:n};\theta)\, \mu(d x^R_{m:n}) = \int \partial_i L(x^R_{m:n};\theta)\, \mu(d x^R_{m:n}) = E_L\left[ \partial_i \ell(x^R_{m:n};\theta) \right].$$
The derivatives in Equations (8) and (9) can be obtained by calculating, respectively,
$$E_L\left[ \partial_i \partial_j \ell(x^R_{m:n};\theta) \right] \quad \text{and} \quad E_L\left[ \partial_i \partial_j \partial_k \ell(x^R_{m:n};\theta) \right].$$
Equations (8) and (9) show that the ( i , j ) element of the metric tensors is given by
$$g_{ij}(\theta) = E_L\left[ \partial_i \ell(x^R_{m:n};\theta)\, \partial_j \ell(x^R_{m:n};\theta) \right] = m \partial_i \partial_j \varphi(\theta),$$
the $(i,j,k)$ element of the skewness tensor is given by
$$T_{ijk}(\theta) = E_L\left[ \partial_i \ell(x^R_{m:n};\theta)\, \partial_j \ell(x^R_{m:n};\theta)\, \partial_k \ell(x^R_{m:n};\theta) \right] = m \partial_i \partial_j \partial_k \varphi(\theta),$$
and the $(i,j,k)$ element of the affine connection is given by
$$\Gamma_{ijk}(\theta) = E_L\left[ \partial_i \partial_j \ell(x^R_{m:n};\theta)\, \partial_k \ell(x^R_{m:n};\theta) \right] = -m \partial_i \partial_j \varphi(\theta)\, E_L\left[ \partial_k \ell(x^R_{m:n};\theta) \right] = 0.$$
Therefore, based on the joint density function $L(x^R_{m:n};\theta)$, the $\alpha$-connection of the manifold of an exponential family of distributions is given by
$$\Gamma^{\alpha}_{ijk}(\theta) = \frac{(1-\alpha)m}{2} \partial_i \partial_j \partial_k \varphi(\theta),$$
which means that the natural parameter $\theta$ is 1-affine, i.e., $\Gamma_{ijk} = 0$. Based on the information carried by the joint density function in Equation (4), we can obtain the following result.
Theorem 1.
The metric tensors and the α-connection of the exponential family of distributions are given by
$$g_{ij}(\theta) = m \partial_i \partial_j \varphi(\theta) \qquad \text{and} \qquad \Gamma^{\alpha}_{ijk}(\theta) = \frac{(1-\alpha)m}{2} \partial_i \partial_j \partial_k \varphi(\theta),$$
respectively.
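For a one-parameter member such as the Rayleigh model of Section 5, where $\varphi(\theta) = -\ln\lambda$, Theorem 1 can be checked symbolically. The sketch below is our own verification (SymPy is assumed); it is not part of the original derivation.

```python
import sympy as sp

lam, alpha, m = sp.symbols("lambda alpha m", positive=True)
phi = -sp.log(lam)                                    # phi(theta) for the Rayleigh member

g11 = m * sp.diff(phi, lam, 2)                        # Theorem 1: g_ij = m * d_i d_j phi
gamma_alpha = sp.Rational(1, 2) * (1 - alpha) * m * sp.diff(phi, lam, 3)   # alpha-connection

print(sp.simplify(g11))           # m/lambda**2
print(sp.simplify(gamma_alpha))   # m*(alpha - 1)/lambda**3
```

These outputs reproduce the expressions obtained for the Rayleigh example in Section 5.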
From the α -connection, we can obtain the torsion and the Riemann-Christoffel curvature of the manifold. The torsion is represented by the torsion tensor whose components are given by [1,2]
$$S_{ijk}(\theta) = \Gamma_{ijk}(\theta) - \Gamma_{jik}(\theta),$$
which is anti-symmetric with respect to the indices $i$ and $j$. Since the coefficients $\Gamma^{\alpha}_{ijk}(\theta)$ of the $\alpha$-connections are symmetric with respect to the first two indices $i$ and $j$, the tensor $S_{ijk}(\theta)$ vanishes for any $\alpha$-connection. This shows that the manifold of the statistical model of the exponential family of distributions with progressively Type-II censored data is torsion-free.
The Riemann-Christoffel curvature of the manifold can be obtained as [1,2]
$$R_{ijkm} = \left( \partial_i \Gamma^{s}_{jk} - \partial_j \Gamma^{s}_{ik} \right) g_{sm} + \left( \Gamma_{irm} \Gamma^{r}_{jk} - \Gamma_{jrm} \Gamma^{r}_{ik} \right),$$
where $\Gamma^{k}_{ij} = g^{km} \Gamma_{ijm}$. The Riemann-Christoffel curvature based on the $\alpha$-connection is called the $\alpha$-Riemann-Christoffel curvature, and its tensor is given by
$$R^{\alpha}_{ijkm} = \left( \partial_i \Gamma^{\alpha,s}_{jk} - \partial_j \Gamma^{\alpha,s}_{ik} \right) g_{sm} + \left( \Gamma^{\alpha}_{irm} \Gamma^{\alpha,r}_{jk} - \Gamma^{\alpha}_{jrm} \Gamma^{\alpha,r}_{ik} \right),$$
where $\Gamma^{\alpha,k}_{ij} = g^{km} \Gamma^{\alpha}_{ijm}$. The manifold is said to be $\alpha$-flat if the $\alpha$-Riemann-Christoffel curvature $R^{\alpha}_{ijkm} = 0$. The $\alpha$-covariant derivative and the Laplace operator can also be obtained from the $\alpha$-connection and the metric tensors.
In the above process for obtaining these geometric quantities, we only used the information from the joint density function $L(x^R_{m:n};\theta)$. There is, in fact, another kind of information in the progressively Type-II censored order statistics $x^R_{r:m:n}$ ($r = 1, \ldots, m$). We can consider the marginal density function of the $r$-th progressively Type-II censored order statistic, $x^R_{r:m:n}$ (see, for example, Kamps and Cramer [17], Balakrishnan [18], Balakrishnan and Aggarwala [8], and Balakrishnan and Cramer [10]),
$$f_{x^R_{r:m:n}}(x) = c_{r-1} \sum_{s=1}^{r} a_{s,r}\, f(x) \left( 1 - F(x) \right)^{\gamma_s - 1}, \quad x > 0, \quad r = 1, \ldots, m,$$  (10)
where
$$\gamma_s = n - s + 1 - \sum_{r=1}^{s-1} R_r \ \ \text{for } s = 1, \ldots, m, \qquad c_{r-1} = \prod_{s=1}^{r} \gamma_s \ \ \text{for } r = 1, \ldots, m, \qquad a_{s,r} = \prod_{k=1,\, k \neq s}^{r} \frac{1}{\gamma_k - \gamma_s} \ \ \text{for } 1 \le s \le r \le m, \ \ \text{with } a_{1,1} = 1.$$
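These coefficients are straightforward to compute for any admissible scheme. The helper below is our own sketch (NumPy assumed; the function name is ours), shown only to make the definitions of $\gamma_s$, $c_{r-1}$ and $a_{s,r}$ concrete.

```python
import numpy as np

def pcs_coefficients(R, n):
    """gamma_s, c_{r-1} and a_{s,r} of the marginal density in Equation (10)."""
    m = len(R)
    gamma = np.array([n - s + 1 - sum(R[: s - 1]) for s in range(1, m + 1)], dtype=float)
    c = np.array([np.prod(gamma[:r]) for r in range(1, m + 1)])          # c_0, c_1, ..., c_{m-1}
    a = {}
    for r in range(1, m + 1):
        for s in range(1, r + 1):
            factors = [1.0 / (gamma[k - 1] - gamma[s - 1]) for k in range(1, r + 1) if k != s]
            a[(s, r)] = float(np.prod(factors)) if factors else 1.0      # a_{1,1} = 1
    return gamma, c, a

# Example: the scheme R = (0, 0, 3, 0, 3, 0, 0, 5) with n = 19 items (as in Table 2)
gamma, c, a = pcs_coefficients([0, 0, 3, 0, 3, 0, 0, 5], n=19)
print(gamma)                   # [19. 18. 17. 13. 12.  8.  7.  6.]
print(c[0], c[1], c[2])        # gamma_1, gamma_1*gamma_2, gamma_1*gamma_2*gamma_3
print(a[(1, 2)], a[(2, 2)])    # 1/(gamma_2 - gamma_1) and 1/(gamma_1 - gamma_2)
```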
Based on the marginal density in Equation (10), the expectation of the random variable $e_i(x^R_{r:m:n})$ ($r = 1, \ldots, m$) defined in Equation (5) can be obtained as
$$h_{i,r}(\theta) \stackrel{\mathrm{def}}{=} E_f\left[ e_i(x^R_{r:m:n}) \right] = \int e_i(x)\, f_{x^R_{r:m:n}}(x)\, \mu(dx) = c_{r-1} \sum_{s=1}^{r} a_{s,r} \int e_i(x) \exp\left\{ \sum_{i=1}^{\tau} \theta^i e_{i,\gamma_s}(x) - \varphi_{\gamma_s}(\theta) \right\} \mu(dx),$$
where $\theta^i e_{i,\gamma_s}(x) - \varphi_{\gamma_s}(\theta) = \alpha_i(\theta) c_i(x) + (\gamma_s - 1) \beta_i(\theta) d_i(x) - \psi(\theta) - (\gamma_s - 1) \phi(\theta)$ and the expectation $E_f[\cdot]$ is taken with respect to the density function in Equation (10). Suppose that the random variables $e_i(x^R_{r:m:n})$ ($r = 1, \ldots, m$) are independent, and let
$$h_i(\theta) = \sum_{r=1}^{m} h_{i,r}(\theta) = \sum_{r=1}^{m} \sum_{s=1}^{r} c_{r-1}\, a_{s,r} \int e_i(x) \exp\left\{ \sum_{i=1}^{\tau} \theta^i e_{i,\gamma_s}(x) - \varphi_{\gamma_s}(\theta) \right\} \mu(dx),$$
we can obtain the following results.
Theorem 2.
The Fisher metric tensors and the α-connection of the exponential family of distributions with progressively Type-II censored data are given by
$$\tilde{g}_{ij}(\theta) = g_{ij}(\theta) = m \partial_i \partial_j \varphi(\theta), \qquad \tilde{\Gamma}^{\alpha}_{ijk}(\theta) = g_{ij}(\theta)\left( m \partial_k \varphi(\theta) - h_k(\theta) \right) + \frac{1-\alpha}{2} \prod_{l = i,j,k} \left( h_l(\theta) - m \partial_l \varphi(\theta) \right),$$
respectively.
Proof. 
The metric tensors can be obtained directly from the definition of $g_{ij}(\theta)$. For the affine connection, using $E\left[ \partial_i \ell(x^R_{m:n};\theta) \right] = \sum_{r=1}^{m} E_f\left[ e_i(x^R_{r:m:n}) \right] - m \partial_i \varphi(\theta) = h_i(\theta) - m \partial_i \varphi(\theta)$ together with Equations (7) and (8), we can obtain
$$\tilde{\Gamma}_{ijk}(\theta) = E\left[ \partial_i \partial_j \ell(x^R_{m:n};\theta)\, \partial_k \ell(x^R_{m:n};\theta) \right] = -m \partial_i \partial_j \varphi(\theta)\, E\left[ \partial_k \ell(x^R_{m:n};\theta) \right] = g_{ij}(\theta)\left( m \partial_k \varphi(\theta) - h_k(\theta) \right).$$
Then, the third-order tensor T i j k ( θ ) can be specified as
$$\tilde{T}_{ijk}(\theta) = E\left[ \partial_i \ell(x^R_{m:n};\theta)\, \partial_j \ell(x^R_{m:n};\theta)\, \partial_k \ell(x^R_{m:n};\theta) \right] = \prod_{l = i,j,k} \left( h_l(\theta) - m \partial_l \varphi(\theta) \right). \qquad \square$$

4. Applications in Bayesian Predictive Inference and Asymptotic Expansions

In Bayesian inference for the exponential family of distributions, the parameter vector θ is considered as a random variable. Given a prior density function for θ , π ( θ ) , the joint posterior density function of the exponential family of distributions with progressively Type-II censored data can be expressed as
$$f_\pi(\theta \mid x^R_{m:n}) = \frac{ L(x^R_{m:n};\theta)\, \pi(\theta) }{ \int L(x^R_{m:n};\theta)\, \pi(\theta)\, d\theta },$$  (11)
and the posterior Bayesian predictive distribution is given by
$$\hat{f}_\pi(x \mid x^R_{m:n}) = \int f(x;\theta)\, f_\pi(\theta \mid x^R_{m:n})\, d\theta,$$  (12)
where $x$ is an unobserved observation to be predicted, distributed independently according to the same density $f(x;\theta) \in \mathcal{F}$. The density $\hat{f}(x \mid \hat{\theta}) = f(x;\hat{\theta})$ is called the plug-in density function or the estimative density function, where $\hat{\theta} = \hat{\theta}(x^R_{m:n})$ is an estimate of $\theta$ based on the observed progressively Type-II censored sample $x^R_{m:n}$ (see, for example, Geisser [19]). Taking the Kullback-Leibler divergence as the loss function, the predictive distribution in Equation (12) is the best predictive distribution in the sense that it minimizes the Bayes risk defined as [20]
$$\int \pi(\theta) \int\!\!\int f(x^R_{m:n};\theta)\, f(x;\theta) \log \frac{ f(x;\theta) }{ \hat{f}(x \mid x^R_{m:n}) }\, \mu(dx)\, \mu(d x^R_{m:n})\, d\theta.$$
The integral that defines the predictive density in Equation (12) can be difficult to evaluate, or its form may be too complicated to be used in practice. In these situations, asymptotic or large-sample theory (see, for example, Barndorff-Nielsen and Cox [21]) can be considered. In this section, we use the metric tensors and the α-connection introduced in Section 2 and Section 3 to study the asymptotic expansions of the posterior density and the Bayesian predictive density of the exponential family of distributions with progressively Type-II censored data. A similar asymptotic expansion of Bayesian prediction based on a complete sample can be found in Zhang et al. [14]. For simplicity, we only consider the information carried by the joint density function in Equation (4); a similar process can be applied when the information from the joint density function in Equation (4) and the marginal density function in Equation (10) is used together.
Theorem 3.
Given a prior distribution π ( θ ) for θ, the posterior distribution in Equation (11) can be expressed asymptotically as
$$f_\pi(\theta \mid x^R_{m:n}) = \frac{\sqrt{\det\left(g_{ij}(\hat{\theta})\right)}}{(2\pi)^{k/2}} \exp\left\{ -\frac{1}{2} g_{ij}(\hat{\theta})\, \tilde{\theta}^i \tilde{\theta}^j \right\} \left[ 1 - \frac{1}{6} T_{ijk}(\hat{\theta})\, \tilde{\theta}^i \tilde{\theta}^j \tilde{\theta}^k + \left( \partial_i \log \pi(\hat{\theta}) \right) \tilde{\theta}^i + o\!\left(\tfrac{1}{n}\right) \right],$$
where $\tilde{\theta}^i = \theta^i - \hat{\theta}^i$ and $\hat{\theta}$ is an estimator of the parameter vector $\theta$.
Proof. 
Using the Laplace method suggested by Barndorff-Nielsen and Cox [21], the posterior distribution can be expressed asymptotically as
$$f_\pi(\theta \mid x^R_{m:n}) = \frac{\sqrt{\det\left( -\partial_i \partial_j \ell(x^R_{m:n};\hat{\theta}) \right)}}{(2\pi)^{k/2}} \exp\left\{ \frac{1}{2} \partial_i \partial_j \ell(x^R_{m:n};\hat{\theta})\, \tilde{\theta}^i \tilde{\theta}^j \right\} \times \left[ 1 + \frac{1}{6} \partial_i \partial_j \partial_k \ell(x^R_{m:n};\hat{\theta})\, \tilde{\theta}^i \tilde{\theta}^j \tilde{\theta}^k + \left( \partial_i \log \pi(\hat{\theta}) \right) \tilde{\theta}^i + o\!\left(\tfrac{1}{n}\right) \right].$$
We have
$$-\partial_i \partial_j \ell(x^R_{m:n};\theta) = m \partial_i \partial_j \varphi(\theta) = g_{ij}(\theta), \qquad -\partial_i \partial_j \partial_k \ell(x^R_{m:n};\theta) = m \partial_i \partial_j \partial_k \varphi(\theta) = T_{ijk}(\theta),$$
which implies that
$$f_\pi(\theta \mid x^R_{m:n}) = \frac{\sqrt{\det\left(g_{ij}(\hat{\theta})\right)}}{(2\pi)^{k/2}} \exp\left\{ -\frac{1}{2} g_{ij}(\hat{\theta})\, \tilde{\theta}^i \tilde{\theta}^j \right\} \left[ 1 - \frac{1}{6} T_{ijk}(\hat{\theta})\, \tilde{\theta}^i \tilde{\theta}^j \tilde{\theta}^k + \left( \partial_i \log \pi(\hat{\theta}) \right) \tilde{\theta}^i + o\!\left(\tfrac{1}{n}\right) \right]. \qquad \square$$
Based on the asymptotic expansion presented in Theorem 3, we can obtain the following result.
Theorem 4.
Given a prior distribution π ( θ ) for θ, the predictive distribution in Equation (12) can be expressed asymptotically as
$$\hat{f}_\pi(x \mid x^R_{m:n}) = f(x;\hat{\theta}) + \frac{1}{2n} g^{ij}(\hat{\theta}) \left[ \partial_i \partial_j \psi(\hat{\theta}) + \Gamma^{m,l}_{ij}(\hat{\theta}) \left( c_l(x) - \partial_l \psi(\hat{\theta}) \right) \right] + \frac{1}{n} \left[ \partial_i \log \pi(\hat{\theta}) - \Gamma^{e,j}_{ij}(\hat{\theta}) \right] g^{il}(\hat{\theta}) \left( c_l(x) - \partial_l \psi(\hat{\theta}) \right) + o(n^{-1}).$$
Proof. 
The proof is similar to the proof of Theorem 2 in Komaki [3]. The proof can be completed by substituting $\partial_i \partial_j f(x;\hat{\theta})$ and $\partial_i f(x;\hat{\theta})$ with $\partial_i \partial_j \psi(\hat{\theta})$ and $c_i(x) - \partial_i \psi(\hat{\theta})$, respectively. □
If the prior distribution $\pi(\theta)$ is the Jeffreys prior $\pi_J(\theta) \propto |g_{ij}(\theta)|^{1/2}$, then from the relationship
$$\partial_i \log \pi(\theta) = \partial_i \log \left| g_{ij}(\theta) \right|^{1/2} = \frac{1}{2}\, \partial_i g_{jk}(\theta)\, g^{jk}(\theta) = \Gamma^{j}_{ij}(\theta) = \Gamma^{e,j}_{ij}(\theta) + \frac{1}{2} T_i(\theta),$$
we have
$$\partial_i \log \pi(\theta) - \Gamma^{e,j}_{ij}(\theta) = \frac{1}{2} T_i(\theta).$$
The following results can be immediately obtained.
Corollary 1.
Given the Jeffreys prior $\pi_J(\theta) \propto |g_{ij}(\theta)|^{1/2}$, the posterior distribution in Equation (11) can be asymptotically expanded as
$$f_{\pi_J}(\theta \mid x^R_{m:n}) = \frac{\sqrt{\det\left(g_{ij}(\hat{\theta})\right)}}{(2\pi)^{k/2}} \exp\left\{ -\frac{1}{2} g_{ij}(\hat{\theta})\, \tilde{\theta}^i \tilde{\theta}^j \right\} \left[ 1 - \frac{1}{6} T_{ijk}(\hat{\theta})\, \tilde{\theta}^i \tilde{\theta}^j \tilde{\theta}^k + \left( \Gamma^{e,j}_{ij}(\hat{\theta}) + \frac{1}{2} T_i(\hat{\theta}) \right) \tilde{\theta}^i + o\!\left(\tfrac{1}{n}\right) \right].$$
Corollary 2.
Given the Jeffreys prior $\pi_J(\theta) \propto |g_{ij}(\theta)|^{1/2}$, the predictive distribution in Equation (12) can be asymptotically expanded as
$$\hat{f}_{\pi_J}(x \mid x^R_{m:n}) = f(x;\hat{\theta}) - \frac{1}{2n} g^{ij}(\hat{\theta}) \left[ \partial_i \partial_j \psi(\hat{\theta}) + \Gamma^{m,l}_{ij}(\hat{\theta}) \left( c_l(x) - \partial_l \psi(\hat{\theta}) \right) \right] + \frac{1}{2n} T_i(\hat{\theta})\, g^{il}(\hat{\theta}) \left( c_l(x) - \partial_l \psi(\hat{\theta}) \right) + o(n^{-1}).$$
These results show that, as the sample size $n$ approaches infinity, the predictive density function coincides with the estimative density function in the asymptotic sense.

5. Illustration Example

An illustration of the geometric quantities for the exponential distribution has been provided in the literature (see, for example, [12]). In this section, we use the Rayleigh distribution, a member of the exponential family of distributions presented in Section 2, as an example to illustrate our results. Suppose that $x^R_{m:n}$ are the progressively Type-II censored order statistics from items whose lifetimes follow the Rayleigh distribution with density function in Equation (3). Then the joint density function of $x^R_{m:n}$ can be expressed as
$$L(x^R_{m:n};\theta) = c(R) \prod_{r=1}^{m} f(x^R_{r:m:n};\theta) \left[ 1 - F(x^R_{r:m:n};\theta) \right]^{R_r} = c(R)\, 2^m \exp\left\{ \sum_{r=1}^{m} \sum_{i=1}^{\tau} \theta^i e_i(x^R_{r:m:n}) - m \varphi(\theta) \right\},$$
where $\tau = 2$, $\theta^1 = -\lambda$, $\theta^2 = 1$, $e_1(x^R_{r:m:n}) = (1 + R_r)\left( x^R_{r:m:n} \right)^2$, $e_2(x^R_{r:m:n}) = \ln x^R_{r:m:n}$ and $\varphi(\theta) = \psi(\theta) = -\ln(\lambda)$. Let
$$\ell(x^R_{m:n};\theta) = \log L(x^R_{m:n};\theta) = \log\left( c(R)\, 2^m \right) + \sum_{r=1}^{m} \sum_{i=1}^{\tau} \theta^i e_i(x^R_{r:m:n}) - m \varphi(\theta).$$
Then, the first three derivatives of the function $\ell(x^R_{m:n};\theta)$ with respect to $\lambda$ can be obtained as
$$\partial_1 \ell(x^R_{m:n};\theta) = -\sum_{r=1}^{m} (1 + R_r)\left( x^R_{r:m:n} \right)^2 + \frac{m}{\lambda}, \qquad \partial_1 \partial_1 \ell(x^R_{m:n};\theta) = -\frac{m}{\lambda^2}, \qquad \partial_1 \partial_1 \partial_1 \ell(x^R_{m:n};\theta) = \frac{2m}{\lambda^3}.$$
The maximum likelihood estimator (MLE) of the parameter λ can be derived as
$$\hat{\lambda} = \frac{m}{\sum_{r=1}^{m} (1 + R_r)\left( x^R_{r:m:n} \right)^2}.$$
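For a hypothetical progressively censored sample, this estimator is a one-line computation. The sketch below is our own illustration (NumPy assumed; the data and scheme are invented for demonstration only).

```python
import numpy as np

# hypothetical progressively Type-II censored Rayleigh sample and scheme (illustration only)
x = np.array([0.21, 0.35, 0.52, 0.63, 0.78, 0.94, 1.10, 1.32])
R = np.array([1, 1, 1, 1, 1, 1, 1, 5])                 # n = 20, m = 8

m = len(x)
lam_hat = m / np.sum((1 + R) * x**2)                   # MLE from the expression above
x0 = 1.0
print(lam_hat)                                         # estimate of lambda
print(2 * lam_hat * x0 * np.exp(-lam_hat * x0**2))     # plug-in (estimative) density f(x0; lam_hat)
```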
We first consider the information carried by the joint density in Equation (4). The metric tensors have one element, that is,
$$g_{11}(\theta) = m \partial_1 \partial_1 \varphi(\theta) = \frac{m}{\lambda^2}.$$  (13)
The skewness tensor can be written as
$$T_{111}(\theta) = m \partial_1 \partial_1 \partial_1 \varphi(\theta) = -\frac{2m}{\lambda^3}.$$  (14)
The affine connection and the α -connection can be obtained as
$$\Gamma_{111}(\theta) = -m \partial_1 \partial_1 \varphi(\theta)\, E_L\left[ \partial_1 \ell(x^R_{m:n};\theta) \right] = 0, \qquad \Gamma^{\alpha}_{111}(\theta) = \frac{(1-\alpha)m}{2} \partial_1 \partial_1 \partial_1 \varphi(\theta) = \frac{(\alpha - 1)m}{\lambda^3},$$
respectively. The coefficients of the m-connection and e-connection are
$$\Gamma^{m,1}_{11}(\theta) = \Gamma^{m}_{111}(\theta)\, g^{11}(\theta) = -\frac{2}{\lambda}, \qquad \Gamma^{e,1}_{11}(\theta) = \Gamma^{e}_{111}(\theta)\, g^{11}(\theta) = 0,$$
respectively. For Bayesian inference, we consider the Jeffreys prior for the parameter λ , i.e.,
$$\pi_J(\theta) \propto \frac{\sqrt{m}}{\lambda},$$
then the posterior distribution of λ is
$$f_{\pi_J}(\theta \mid x^R_{m:n}) = \frac{ \exp\left\{ \sum_{r=1}^{m} \sum_{i=1}^{\tau} \theta^i e_i(x^R_{r:m:n}) - (m-1)\varphi(\theta) \right\} }{ \int \exp\left\{ \sum_{r=1}^{m} \sum_{i=1}^{\tau} \theta^i e_i(x^R_{r:m:n}) - (m-1)\varphi(\theta) \right\} d\theta },$$
which can be written as
$$f_{\pi_J}(\theta \mid x^R_{m:n}) = \frac{\sqrt{\det\left(g_{ij}(\hat{\theta})\right)}}{(2\pi)^{k/2}} \exp\left\{ -\frac{1}{2} g_{ij}(\hat{\theta})\, \tilde{\theta}^i \tilde{\theta}^j \right\} \left[ 1 - \frac{1}{6} T_{ijk}(\hat{\theta})\, \tilde{\theta}^i \tilde{\theta}^j \tilde{\theta}^k + \left( \Gamma^{e,j}_{ij}(\hat{\theta}) + \frac{1}{2} T_i(\hat{\theta}) \right) \tilde{\theta}^i + o\!\left(\tfrac{1}{n}\right) \right] = \frac{\sqrt{m}}{\hat{\lambda}\sqrt{2\pi}} \exp\left\{ -\frac{m}{2\hat{\lambda}^2} (\lambda - \hat{\lambda})^2 \right\} \left[ 1 + \frac{m}{3\hat{\lambda}^3} (\lambda - \hat{\lambda})^3 - \frac{1}{\hat{\lambda}} (\lambda - \hat{\lambda}) + o(n^{-1}) \right].$$
Here, the predictive distribution is
$$\hat{f}_{\pi_J}(x \mid x^R_{m:n}) \propto \int c(R)\, 2^m \exp\left\{ \sum_{r=1}^{m} \sum_{i=1}^{\tau} \theta^i e_i(x^R_{r:m:n}) + \theta^1 x^2 + \ln(x) - (m+1)\varphi(\theta) \right\} d\theta,$$
which can be expanded asymptotically as
$$\hat{f}_{\pi_J}(x \mid x^R_{m:n}) = f(x;\hat{\theta}) - \frac{1}{2n} g^{ij}(\hat{\theta}) \left[ \partial_i \partial_j \psi(\hat{\theta}) + \Gamma^{m,l}_{ij}(\hat{\theta}) \left( c_l(x) - \partial_l \psi(\hat{\theta}) \right) \right] + \frac{1}{2n} T_i(\hat{\theta})\, g^{il}(\hat{\theta}) \left( c_l(x) - \partial_l \psi(\hat{\theta}) \right) + o(n^{-1}) = f(x;\hat{\lambda}) + \frac{1}{4mn} + o(n^{-1}),$$
where the second equality follows from substituting $g^{11}(\hat{\theta}) = \hat{\lambda}^2/m$, $T_1(\hat{\theta})$, $\Gamma^{m,1}_{11}(\hat{\theta})$, $c_1(x) = x^2$ and the derivatives of $\psi$ evaluated at $\hat{\lambda}$.
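The sketch below is our own numerical illustration (NumPy and SciPy assumed, with a hypothetical data set). It compares the plug-in density $f(x;\hat{\lambda})$, the Bayesian predictive density under the Jeffreys prior obtained by numerical integration of Equation (12), and the approximation $f(x;\hat{\lambda}) + 1/(4mn)$ from the expansion above.

```python
import numpy as np
from scipy.integrate import quad

# hypothetical progressively censored sample, scheme and sample sizes (illustration only)
x_obs = np.array([0.21, 0.35, 0.52, 0.63, 0.78, 0.94, 1.10, 1.32])
R = np.array([1, 1, 1, 1, 1, 1, 1, 5])
n, m = 20, len(x_obs)
S = np.sum((1 + R) * x_obs**2)                 # sufficient statistic sum_r (1+R_r) x_r^2
lam_hat = m / S                                # MLE of lambda

def f_rayleigh(x, lam):
    return 2.0 * lam * x * np.exp(-lam * x**2)

def predictive_jeffreys(x):
    # Under pi_J(lambda) proportional to 1/lambda the posterior is proportional to
    # lambda^(m-1) * exp(-lambda*S); Equation (12) is evaluated by quadrature over lambda.
    post = lambda lam: lam**(m - 1) * np.exp(-lam * S)
    num, _ = quad(lambda lam: f_rayleigh(x, lam) * post(lam), 0.0, np.inf)
    den, _ = quad(post, 0.0, np.inf)
    return num / den

x0 = 1.0
print(f_rayleigh(x0, lam_hat))                       # plug-in (estimative) density
print(predictive_jeffreys(x0))                       # Bayesian predictive density (numerical)
print(f_rayleigh(x0, lam_hat) + 1.0 / (4 * m * n))   # asymptotic approximation above
```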
In the following, we consider the information obtained from the marginal density function in Equation (10) and the joint density function in Equation (4) together. Notice that
$$f_{x^R_{r:m:n}}(x) = c_{r-1} \sum_{s=1}^{r} a_{s,r}\, f(x) \left( 1 - F(x) \right)^{\gamma_s - 1} = c_{r-1} \sum_{s=1}^{r} a_{s,r}\, 2\lambda x \exp\left\{ -\lambda \gamma_s x^2 \right\}, \quad r = 1, \ldots, m.$$
The $n$th-order moment about the origin of the $r$-th progressively Type-II censored order statistic is given by
$$E_f\left[ \left( x^R_{r:m:n} \right)^n \right] = \int x^n f_{x^R_{r:m:n}}(x)\, \mu(dx) = c_{r-1} \sum_{s=1}^{r} a_{s,r} \frac{\lambda\, \Gamma\!\left( \frac{n}{2} + 1 \right)}{ \left( \lambda \gamma_s \right)^{\frac{n}{2} + 1} },$$
which implies
$$E_f\left[ \partial_1 \ell(x^R_{m:n};\theta) \right] = -\sum_{r=1}^{m} \sum_{s=1}^{r} \frac{(1 + R_r)\, c_{r-1}\, a_{s,r}}{\lambda \gamma_s^2} + \frac{m}{\lambda}.$$
Thus, the affine connection is specified as
$$\tilde{\Gamma}_{111}(\theta) = E_f\left[ \partial_1 \partial_1 \ell(x^R_{m:n};\theta)\, \partial_1 \ell(x^R_{m:n};\theta) \right] = \frac{m}{\lambda^3} \sum_{r=1}^{m} \sum_{s=1}^{r} \frac{(1 + R_r)\, c_{r-1}\, a_{s,r}}{\gamma_s^2} - \frac{m^2}{\lambda^3}.$$
The metric tensor $\tilde{g}_{11}(\theta)$ and the skewness tensor $\tilde{T}_{111}(\theta)$ are the same as the expressions in Equations (13) and (14), respectively. The $\alpha$-connection reduces to
$$\tilde{\Gamma}^{\alpha}_{111}(\theta) = \frac{m}{\lambda^3} \sum_{r=1}^{m} \sum_{s=1}^{r} \frac{(1 + R_r)\, c_{r-1}\, a_{s,r}}{\gamma_s^2} - \frac{m^2}{\lambda^3} - \frac{(1-\alpha)m}{\lambda^3}.$$
The coefficients of the m-connection and the e-connection are
$$\tilde{\Gamma}^{m,1}_{11}(\theta) = \frac{1}{\lambda} \sum_{r=1}^{m} \sum_{s=1}^{r} \frac{(1 + R_r)\, c_{r-1}\, a_{s,r}}{\gamma_s^2} - \frac{2 + m}{\lambda}, \quad \text{and} \quad \tilde{\Gamma}^{e,1}_{11}(\theta) = \frac{1}{\lambda} \sum_{r=1}^{m} \sum_{s=1}^{r} \frac{(1 + R_r)\, c_{r-1}\, a_{s,r}}{\gamma_s^2} - \frac{m}{\lambda},$$
respectively. Therefore, based on the Jeffreys prior $\pi_J(\theta) \propto \sqrt{m}/\lambda$, the Bayesian predictive density function of the Rayleigh distribution with progressively Type-II censored data can be asymptotically expanded as
$$\hat{f}_{\pi_J}(x \mid x^R_{m:n}) = f(x;\hat{\theta}) - \frac{1}{2n} \tilde{g}^{ij}(\hat{\theta}) \left[ \partial_i \partial_j \psi(\hat{\theta}) + \tilde{\Gamma}^{m,l}_{ij}(\hat{\theta}) \left( c_l(x) - \partial_l \psi(\hat{\theta}) \right) \right] + \frac{1}{2n} \tilde{T}_i(\hat{\theta})\, g^{il}(\hat{\theta}) \left( c_l(x) - \partial_l \psi(\hat{\theta}) \right) + o(n^{-1}) = f(x;\hat{\lambda}) + \frac{1}{4mn} \left[ 1 + \left( 2\hat{\lambda} x^2 - 1 \right) \left( \sum_{r=1}^{m} \sum_{s=1}^{r} \frac{(1 + R_r)\, c_{r-1}\, a_{s,r}}{\gamma_s^2} - m \right) \right] + o(n^{-1}).$$
This shows that, as the sample size $n$ and the number of observed failures $m$ increase, the predictive density function approaches the estimative density function in the asymptotic sense. The term
$$\frac{2\hat{\lambda} x^2 - 1}{4mn} \left( \sum_{r=1}^{m} \sum_{s=1}^{r} \frac{(1 + R_r)\, c_{r-1}\, a_{s,r}}{\gamma_s^2} - m \right)$$
can be considered the correction term due to the information carried by the density function in Equation (10).
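As a numerical sanity check of the marginal-density calculations underlying this correction, the following sketch (our own code; NumPy assumed, with a hypothetical λ, sample size and scheme) compares the closed-form moment formula above (moment order denoted n there, written k in the code to avoid clashing with the sample size) with a Monte Carlo estimate obtained by simulating the censored experiment directly.

```python
import numpy as np
from math import gamma as gamma_fn

rng = np.random.default_rng(7)
lam, n, m = 2.0, 10, 5
R = [1, 1, 1, 1, 1]                 # hypothetical scheme with sum(R) = n - m
r, k = 3, 2                         # k-th moment of the r-th censored order statistic

# coefficients gamma_s, c_{r-1}, a_{s,r} of Equation (10)
g = [n - s + 1 - sum(R[: s - 1]) for s in range(1, m + 1)]
c_rm1 = float(np.prod(g[:r]))
a = [float(np.prod([1.0 / (g[j - 1] - g[s - 1]) for j in range(1, r + 1) if j != s]))
     for s in range(1, r + 1)]

# closed form: E[x_{r:m:n}^k] = c_{r-1} * sum_s a_{s,r} * lam * Gamma(k/2+1) / (lam*gamma_s)^(k/2+1)
closed = c_rm1 * sum(a[s - 1] * lam * gamma_fn(k / 2 + 1) / (lam * g[s - 1]) ** (k / 2 + 1)
                     for s in range(1, r + 1))

def one_censored_sample():
    alive = np.sort(np.sqrt(rng.exponential(scale=1.0 / lam, size=n)))   # Rayleigh lifetimes
    obs = []
    for r_j in R:
        obs.append(alive[0])
        survivors = alive[1:]
        alive = np.delete(survivors, rng.choice(len(survivors), size=r_j, replace=False))
    return obs

mc = np.mean([one_censored_sample()[r - 1] ** k for _ in range(100_000)])
print(closed, mc)    # the two values should be close
```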

6. Monte Carlo Simulation Study and Real Data Analysis

In this section, we present a Monte Carlo simulation study of the Bayesian prediction based on progressively Type-II censored data described in Section 4. We also present a real data analysis based on progressively Type-II censored data discussed in the literature. In the Monte Carlo simulation study, we consider the sample sizes $(m, n) = (10, 30)$, $(10, 35)$, $(15, 40)$ and $(20, 40)$ and three different censoring schemes:
$R^1$: $R_1 = R_2 = \cdots = R_{m-1} = 0$, $R_m = n - m$;
$R^2$: $R_1 = n - m$, $R_2 = R_3 = \cdots = R_m = 0$;
$R^3$: $R_1 = \cdots = R_{m-1} = 1$, $R_m = n - 2m + 1$.
The progressively Type-II censored data $x^R_{m:n}$ are generated from the Rayleigh distribution in Equation (3) with parameter $\lambda = 2$ for the different sample sizes and censoring schemes. For the proposed Bayesian prediction (BP), we consider two different priors: (i) the Jeffreys prior $\pi_J(\theta) \propto \sqrt{m}/\lambda$; and (ii) the uniform prior $\pi_I$ on the interval $(0, 3)$. For comparative purposes, we also consider the plug-in prediction (PP) approach, in which the estimative density function $\hat{f}(x;\hat{\lambda})$ is used. For the plug-in approach, the parameter is estimated by the maximum likelihood method based on the simulated progressively Type-II censored sample $x^R_{m:n}$. The estimated biases and mean square errors (MSEs) of the different prediction approaches for predicting the probability density at $x = 2.5$, based on 10,000 simulations, are presented in Table 1; a sketch of the simulation skeleton is given below.
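The skeleton of this simulation can be sketched as follows (our own code, not the authors'; NumPy assumed). Only the plug-in predictor is shown; the Bayesian predictors are handled analogously using the expansions of Section 4. The exact figures depend on parametrization conventions and need not reproduce the values in Table 1.

```python
import numpy as np

rng = np.random.default_rng(2021)

def censored_rayleigh_sample(n, R, lam):
    """One progressively Type-II censored sample from the Rayleigh model (X^2 ~ Exp(lam))."""
    alive = np.sort(np.sqrt(rng.exponential(scale=1.0 / lam, size=n)))
    obs = []
    for r_j in R:
        obs.append(alive[0])
        survivors = alive[1:]
        alive = np.delete(survivors, rng.choice(len(survivors), size=r_j, replace=False))
    return np.array(obs)

def f_rayleigh(x, lam):
    return 2.0 * lam * x * np.exp(-lam * x**2)

lam_true, x0, m, n = 2.0, 2.5, 10, 30
R = [0] * (m - 1) + [n - m]                   # scheme R^1: all removals at the last failure
target = f_rayleigh(x0, lam_true)

preds = []
for _ in range(10_000):
    x = censored_rayleigh_sample(n, R, lam_true)
    lam_hat = m / np.sum((1 + np.array(R)) * x**2)     # MLE of lambda
    preds.append(f_rayleigh(x0, lam_hat))              # plug-in prediction of the density at x0

preds = np.array(preds)
print("bias:", preds.mean() - target, "MSE:", np.mean((preds - target) ** 2))
```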
From Table 1, we observe that the performance of all prediction methods improves in terms of MSE as the sample sizes $m$ and $n$ increase; in other words, the number of items removed during the progressively Type-II censored experiment affects the performance of the prediction. Moreover, we observe that the Bayesian prediction method with the Jeffreys prior provides smaller biases and smaller MSEs than the plug-in prediction method in some cases.
To illustrate the practical applications of the approximation methods based on the geometric quantities proposed in this paper, we analyze a real data set containing the times to breakdown of an insulating fluid at 34 kV, originally presented in Nelson [22] (Table 6.1). A progressively Type-II censored sample of size $m = 8$, generated from the $n = 19$ observations by Viveros and Balakrishnan [9], is analyzed here. The progressively censored sample and the progressive censoring scheme are presented in Table 2.
Suppose that the lifetimes of the insulating fluid tested at 34 kV follow a Rayleigh distribution and that we are interested in predicting the probability density based on the progressively Type-II censored data presented in Table 2. The predicted density curves obtained from the plug-in prediction approach and from the proposed Bayesian prediction approach with the two different priors are presented in Figure 1. From Figure 1, we observe that the three prediction methods provide similar predicted density curves in this case. For instance, for predicting the density at $x = 2.8$ based on the data in Table 2, the predicted value of the plug-in prediction density $\hat{f}(x;\hat{\lambda})$ is 0.230, and the Bayesian prediction densities $\hat{f}_{\pi_J}(x \mid x^R_{m:n})$ with the Jeffreys prior $\pi_J$ and $\hat{f}_{\pi_I}(x \mid x^R_{m:n})$ with the uniform prior $\pi_I$ are 0.229 and 0.232, respectively.

7. Conclusions

In this paper, we discussed the tangent space, affine connection, α -connection, torsion and Riemann-Christoffel curvature of statistical manifold induced by the exponential family of distributions. As applications of these geometric quantities, the asymptotic expansions of the Bayesian posterior distribution and prediction function with progressively Type-II censored data were discussed. The results showed that the asymptotic expansions are related to the geometric quantities. We also illustrated the main results by studying the Rayleigh distribution. Note that more theoretical results and applications of information geometry in reliability in addition to the main results of this paper can be found in the Ph.D. thesis [23].

Author Contributions

Conceptualization, F.Z. and X.S.; Methodology, F.Z., X.S. and H.K.T.N.; Writing—original draft, F.Z.; Writing—review & editing, H.K.T.N. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China (No. 12071372, 11528102 and 11571282), the Fundamental Research Funds for the Central Universities (No. JBK2001001 and JBK1806002) of China. H. K. T. Ng’s work was supported by a grant from the Simons Foundation (#709773 to Tony Ng).

Acknowledgments

The authors sincerely thank the guest editor, Maria Longobardi, for the invitation to contribute an article and the three anonymous reviewers for their comments and suggestions which greatly improved this article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Amari, S. Differential-Geometrical Methods in Statistics; Lecture Notes in Statistics; Springer: New York, NY, USA, 1985; Volume 28. [Google Scholar]
  2. Amari, S.; Barndorff-Nielsen, O.E.; Kass, R.E.; Lauritzen, S.L.; Rao, C.R. Differential Geometry in Statistical Inference. IMS Lecture Notes: Monograph Series 10; Institute of Mathematical Statistics; IMS: Hayward, CA, USA, 1987. [Google Scholar]
  3. Komaki, F. On asymptotic properties of predictive distributions. Biometrika 1996, 83, 299–313. [Google Scholar] [CrossRef]
  4. Komaki, F. Asymptotic Properties of Bayesian Predictive Densities When the Distributions of Data and Target Variables are Different. Bayesian Anal. 2015, 10, 31–51. [Google Scholar] [CrossRef]
  5. Harsha, K.V.; Moosath, K.S.S. Dually flat geometries of the deformed exponential family. Phys. A 2015, 433, 136–147. [Google Scholar]
  6. Ng, H.K.T. Censoring Methodology. In International Encyclopedia of Statistical Science; Lovric, M., Ed.; Springer: Berlin/Heidelberg, Germany, 2010. [Google Scholar]
  7. Algarni, A.; Almarashi, A.M.; Okasha, H.; Ng, H.K.T. E-Bayesian Estimation of Chen distribution Based on Type-I Censoring Scheme. Entropy 2020, 22, 636. [Google Scholar] [CrossRef] [PubMed]
  8. Balakrishnan, N.; Aggarwala, R. Progressive Censoring: Theory, Methods, and Applications; Birkhäuser: Boston, MA, USA, 2000. [Google Scholar]
  9. Viveros, R.; Balakrishnan, N. Interval estimation of parameters of life from progressively censored data. Technometrics 1994, 36, 84–91. [Google Scholar] [CrossRef]
  10. Balakrishnan, N.; Cramer, E. The Art of Progressive Censoring. Applications to Reliability and Quality; Birkhäuser: New York, NY, USA, 2014. [Google Scholar]
  11. Balakrishnan, N.; Kundu, D. Hybrid censoring: Models, inferential results and applications. Comput. Stat. Data Anal. 2013, 57, 166–209. [Google Scholar] [CrossRef]
  12. Amari, S. Information Geometry and Its Applications; Springer: Tokyo, Japan, 2016. [Google Scholar]
  13. Zhang, F.D.; Ng, H.K.T.; Shi, Y.M.; Wang, R.B. Amari-Chentsov structure on the statistical manifold of models for accelerated life tests. TEST 2019, 28, 77–105. [Google Scholar] [CrossRef]
  14. Zhang, F.D.; Shi, Y.M.; Ng, H.K.T.; Wang, R.B. Information Geometry of Generalized Bayesian Prediction Using α-divergences as Loss Functions. IEEE Trans. Inf. Theory 2018, 64, 1812–1824. [Google Scholar] [CrossRef]
  15. Zhang, F.D.; Ng, H.K.T.; Shi, Y.M. Mis-specification analysis of Wiener degradation models by using f-divergence with outliers. Reliab. Eng. Syst. Saf. 2020, 195, 106751. [Google Scholar] [CrossRef]
  16. Barndorff-Nielsen, O.E. Information and Exponential Families in Statistical Theory; Wiley Series in Probability and Mathematical Statistics; John Wiley and Sons: Hoboken, NJ, USA, 1978. [Google Scholar]
  17. Kamps, U.; Cramer, E. On distributions of generalized order statistics. Statistics 2001, 35, 269–280. [Google Scholar] [CrossRef]
  18. Balakrishnan, N. Progressive censoring methodology: An appraisal (with discussions). Test 2007, 16, 211–296. [Google Scholar] [CrossRef]
  19. Geisser, S. Predictive Inference: An Introduction; Chapman and Hall: New York, NY, USA, 1993. [Google Scholar]
  20. Hartigan, J.A. The maximum likelihood prior. Ann. Stat. 1998, 26, 2083–2103. [Google Scholar] [CrossRef]
  21. Barndorff-Nielsen, O.E.; Cox, D.R. Inference and Asymptotics; Chapman and Hall: London, UK, 1994. [Google Scholar]
  22. Nelson, W. Applied Life Data Analysis; Wiley: New York, NY, USA, 1982. [Google Scholar]
  23. Zhang, F.D. On the Information Geometry and Tsallis Statistics in the Reliability Analysis and Its Applications. Ph.D. Thesis, Northwestern Polytechnical University, Xi’an, China, 2017. [Google Scholar]
Figure 1. The predicted density curves of the Rayleigh distribution obtained from the Bayesian prediction approach with the Jeffreys prior (BPJ), the Bayesian prediction approach with the uniform prior on the interval (0, 3) (BPU), and the plug-in prediction (PP) approach, based on the data presented in Table 2.
Table 1. Simulated biases and mean square errors (MSEs) of different prediction methods based on the Rayleigh distribution with λ = 2.

(m, n)     Scheme   PP f̂(x;λ̂)           BP f̂_πJ(x|x^R_m:n)    BP f̂_πI(x|x^R_m:n)
                    Bias      MSE        Bias      MSE           Bias      MSE
(10, 30)   R^1      0.045     0.032      0.026     0.027         0.038     0.035
           R^2      0.035     0.031      0.024     0.023         0.021     0.034
           R^3      0.034     0.032      0.025     0.026         0.013     0.037
(10, 35)   R^1      0.030     0.029      0.031     0.023         0.016     0.033
           R^2      0.025     0.027      0.020     0.022         0.028     0.031
           R^3      0.022     0.028      0.024     0.024         0.027     0.032
(15, 40)   R^1      0.023     0.019      0.021     0.022         0.022     0.025
           R^2      0.017     0.020      0.013     0.018         0.013     0.027
           R^3      0.024     0.021      0.021     0.019         0.023     0.023
(20, 40)   R^1      0.023     0.019      0.021     0.010         0.026     0.014
           R^2      0.018     0.014      0.015     0.008         0.024     0.015
           R^3      0.013     0.017      0.010     0.009         0.017     0.012
Table 2. Progressively Type-II censored sample of the times to breakdown of an insulating fluid tested at 34 kV, with n = 19 and m = 8, obtained from Viveros and Balakrishnan [9].

r          1      2      3      4      5      6      7      8
x_{r:n}    0.19   0.78   0.96   1.31   2.78   4.85   6.50   7.35
R_r        0      0      3      0      3      0      0      5