# Estimation of Priors in Bayesian Nonparametric Statistics


Chapter 1 Introduction

1.1 Bayesian Nonparametrics
Bayesian nonparametrics (BNP) is a relatively young yet fast-growing field of statistics, especially compared with its two cousins, frequentist nonparametrics and classical parametric Bayes. Indeed, the phrase "Bayesian nonparametrics" is still an ambiguous and not clearly defined name. Generally speaking, statisticians regard research topics beyond frequentist nonparametrics and parametric Bayes as the territory of BNP. One forms a relatively clear impression by contrasting BNP with these two classical research fields. Compared with frequentist nonparametrics, BNP assumes an extra Bayesian structure and draws statistical inference from posterior or predictive distributions, which links it naturally with loss functions and decision theory. On the other hand, unlike parametric Bayes, BNP assigns observations a nonparametric sampling distribution (an unknown likelihood) and therefore works with an infinite-dimensional parameter space rather than the finite-dimensional Euclidean parameter space of parametric Bayes. Theoretically, it is complicated to identify this infinite-dimensional space and to place a proper prior distribution on it, and it is challenging to verify the asymptotic properties of the resulting procedures and to design the associated computation and simulation algorithms, for example when simulating nonparametric density curves or surfaces from a complicated posterior distribution. Although the infinite-dimensional parameter introduces many challenges, BNP as a novel modelling approach still attracts great attention because of its desirable advantages. On one hand, it avoids dependence on parametric assumptions and thus makes models more robust. On the other hand, it embeds parametric models into a larger nonparametric framework, for example by taking a given parametric model as a prior guess for a nonparametric model. These points are illustrated again in the following paragraphs.
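As a concrete illustration of a nonparametric prior, the Dirichlet process can be simulated through Sethuraman's stick-breaking representation. The following is a minimal sketch (the truncation level and parameter values are illustrative choices, not taken from this thesis):

```python
import numpy as np

def stick_breaking_weights(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a Dirichlet process:
    V_k ~ Beta(1, alpha), w_k = V_k * prod_{j<k} (1 - V_j)."""
    betas = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    return betas * remaining

rng = np.random.default_rng(0)
alpha = 2.0                                 # precision parameter of the DP
weights = stick_breaking_weights(alpha, 1000, rng)
atoms = rng.normal(0.0, 1.0, size=1000)     # atoms drawn from the base measure N(0, 1)
# (weights, atoms) define one realization of a random discrete
# distribution G ~ DP(alpha, N(0, 1)); the truncated weights sum to
# nearly 1 when enough atoms are kept.
```

Each run of this sketch produces one random probability measure, which is exactly the sense in which the "parameter" of a BNP model is an infinite-dimensional object.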
…………

1.2 Model Assumption and Empirical Bayes Method
The standard parametric Bayesian analysis implies that priors should be specified in advance, before the statistical inference. To illustrate this sampling structure, we introduce some mathematical notation to help understand Bayesian analysis. Bayesian analysts consider the parameters in the sampling distribution as random variables and assign certain distributions to them. Before going further into the BNP assumptions in the empirical Bayes framework, we give a brief review of empirical Bayes. Historically, empirical Bayes (EB for short) was first suggested by Robbins (1956, 1964, 1983) and then developed by many statisticians, notably Efron and Morris (1972, 1973), Morris (1983), Casella (1985), Berger (1985), Maritz and Lwin (1989), and Carlin and Louis (2000). While EB for parametric models has been fruitfully investigated for many decades, it appears that only limited attention has been paid to empirical nonparametric Bayes. The exact time at which BNP and empirical Bayes were first combined is hard to pin down; to our knowledge, Korwar and Hollander (1973) contributed an eminent result by providing EB estimates of two parameters of the Dirichlet process prior, and other similar achievements include, but are not limited to, Berry and Christensen (1979), Zehnwirth (1979, 1981), and McAuliffe et al. (2006), among others.
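The parametric EB idea in the Robbins and Efron–Morris line of work can be sketched with the textbook normal–normal model: latent means drawn from an unknown prior, prior parameters estimated from the marginal distribution of the data, then plugged back in. The numbers below are illustrative, not from this thesis:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 1.0                          # known sampling standard deviation
mu_true, tau_true = 3.0, 2.0         # prior parameters, unknown to the analyst

theta = rng.normal(mu_true, tau_true, size=500)   # theta_i ~ N(mu, tau^2)
x = rng.normal(theta, sigma)                      # X_i | theta_i ~ N(theta_i, sigma^2)

# Marginally X_i ~ N(mu, tau^2 + sigma^2), so moment estimates of the
# prior parameters come straight from the observed sample:
mu_hat = x.mean()
tau2_hat = max(x.var(ddof=1) - sigma**2, 0.0)

# Plug-in posterior means shrink each observation toward mu_hat:
shrink = tau2_hat / (tau2_hat + sigma**2)
theta_hat = mu_hat + shrink * (x - mu_hat)
```

The "empirical" step is that `mu_hat` and `tau2_hat` are estimated from the data rather than specified in advance; the resulting shrinkage estimator typically beats the raw observations in mean squared error.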
…………

Chapter 2 A New Sufficient Condition for Identifiability of Countably Infinite Mixtures

2.1 Introduction
For a collection F of parametric distributions indexed by a parameter space Θ, identifiability of F means that different elements of Θ identify different distributions in F. This notion is of fundamental importance in statistical inference tasks such as parameter estimation and hypothesis testing for F; the parameter estimation problem would not make sense without identifiability and, technically, identifiability is one of the preconditions in proving consistency of the maximum likelihood estimate, posterior consistency, and weak convergence of probability measures; see, e.g., Doob (1949), Reiersøl (1950), Chandra (1977) and DasGupta (2008). In the case, typical in empirical Bayesian analysis, where F is constructed by mixing a family of distributions with a family of mixing distributions on Θ, identifiability translates to the property that different mixing distributions give rise to different mixture distributions, and is referred to as identifiability of mixture distributions.
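A standard cautionary example of why identifiability can fail for mixtures is the two-component Bernoulli mixture: its distribution depends on the parameters only through a single number, so genuinely different parameter choices can produce the same mixture. A minimal illustration (exact arithmetic via `fractions` to rule out rounding):

```python
from fractions import Fraction

def bernoulli_mixture_pmf(w, p1, p2):
    """pmf of w*Bernoulli(p1) + (1-w)*Bernoulli(p2), as (P(X=0), P(X=1))."""
    p_one = w * p1 + (1 - w) * p2
    return (1 - p_one, p_one)

# Two genuinely different parameter triples (w, p1, p2) ...
a = bernoulli_mixture_pmf(Fraction(1, 2), Fraction(1, 4), Fraction(3, 4))
b = bernoulli_mixture_pmf(Fraction(1, 3), Fraction(1, 2), Fraction(1, 2))
# ... yield exactly the same mixture distribution, so the mixing
# distribution cannot be recovered from the mixture: non-identifiable.
```

Here both parameter triples give P(X=1) = 1/2, so no amount of data can distinguish them; this is the failure mode that conditions such as those studied in this chapter are designed to exclude.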
…………

2.2 Definitions and Notations
Although identifiability for finite mixtures has been examined by a great number of authors (see, for example, Al-Hussaini and Ahmad (1981), Atienza et al. (2006), Chandra (1977), Henna (1994), Holzmann et al. (2004), Holzmann et al. (2006a), Holzmann et al. (2006b), Shamilov and Asma (2009), Teicher (1961, 1963), and Yakowitz and Spragins (1968) for affirmative results, and Ahmad and Al-Hussaini (1982) for negative results), general discussion of countably infinite mixtures is quite limited; examples are Patil and Bildekar (1966) and Tallis (1969). Patil and Bildekar (1966) explored identifiability of countably infinite mixtures of discrete probability distributions in terms of infinite-dimensional matrices. Tallis (1969) established a necessary and sufficient condition for identifiability, which also requires checking whether a certain type of countably infinite-dimensional matrix is reciprocal. It is noteworthy that the infinite-dimensional matrix approach is generally rather difficult to apply, so that Tallis (1969) treated only three examples. This chapter presents a sufficient condition expressed through a well-ordering on the set of distributions and uniform convergence of series, which is easier to check than the infinite-dimensional matrix conditions of Tallis (1969). The power of this new method is demonstrated by a set of examples.
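For orientation, the classical ordering idea behind Teicher-style sufficient conditions for finite mixtures can be probed numerically: a family is ordered by declaring F to precede G when the ratio of their transforms (e.g., moment generating functions) tends to zero. The sketch below is a crude single-point probe for normal families on the log scale; the helper names and the cutoff are hypothetical, and this is the classical finite-mixture condition, not the well-ordering condition developed in this chapter:

```python
def log_normal_mgf(mu, sigma2, t):
    """Logarithm of the moment generating function of N(mu, sigma2):
    mu*t + sigma2*t^2/2."""
    return mu * t + 0.5 * sigma2 * t * t

def teicher_precedes(f, g, t=40.0):
    """Crude numeric probe of the ordering: f = (mu, sigma2) precedes g
    if MGF_g(t) / MGF_f(t) -> 0 as t -> infinity, checked at one large t
    on the log scale to avoid overflow."""
    log_ratio = log_normal_mgf(g[0], g[1], t) - log_normal_mgf(f[0], f[1], t)
    return log_ratio < -20.0

# Larger variance, or equal variance with larger mean, comes first:
print(teicher_precedes((0.0, 2.0), (0.0, 1.0)))   # True
print(teicher_precedes((1.0, 1.0), (0.0, 1.0)))   # True
```

For N(μ, σ²) the log-MGF is quadratic in t, so the log-ratio diverges to minus infinity exactly when σ²_g < σ²_f, or the variances are equal and μ_g < μ_f, which is why a single large t suffices in this sketch.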
……

3 Estimation of Dirichlet Process Priors
3.1 Introduction
3.2 Model
3.3 Estimation of Precision Parameter a
3.4 Estimation of Probability Measure a
3.5 Simulation
3.6 Conclusion
4 On Parameter Estimation of Polya Tree Priors with Multigroup Data
4.1 Introduction
4.2 Polya Tree Priors and Data Structure
4.3 Empirical Estimation of Polya Tree Priors
4.4 Simulations
4.5 Discussion
5 On Parameter Estimation of Multivariate Polya Tree Priors
5.1 Introduction
5.2 Multivariate Polya Tree Prior and Model Assumption
5.2.1 Multivariate Polya tree priors
5.2.2 Data structure
5.3 Empirical Estimation of Multivariate Polya Tree Priors
5.4 Simulations
5.5 Discussion

Chapter 5 On Parameter Estimation of Multivariate Polya Tree Priors with Multigroup Data

5.1 Introduction
This chapter discusses the problem of estimating the prior parameters of a multivariate Polya tree prior from observed samples. To our knowledge, there is a large literature on the theory of the univariate Polya tree prior (and even the more general univariate tail-free prior), given by Ferguson (1974), Mauldin et al. (1992), Lavine (1992, 1994), Paddock et al. (2003), Yang and Wu (2013b), and so forth. An attractive theoretical merit of the Polya tree prior is that, under suitable assumptions, its support is the set of distributions absolutely continuous with respect to some σ-finite measure, which equips it with desirable modelling flexibility. With its additional analytical tractability, Polya trees are widely applied by statisticians; see, e.g., the papers of Walker and Mallick (1997, 1999), Berger and Guglielmi (2001), Hanson and Johnson (2002), Mallick and Walker (2003), etc. …………
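To make the object under discussion concrete, a finite-level (univariate) Polya tree prior can be simulated by recursively splitting mass with Beta draws down a dyadic partition tree. The sketch below uses the common choice Beta(c·m², c·m²) at level m, which is the standard specification associated with absolute continuity of the random distribution; the function name and parameter values are illustrative:

```python
import numpy as np

def sample_polya_tree_probs(levels, c, rng):
    """Sample dyadic-interval probabilities from a finite Polya tree prior.

    At level m, every node splits its mass between its two children with
    an independent Beta(c*m^2, c*m^2) draw; after `levels` splits we get
    the probabilities of the 2**levels dyadic subintervals."""
    probs = np.array([1.0])
    for m in range(1, levels + 1):
        left = rng.beta(c * m * m, c * m * m, size=probs.size)
        # Interleave left-child and right-child masses at the next level.
        probs = np.column_stack((probs * left, probs * (1 - left))).ravel()
    return probs

rng = np.random.default_rng(2)
p = sample_polya_tree_probs(levels=5, c=1.0, rng=rng)
# p holds 32 nonnegative interval probabilities summing to 1: one draw
# of a (discretized) random distribution from the Polya tree prior.
```

The Beta parameters c·m² are exactly the quantities an empirical Bayes analysis of Polya tree priors seeks to estimate from data rather than fix in advance; the multivariate case of this chapter generalizes the partition tree beyond dyadic intervals of the line.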

This chapter is organized as follows. We give the definition and several theoretical properties of multivariate Polya tree priors, along with the data structure and model assumptions, in Section 5.2. We derive the moment estimate and the maximum likelihood estimate of the prior parameters of Polya tree priors and study the associated theoretical properties in Section 5.3. Furthermore, Section 5.4 provides a simulation example to validate the numerical performance of our empirical Bayes estimates, and Section 5.5 briefly summarizes our comments and discusses potential research directions.
……………
Reference (omitted)  