A program in which the means follow a DP mixture and the variance follows a gamma distribution
Package ‘dpmixsim’
February 19, 2015
Version 0.0-8
Date 2012-07-24
Title Dirichlet Process Mixture model simulation for clustering and image segmentation
Author Adelino Ferreira da Silva
Maintainer Adelino Ferreira da Silva
Depends R (>= 2.10.0), oro.nifti, cluster
Description The package implements a Dirichlet Process Mixture (DPM) model for clustering and image segmentation. The DPM model is a Bayesian nonparametric methodology that relies on MCMC simulations for exploring mixture models with an unknown number of components. The code implements conjugate models with normal structure (conjugate normal-normal DP mixture model). The package's applications are oriented towards the classification of magnetic resonance images according to tissue type or region of interest.
License GPL (>= 2)
Repository CRAN
Date/Publication 2012-07-25 06:29:31
NeedsCompilation yes
R topics documented:
dpmixsim
galaxy
postdataseg
postdpmixciz
postimgclgrp
postimgcomps
postkcluster
premask
prescale
readsliceimg
t1_pn3_rf0_slice_0092.Rd
t1_pn3_rf0_slice_0092_mask.Rd
dpmixsim
Dirichlet Process mixture model for clustering and image segmentation
Description
dpmixsim implements a Dirichlet Process mixture (DPM) model. The DPM model is a Bayesian nonparametric methodology that relies on MCMC simulations for exploring mixture models with an unknown number of components. The function implements conjugate models with normal structure (conjugate normal-normal DP mixture model).
Usage
dpmixsim(x, M=1, a=1, b=1, upalpha=1, a0=2, b0=2, maxiter=4000, rec=3000, fsave=NA, kmax=30, nclinit=NA, minvar=0.001)
Arguments
x         scaled input data as a vector in the range {0,1}
M         DP precision hyperparameter
a         Gamma prior hyperparameter
b         Gamma prior hyperparameter
upalpha   logical variable for simulations with {automatic, fixed} calibration of the precision hyperparameter M (default=‘TRUE’)
a0        Gamma prior hyperparameter for M (default 2)
b0        Gamma prior hyperparameter for M (default 2)
maxiter   maximum number of MCMC iteration steps
rec       record the last ‘rec’ iteration steps
fsave     filename for saving the MCMC simulation (default: ‘NULL’, do not save)
kmax      maximum number of clusters in the simulation (default 30)
nclinit   number of initial clusters to use at the beginning of the simulation. If not specified (NA), the number of initial clusters is equal to the length of x (one element per cluster) (default: NA)
minvar    minimum value admissible for a cluster variance (default=0.001). Decreasing ‘minvar’ may improve resolution (distribution fitness), but increases the maximum number of admissible clusters (‘kmax’). In this case, you may have to increase ‘kmax’ as well.
Details
Consider n observations x_1, ..., x_n, which we regard as exchangeable. We model the distribution from which the x_i are drawn as a mixture of distributions. Dirichlet process mixture models are based on Dirichlet process priors for the primary parameters θ_i. DP mixture models assume that the prior distribution function G itself is uncertain, drawn from a Dirichlet process G ∼ DP(M G_0), with base prior G_0 and precision parameter M. This specification may be expressed by the hierarchical model:
x_i ∼ N(· | θ_i, σ^2)
θ_i ∼ G
G ∼ DP(M N(0, 1))
σ^{-2} ∼ Gamma(a, b)
Value
simulation output as a list of draws containing:
krec      cluster indicator variables
wrec      cluster weights
phirec    theta cluster parameters
varrec    sigma cluster parameters
Author(s)
Adelino Ferreira da Silva, Universidade Nova de Lisboa, Faculdade de Ciencias e Tecnologia, Portugal.
References
Adelino Ferreira da Silva, A Dirichlet process mixture model for brain MRI tissue classification, Medical Image Analysis 11 (2007) 169-182.
Adelino Ferreira da Silva, Bayesian mixture models of variable dimension for image segmentation, Comput. Methods Programs Biomed. 94 (2009) 1-14.
See Also
readsliceimg, postdataseg, postdpmixciz, postimgclgrp, postimgcomps, postkcluster, premask
Examples
## Not run:
## Example 1: simple test using galaxy data
data("galaxy")
x0 <- galaxy$speed
maxiter <- ...; rec <- 3000; ngrid <- ...
res <- dpmixsim(..., a=1, b=0.1, upalpha=1, maxiter=maxiter, rec=rec)
z <- postdpmixciz(..., res=res, rec=rec, ngrid=ngrid, plot=T)
##
res <- dpmixsim(...)
z <- postdpmixciz(...)
##-----------------
## Example 2:
demo(testMarronWand)
##-----------------
## Example 3: MRI segmentation
## Testing note: this example should reproduce the equivalent segmented
## images used in the author's references
slicedata <- readsliceimg(...)
image(slicedata$niislice, col=gray((0:255)/256), main="original image")
x0 <- premask(...)
res <- dpmixsim(..., rec=rec, nclinit=8, minvar=0.002)
## post-simulation
ngrid <- ...
z <- postdpmixciz(...)
x0 <- premask(...)
cx <- postdataseg(...)
postimgclgrp(slicedata$mask, cx, palcolor=FALSE)
cat("*** display all clusters:\n")
postimgcomps(slicedata$mask, cx)
cat("*** re-cluster with 4 clusters:\n")
postkcluster(slicedata$mask, cx, clk=4)
## End(Not run)
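The hierarchical model in the Details of dpmixsim can be illustrated by simulating from it. The following is a minimal sketch in base R, not the package's sampler: it draws the θ_i through the Polya urn (Chinese restaurant) representation of the Dirichlet process. The function name rdpm_sketch is hypothetical, Gamma(a, b) is assumed to mean shape a and rate b, and a single σ^2 is shared by all observations, as the model is written there.

```r
# Sketch: draw n observations from
#   x_i ~ N(theta_i, sigma^2), theta_i ~ G, G ~ DP(M N(0,1)), sigma^-2 ~ Gamma(a, b)
rdpm_sketch <- function(n, M = 1, a = 1, b = 1) {
  # Assumption: Gamma(a, b) uses shape a and rate b
  sigma2 <- 1 / rgamma(1, shape = a, rate = b)
  theta <- numeric(n)
  theta[1] <- rnorm(1)                          # first draw comes from G0 = N(0, 1)
  for (i in seq_len(n)[-1]) {
    if (runif(1) < M / (M + i - 1)) {
      theta[i] <- rnorm(1)                      # new cluster: fresh draw from G0
    } else {
      theta[i] <- theta[sample.int(i - 1, 1)]   # join the cluster of an earlier draw
    }
  }
  x <- rnorm(n, mean = theta, sd = sqrt(sigma2))
  list(x = x, theta = theta, sigma2 = sigma2, k = length(unique(theta)))
}

set.seed(1)
sim <- rdpm_sketch(200, M = 1, a = 1, b = 1)
sim$k   # number of distinct clusters in this draw
```

Larger values of M make a new cluster more likely at each step, which is why M acts as the precision hyperparameter controlling the expected number of components.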
galaxy Galaxy velocities
Description
This data set considers physical information on velocities (km/second) for 82 galaxies reported by Roeder (1990). These are drawn from six well-separated conic sections of the Corona Borealis region.
Usage
data(galaxy)
Format
A data frame with 82 observations on the following variable.
speed     a numeric vector giving the speed of galaxies (km/second)
Source
Roeder, K. (1990) Density estimation with confidence sets exemplified by superclusters and voids in the galaxies, Journal of the American Statistical Association, 85: 617-624.
References
Escobar, M.D. and West, M. (1995) Bayesian Density Estimation and Inference Using Mixtures. Journal of the American Statistical Association, 90: 577-588.
Examples
data(galaxy)
## maybe str(galaxy); plot(galaxy) ...
postdataseg Data segmentation
Description
postdataseg performs data segmentation based on labelled cluster estimates.
Usage
postdataseg(x, z, ngrid, dbg=FALSE)
Arguments
x         full-sized scaled image data prepared by premask
z         cluster labels produced by postdpmixciz
ngrid     dimension of the grid used in estimation
dbg       logical variable to show debugging output (default=‘FALSE’)
Details
Once the distributions of the indicator variables z_i are calculated, we can separate the components of the mixture. Individual components are selected according to the most probable z_i value in a given region of the distributional space, leading to a partition of this space into regions. Intensity threshold values are associated with the partition of the distributional space to drive the image segmentation. In brief, the partition of the distributional space induced by the z values is used to segment the data space. From a computational point of view, the use of these two separate spaces enables us to optimize the MCMC implementation.
Value
cx        vector of image cluster values
Author(s)
A. Ferreira da Silva, Universidade Nova de Lisboa, Faculdade de Ciencias e Tecnologia.
See Also
dpmixsim, readsliceimg, premask, postdpmixciz
Examples
## Not run:
## see Example 2 in dpmixsim.
## End(Not run)
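The mapping described in the Details of postdataseg, from a partition of the distributional space to labels in the data space, can be sketched as follows. This is an illustration under stated assumptions rather than the package's implementation: z is assumed to hold one cluster label per cell of an equally spaced grid over [0, 1], and segment_by_grid together with the toy values is hypothetical.

```r
# Sketch: label each scaled observation with the cluster of the grid cell it falls in
segment_by_grid <- function(x, z, ngrid = length(z)) {
  stopifnot(length(z) == ngrid, all(x >= 0 & x <= 1))
  # index of the grid cell containing each observation (x = 1 goes to the last cell)
  cell <- pmin(floor(x * ngrid) + 1, ngrid)
  z[cell]
}

z <- rep(1:3, times = c(4, 3, 3))          # hypothetical labels for 10 grid cells
x <- c(0.05, 0.38, 0.45, 0.90, 1.00)       # hypothetical scaled intensities
cx <- segment_by_grid(x, z)

# intensity thresholds implied by the partition: points where the label changes
thresholds <- which(diff(z) != 0) / length(z)
```

Because the labels are estimated on a fixed grid, the simulation can run on a reduced data set while the segmentation is applied to the full-sized data afterwards.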
postdpmixciz Summary statistics and cluster estimation
Description
postdpmixciz computes post-simulation summary statistics, and estimates cluster partition.
Usage
postdpmixciz(x, res, kmax=30, rec=300, ngrid=200, plot=TRUE)
Arguments
x         data used in the simulation
kmax      maximum number of clusters
res       output of the MCMC simulation
rec       number of recorded iteration steps
ngrid     dimension of the grid used in density estimation
plot      logical variable to omit plots (default=‘TRUE’)
Value
z         cluster partition estimation
Author(s)
A. Ferreira da Silva, Universidade Nova de Lisboa, Faculdade de Ciencias e Tecnologia.
References
Adelino Ferreira da Silva, A Dirichlet process mixture model for brain MRI tissue classification, Medical Image Analysis 11 (2007) 169-182.
Adelino Ferreira da Silva, Bayesian mixture models of variable dimension for image segmentation, Comput. Methods Programs Biomed. 94 (2009) 1-14.
See Also
dpmixsim
Examples
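## Not run:
## Example: MRI brain image segmentation
slicedata <- readsliceimg(...)
image(slicedata$niislice, col=gray((0:255)/256), main="original image")
x0 <- premask(...)
res <- dpmixsim(..., rec=rec, nclinit=8)
## post-simulation
ngrid <- ...
z <- postdpmixciz(...)
## End(Not run)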
postimgclgrp Segment image with the estimated number of components
Description
postimgclgrp displays the segmented image with the estimated number of components.
Usage
postimgclgrp(mask, cx, palcolor=TRUE)
Arguments
mask      full-sized scaled image data prepared by premask
cx        data segmentation prepared by postdataseg
palcolor  logical variable for selecting colored/grey image visualization (default=‘TRUE’)
Details
Display image segmentation with the estimated number of components.
Author(s)
A. Ferreira da Silva, Universidade Nova de Lisboa, Faculdade de Ciencias e Tecnologia.
References
Adelino Ferreira da Silva, A Dirichlet process mixture model for brain MRI tissue classification, Medical Image Analysis 11 (2007) 169-182.
Adelino Ferreira da Silva, Bayesian mixture models of variable dimension for image segmentation, Comput. Methods Programs Biomed. 94 (2009) 1-14.
See Also
dpmixsim, readsliceimg, premask, postdpmixciz, postdataseg
Examples
## Not run:
## see Examples in dpmixsim.
## End(Not run)
postimgcomps Display cluster components
Description
postimgcomps displays the components of the segmented image with the estimated number of components.
Usage
postimgcomps(mask, cx)
Arguments
mask      scaled masked full-sized image data prepared by premask
cx        data segmentation prepared by postdataseg
Details
Display components based on the estimated number of clusters.
Author(s)
A. Ferreira da Silva, Universidade Nova de Lisboa, Faculdade de Ciencias e Tecnologia.
References
Adelino Ferreira da Silva, A Dirichlet process mixture model for brain MRI tissue classification, Medical Image Analysis 11 (2007) 169-182.
Adelino Ferreira da Silva, Bayesian mixture models of variable dimension for image segmentation, Comput. Methods Programs Biomed. 94 (2009) 1-14.
See Also
dpmixsim, readsliceimg, premask, postdpmixciz, postdataseg, postimgclgrp
Examples
## Not run:
## see Examples in dpmixsim.
## End(Not run)
postkcluster Segmentation with a fixed number of clusters
Description
postkcluster re-clusters the data with a user-specified number of components, and displays the segmented image.
Usage
postkcluster(mask, cx, clk=4, plot=TRUE)
Arguments
mask      masked full-sized image data prepared by premask
cx        data segmentation prepared by postdataseg
clk       desired fixed number of components, including the background component, to use in the data segmentation; default ‘clk=4’: gray matter (GM), white matter (WM), CSF, and background
plot      logical variable; enables suspension of output images (default=‘TRUE’)
Details
Partitioning clustering around medoids (PAM) is applied to the classes simulated from dpmixsim as a post-processing step. This procedure may be applied to merge clusters and reduce the number of clusters to the specified value ‘clk’. postkcluster computes a clara object using cluster (see Struyf et al.), a list representing a clustering of the data into ‘clk’ clusters.
Author(s)
Adelino Ferreira da Silva, Universidade Nova de Lisboa, Faculdade de Ciencias e Tecnologia, Portugal.
References
Adelino Ferreira da Silva, A Dirichlet process mixture model for brain MRI tissue classification, Medical Image Analysis 11 (2007) 169-182.
Adelino Ferreira da Silva, Bayesian mixture models of variable dimension for image segmentation, Comput. Methods Programs Biomed. 94 (2009) 1-14.
Anja Struyf, Mia Hubert & Peter J. Rousseeuw (1996): Clustering in an Object-Oriented Environment. Journal of Statistical Software, 1. http://www.stat.ucla.edu/journals/jss/
See Also
dpmixsim, readsliceimg, premask, postdpmixciz, postdataseg, postimgcomps
Examples
## Not run:
## see Examples in dpmixsim.
## End(Not run)
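The medoid-based merging step described in the Details of postkcluster can be illustrated with clara() from the cluster package, which the package lists under Depends. This is a hedged sketch on made-up data, not the body of postkcluster: the six cluster centres, the noise level and the variable names are assumptions chosen only to show how a larger number of estimated classes is reduced to ‘clk’ groups.

```r
library(cluster)   # provides clara() and pam()

set.seed(42)
# hypothetical segmentation output: intensities scattered around six cluster centres
centres <- c(0.05, 0.20, 0.35, 0.55, 0.75, 0.95)
cx <- rep(centres, each = 50) + rnorm(300, sd = 0.01)

clk <- 4
# medoid-based clustering of the one-dimensional values into clk groups
fit <- clara(matrix(cx, ncol = 1), k = clk)

table(fit$clustering)   # sizes of the merged clusters
fit$medoids             # one representative intensity per cluster
```

With ‘clk=4’ the six simulated classes are forced into four groups, mirroring the default of GM, WM, CSF and background.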
premask Data preparation
Description
premask applies a pre-defined mask to an MRI slice in order to select regions of interest (ROIs) for processing.
Usage
premask(slicedata, subsamp=TRUE)
Arguments
slicedata list as output by readsliceimg
subsamp   logical variable; if ‘TRUE’, an image downsampled by a factor of 2 is used in the MCMC simulation, otherwise the full-sized image is taken. After parameter estimation, the full-sized image should be used for clustering and image segmentation. The use of downsampled images can substantially reduce runtime, with little quality degradation.
Value
xv        processed data vector
Author(s)
A. Ferreira da Silva, Universidade Nova de Lisboa, Faculdade de Ciencias e Tecnologia.
See Also
dpmixsim, readsliceimg
Examples
## Not run:
slicedata <- readsliceimg(...)
## End(Not run)
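The effect of the ‘subsamp’ argument of premask can be pictured with a small base R sketch. It assumes that downsampling by a factor of 2 means keeping every second row and column and that the mask is non-zero inside the region of interest; premask_sketch and the toy matrices are hypothetical and do not reproduce the package's code.

```r
# Sketch: apply a mask to an image matrix, optionally downsampling by a factor of 2
premask_sketch <- function(img, mask, subsamp = TRUE) {
  stopifnot(identical(dim(img), dim(mask)))
  if (subsamp) {
    rows <- seq(1, nrow(img), by = 2)     # keep every second row
    cols <- seq(1, ncol(img), by = 2)     # keep every second column
    img  <- img[rows, cols, drop = FALSE]
    mask <- mask[rows, cols, drop = FALSE]
  }
  img[mask > 0]                           # intensities inside the region of interest
}

img  <- matrix(seq_len(24), nrow = 4)     # toy 4 x 6 "slice"
mask <- matrix(0, nrow = 4, ncol = 6)
mask[2:3, 2:5] <- 1                       # toy rectangular ROI

xv_half <- premask_sketch(img, mask, subsamp = TRUE)    # used for the MCMC simulation
xv_full <- premask_sketch(img, mask, subsamp = FALSE)   # used for the final segmentation
```

In this toy case the downsampled vector holds 2 of the 8 masked pixels, which is where the runtime saving comes from.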
prescale Data preparation
Description
prescale scales data to be in the range {0,1}, as a preparation for simulation.
Usage
prescale(xv)
Arguments
xv        unscaled data vector
Value
x         scaled data vector
Author(s)
A. Ferreira da Silva, Universidade Nova de Lisboa, Faculdade de Ciencias e Tecnologia.
See Also
dpmixsim, readsliceimg
Examples
## Not run:
data("galaxy")
x0 <- galaxy$speed
## End(Not run)
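One common way to bring a data vector into the range {0,1} is min-max scaling. The sketch below assumes that this is what is meant here; prescale itself may use a different rule, and both the function name prescale_sketch and the input values are hypothetical.

```r
# Sketch: min-max scaling of a numeric vector to [0, 1]
prescale_sketch <- function(xv) {
  rng <- range(xv, na.rm = TRUE)
  if (rng[1] == rng[2]) stop("cannot scale a constant vector")
  (xv - rng[1]) / (rng[2] - rng[1])
}

x <- prescale_sketch(c(10, 15, 20, 30))   # made-up values
range(x)                                  # 0 1
```

Scaling matters because the base prior in the model is fixed at N(0, 1), so the data need to live on a comparable scale before simulation.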
readsliceimg Read MRI slice data
Description
readsliceimg reads MRI and mask data.
Usage
readsliceimg(fbase="t1_pn3_rf0",swap=FALSE)
Arguments
fbase     indicates the dataset prefix of the MRI dataset to use. The prefix applies to data files ‘{fbase}_slice_0092.nii.gz’ and ‘{fbase}_slice_0092_mask.nii.gz’. These data files were obtained from the BrainWeb repository of the McConnell Brain Imaging Center at the Montreal Neurological Institute. BrainWeb anatomical models use MRI slices of dimension 181x217 pixels. The datasets included in the package for demonstration correspond to a T1 BrainWeb image for slice number 92, with 3% noise and 0% intensity non-uniformity.
swap      logical variable (default=‘FALSE’) for choosing the right/left data display convention consistent with FSLVIEW
Details
The FSL tools may be used to prepare the MRI data and the mask required as data input. The package oro.nifti is used for reading gzipped NIFTI files.
Value
a list containing
fbase     dataset prefix of the dataset used in the analysis
niislice  slice data at all timepoints
mask      slice mask
nrow      number of rows
ncol      number of columns
swap      relative orientation used in the data setup
Author(s)
A. Ferreira da Silva, Universidade Nova de Lisboa, Faculdade de Ciencias e Tecnologia.
References
Brandon Whitcher, Volker Schmid and Andrew Thornton, Package oro.nifti: Rigorous - NIfTI Input / Output, 2010.
FSL/FEAT Analysis tool, FMRIB Software Library (FSL) (www.fmrib.ox.ac.uk/fsl)
See Also
dpmixsim
Examples
## Not run:
slicedata <- readsliceimg(...)
## End(Not run)
t1_pn3_rf0_slice_0092.Rd
Example of a pre-processed MRI slice from the BrainWeb database
Description
The file ‘t1_pn3_rf0_slice_0092.nii.gz’ is a pre-processed image of slice ‘92’ with ‘3%’ noise extracted from the BrainWeb database file ‘t1_icbm_normal_1mm_pn3_rf0\[1\].mnc.gz’. BrainWeb simulations are based on an anatomical model of normal brain, which can serve as the ground truth for any analysis procedure. BrainWeb datasets are provided by the McConnell Brain Imaging Center at the Montreal Neurological Institute, http://www.bic.mni.mcgill.ca/ (see Collins et al. 1998).
Format
The file ‘t1_pn3_rf0_slice_0092.nii.gz’ is in gzipped NIFTI format. The R package oro.nifti is required to read gzipped NIFTI files.
References
D.L. Collins, et al., Design and construction of a realistic digital brain phantom, IEEE Trans. on Medical Imaging 17 (3) (1998) 463-468.
S.M. Smith, et al., Advances in Functional and Structural MR Image Analysis and Implementation as FSL, NeuroImage, 23(S1): 208-219, 2004.
Brandon Whitcher, Volker Schmid and Andrew Thornton, Package oro.nifti: Rigorous - NIfTI Input / Output, 2010.
t1_pn3_rf0_slice_0092_mask.Rd
Mask file for MRI slice
Description
The ‘t1_pn3_rf0_slice_0092_mask.nii.gz’ defines the mask for ‘t1_pn3_rf0_slice_0092.nii.gz’, as used in the examples. The mask used here is an all-brain mask; it just removes non-brain regions, as the result of applying a brain extraction tool to the specified dataset. Other masks may be defined to select regions of interest (ROIs).
Format
The file ‘t1_pn3_rf0_slice_0092_mask.nii.gz’ is in gzipped NIFTI format. The R package oro.nifti is required to read gzipped NIFTI files.
References
D. L. Collins, A. P. Zijdenbos, V. Kollokian, J. G. Sled, N. J. Kabani, C. J. Holmes, A. C. Evans, Design and construction of a realistic digital brain phantom, IEEE Trans. on Medical Imaging 17 (3) (1998) 463-468.
S.M. Smith, et al., Advances in Functional and Structural MR Image Analysis and Implementation as FSL, NeuroImage, 23(S1): 208-219, 2004.
Brandon Whitcher, Volker Schmid and Andrew Thornton, Package oro.nifti: Rigorous - NIfTI Input / Output, 2010.