java.lang.Object
  org.metaqtl.algo.EMAlgorithm
public final class EMAlgorithm
This class defines methods to perform the EM-algorithm for univariate mixtures of Gaussians with known variances.
Field Summary | |
---|---|
static boolean | DO_SEM - If true, the best starting point is used to do Supplemental EM (SEM) in order to compute the DM matrix. |
static int | EM_CONTINUE - The EM convergence status: continue. |
static double | EM_ERR - The EM convergence tolerance. |
static int | EM_FAILURE - The EM convergence status: failure. |
static int | EM_ITER_MAX - The maximum number of EM iterations. |
static double | EM_MIN_DISTANCE - The minimal Mahalanobis distance between components. |
static int | EM_OK - The EM convergence status: success. |
static int | EM_START - The number of replicates for a run of the EM-algorithm. |
static double | TINY_Z_PROBA - The minimal value for the Z probabilities in EM clustering. |
Constructor Summary | |
---|---|
EMAlgorithm()
|
Method Summary | |
---|---|
static void | computeCOV(double[] sd, EMResult theta) - Computes the variance-covariance matrix of the mixture component estimates. |
static void | computeDM(double[] x, double[] sd, double[] mu, EMResult theta) - Computes the matrix of derivatives of the update function. |
static EMResult | doEM(double[] x, double[] sd, int k, EMResult spoint) - Applies the EM-algorithm to the given data set, where x is an array of observed values and sd is an array of the same size as x that stores the standard deviations of the observed values. |
static void | emRate(EMResult theta, double[] mu, double[] pi) - Computes the Euclidean distance between the new parameters and the old ones to obtain a simple approximation of the EM convergence rate. |
static void | eStep(double[] x, double[] sd, EMResult theta) - The Expectation step: it consists only in updating the Z matrix, i.e. the cluster membership probabilities. |
static void | initEM(double[] x, double[] sd, EMResult theta, java.util.Random rng) - Initializes the EM-algorithm by randomly assigning observations to the clusters and performing one M-step to compute the first values of the parameters. |
static int | iterate(double[] x, double[] sd, EMResult theta) - This method performs one iteration of the EM-algorithm. |
static int | mStep(double[] x, double[] sd, EMResult theta, double err) - The Maximization step: this step is straightforward, and simple analytical formulas are applied to obtain the new parameter estimates. |
static void | updateLoglikelihood(double[] x, double[] sd, EMResult theta) - Updates the log-likelihood value. |
static double[] | updateMuVector(double[] x, double[] sd, EMResult theta) - Updates the vector of mixture component means and returns the new values. |
static double[] | updatePiVector(EMResult theta) - Updates the vector of mixing proportions and returns the new values. |
static void | updateZMatrix(double[] x, double[] sd, EMResult theta) - Updates the Z matrix. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static int EM_START
public static int EM_ITER_MAX
public static double EM_ERR
public static double EM_MIN_DISTANCE
public static boolean DO_SEM
public static final double TINY_Z_PROBA
public static final int EM_OK
public static final int EM_CONTINUE
public static final int EM_FAILURE
Constructor Detail |
---|
public EMAlgorithm()
Method Detail |
---|
public static EMResult doEM(double[] x, double[] sd, int k, EMResult spoint)
Applies the EM-algorithm to the given data set, where x is an array of observed values and sd is an array of the same size as x that stores the standard deviations of the observed values. The underlying mixture model will have k distinct components, and its parameter estimates will be returned as an EMResult. If spoint is not null, the EM will start at the point specified by the mixture parameters of spoint. To control some behaviours of the EM-algorithm, use the public static fields of this class.
- Parameters:
x - the observed positions.
sd - the standard deviations of the observed positions.
k - the number of mixture components.
spoint - the starting point for the EM-algorithm.
- Returns:
- an EMResult from which the parameter estimates can be obtained.
- See Also:
EMResult
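The overall flow described for doEM (several replicate runs, an iteration cap, a convergence tolerance) can be sketched as below. This is a stand-alone illustration, not the org.metaqtl code: the class MultiStartEmSketch, its Fit holder and all method names are hypothetical, the parameters starts, iterMax and tol are only assumed to play the roles of EM_START, EM_ITER_MAX and EM_ERR, and the random start (means drawn among the observations) is a simplification of what initEM is documented to do.

```java
/** Stand-alone sketch with hypothetical names; not the org.metaqtl code. */
final class MultiStartEmSketch {

    /** Hypothetical result holder, unrelated to the real EMResult class. */
    static final class Fit {
        final double[] mu;    // component means
        final double[] pi;    // mixing proportions
        final double logLik;  // log-likelihood at (mu, pi)
        Fit(double[] mu, double[] pi, double logLik) {
            this.mu = mu; this.pi = pi; this.logLik = logLik;
        }
    }

    /** Gaussian density with mean mu and known standard deviation sd. */
    static double density(double x, double mu, double sd) {
        double u = (x - mu) / sd;
        return Math.exp(-0.5 * u * u) / (sd * Math.sqrt(2.0 * Math.PI));
    }

    /** Observed-data log-likelihood of the mixture. */
    static double logLik(double[] x, double[] sd, double[] mu, double[] pi) {
        double ll = 0.0;
        for (int i = 0; i < x.length; i++) {
            double s = 0.0;
            for (int j = 0; j < mu.length; j++) s += pi[j] * density(x[i], mu[j], sd[i]);
            ll += Math.log(s);
        }
        return ll;
    }

    /** One EM run from a random start (means drawn among the observations). */
    static Fit runOnce(double[] x, double[] sd, int k, int iterMax, double tol,
                       java.util.Random rng) {
        int n = x.length;
        double[] mu = new double[k];
        double[] pi = new double[k];
        for (int j = 0; j < k; j++) { mu[j] = x[rng.nextInt(n)]; pi[j] = 1.0 / k; }
        double ll = logLik(x, sd, mu, pi);
        for (int it = 0; it < iterMax; it++) {
            double[] num = new double[k], den = new double[k], cnt = new double[k];
            for (int i = 0; i < n; i++) {
                // E-step for observation i: posterior membership probabilities.
                double[] z = new double[k];
                double s = 0.0;
                for (int j = 0; j < k; j++) {
                    z[j] = pi[j] * density(x[i], mu[j], sd[i]);
                    s += z[j];
                }
                double w = 1.0 / (sd[i] * sd[i]);  // precision weight of observation i
                for (int j = 0; j < k; j++) {
                    double zij = z[j] / s;
                    num[j] += zij * w * x[i];
                    den[j] += zij * w;
                    cnt[j] += zij;
                }
            }
            // M-step: precision-weighted means and average memberships.
            for (int j = 0; j < k; j++) {
                if (den[j] > 0.0) mu[j] = num[j] / den[j];
                pi[j] = cnt[j] / n;
            }
            double next = logLik(x, sd, mu, pi);
            boolean converged = Math.abs(next - ll) < tol;
            ll = next;
            if (converged) break;
        }
        return new Fit(mu, pi, ll);
    }

    /** Runs EM from several random starts and keeps the best log-likelihood. */
    static Fit fit(double[] x, double[] sd, int k, int starts, int iterMax,
                   double tol, long seed) {
        java.util.Random rng = new java.util.Random(seed);
        Fit best = null;
        for (int s = 0; s < starts; s++) {
            Fit f = runOnce(x, sd, k, iterMax, tol, rng);
            if (best == null || Double.isNaN(best.logLik) || f.logLik > best.logLik) best = f;
        }
        return best;
    }
}
```

Keeping the run with the highest log-likelihood is the usual reason for replicating EM from several starting points, since a single run may stop at a poor local maximum.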
public static void initEM(double[] x, double[] sd, EMResult theta, java.util.Random rng)
Initializes the EM-algorithm by randomly assigning observations to the clusters and performing one M-step to compute the first values of the parameters.
- Parameters:
x - the observed values.
sd - the standard deviations of the observed values.
theta - the current parameter estimates.
public static int iterate(double[] x, double[] sd, EMResult theta)
This method performs one iteration of the EM-algorithm. If convergence is reached it returns EM_OK, otherwise it will return EM_CONTINUE.
In some cases it can happen, mainly due to numerical roundoff errors, that the likelihood value after the iteration is lower than the likelihood before the iteration, which in theory cannot happen. In this case the method will return EM_FAILURE.
- Parameters:
x - the observed values.
sd - the standard deviations of the observed values.
theta - the current parameter estimates.
- Returns:
- the status of the iteration.
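The three statuses returned by iterate can be illustrated with a small decision function. This is only a sketch under assumptions: the page does not say which quantity is compared with the tolerance, so the change in log-likelihood is assumed here, and the class EmStatusSketch, its method and the numeric values of its constants are hypothetical.

```java
/** Hypothetical status logic; constant values are illustrative only. */
final class EmStatusSketch {
    static final int OK = 0;        // stands in for EM_OK
    static final int CONTINUE = 1;  // stands in for EM_CONTINUE
    static final int FAILURE = 2;   // stands in for EM_FAILURE

    /**
     * Classifies one iteration from the log-likelihood before and after it.
     * A decrease is reported as FAILURE, because EM should never lower the
     * likelihood; a change smaller than tol is treated as convergence.
     */
    static int status(double oldLogLik, double newLogLik, double tol) {
        if (newLogLik < oldLogLik) return FAILURE;
        if (newLogLik - oldLogLik < tol) return OK;
        return CONTINUE;
    }
}
```

A caller would typically loop while the status is CONTINUE and stop on either of the other two values, up to a maximum number of iterations.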
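public static void eStep(double[] x, double[] sd, EMResult theta)
The Expectation step: it consists only in updating the Z matrix, i.e. the cluster membership probabilities.
- Parameters:
x - the observed values.
sd - the standard deviations of the observed values.
theta - the current parameter estimates.
- See Also:
updateZMatrix(double[], double[], EMResult)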
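For a Gaussian mixture in which each observation has its own known standard deviation, the membership probabilities are the normalized products of the mixing proportion and the component density. The sketch below shows that textbook computation; it is not taken from the org.metaqtl source, the class and method names are hypothetical, and it does not model how TINY_Z_PROBA is applied.

```java
/** Hypothetical stand-alone E-step; not the org.metaqtl implementation. */
final class EStepSketch {
    /**
     * Returns z[i][j], the posterior probability that observation i belongs
     * to component j, given means mu and mixing proportions pi.
     */
    static double[][] responsibilities(double[] x, double[] sd, double[] mu, double[] pi) {
        int n = x.length, k = mu.length;
        double[][] z = new double[n][k];
        for (int i = 0; i < n; i++) {
            double[] logW = new double[k];
            double max = Double.NEGATIVE_INFINITY;
            for (int j = 0; j < k; j++) {
                double u = (x[i] - mu[j]) / sd[i];
                // The factor 1 / (sd[i] * sqrt(2 * PI)) is the same for every
                // component of row i, so it cancels in the normalization.
                logW[j] = Math.log(pi[j]) - 0.5 * u * u;
                max = Math.max(max, logW[j]);
            }
            double sum = 0.0;
            for (int j = 0; j < k; j++) {
                z[i][j] = Math.exp(logW[j] - max);  // shift by max to avoid underflow
                sum += z[i][j];
            }
            for (int j = 0; j < k; j++) z[i][j] /= sum;
        }
        return z;
    }
}
```

Working with logarithms and subtracting the row maximum keeps the computation stable when an observation lies many standard deviations from every mean.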
public static int mStep(double[] x, double[] sd, EMResult theta, double err)
The Maximization step: this step is straightforward, and simple analytical formulas are applied to obtain the new parameter estimates. If convergence is reached it returns EM_OK, otherwise it will return EM_CONTINUE.
In some cases it can happen, mainly due to numerical roundoff errors, that the likelihood value after the maximization is lower than the likelihood before the iteration, which in theory cannot happen. In this case the method will return EM_FAILURE.
- Parameters:
x - the observed values.
sd - the standard deviations of the observed values.
theta - the current parameter estimates.
- See Also:
updateMuVector(double[], double[], EMResult), updatePiVector(EMResult), updateLoglikelihood(double[], double[], EMResult)
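With known variances, the analytical M-step formulas are short: each mixing proportion is the average membership probability of its component, and each mean is a membership- and precision-weighted average of the observations. The sketch below shows these standard formulas; the class MStepSketch and its methods are hypothetical and take the Z matrix directly, unlike the EMResult-based methods documented here.

```java
/** Hypothetical stand-alone M-step; not the org.metaqtl implementation. */
final class MStepSketch {
    /** pi[j] = (1/n) * sum_i z[i][j]. */
    static double[] updatePi(double[][] z) {
        int n = z.length, k = z[0].length;
        double[] pi = new double[k];
        for (double[] row : z) {
            for (int j = 0; j < k; j++) pi[j] += row[j];
        }
        for (int j = 0; j < k; j++) pi[j] /= n;
        return pi;
    }

    /** mu[j] = sum_i (z[i][j] * x[i] / sd[i]^2) / sum_i (z[i][j] / sd[i]^2). */
    static double[] updateMu(double[] x, double[] sd, double[][] z) {
        int n = x.length, k = z[0].length;
        double[] mu = new double[k];
        for (int j = 0; j < k; j++) {
            double num = 0.0, den = 0.0;
            for (int i = 0; i < n; i++) {
                double w = z[i][j] / (sd[i] * sd[i]);  // membership times precision
                num += w * x[i];
                den += w;
            }
            // An empty component (den == 0) yields NaN; callers must handle it.
            mu[j] = num / den;
        }
        return mu;
    }
}
```

Observations with a small standard deviation pull the mean more strongly, which is the main difference from the equal-variance case.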
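public static void emRate(EMResult theta, double[] mu, double[] pi)
Computes the Euclidean distance between the new parameters and the old ones to obtain a simple approximation of the EM convergence rate.
- Parameters:
theta - the current parameter estimates.
mu - the last estimates of the means.
pi - the last estimates of the mixing proportions.
public static double[] updateMuVector(double[] x, double[] sd, EMResult theta)
Updates the vector of mixture component means and returns the new values.
- Parameters:
x - the observed values.
sd - the standard deviations of the observed values.
theta - the current parameter estimates.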
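The distance used by emRate can be sketched as the Euclidean norm of the change in the stacked parameter vector (means followed by mixing proportions). How the class turns this distance into a rate is not stated on this page; the ratio of successive step lengths shown in rate is a common rough estimate and is an assumption here, as are the class and method names.

```java
/** Hypothetical stand-alone helper; not the org.metaqtl implementation. */
final class EmRateSketch {
    /** Euclidean distance between (muNew, piNew) and (muOld, piOld). */
    static double distance(double[] muNew, double[] piNew,
                           double[] muOld, double[] piOld) {
        double s = 0.0;
        for (int j = 0; j < muNew.length; j++) {
            double dMu = muNew[j] - muOld[j];
            double dPi = piNew[j] - piOld[j];
            s += dMu * dMu + dPi * dPi;
        }
        return Math.sqrt(s);
    }

    /**
     * Assumed rate estimate: ratio of the current step length to the
     * previous one. Values close to 1 indicate slow convergence.
     */
    static double rate(double previousStep, double currentStep) {
        return previousStep > 0.0 ? currentStep / previousStep : Double.NaN;
    }
}
```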
public static double[] updatePiVector(EMResult theta)
Updates the vector of mixing proportions and returns the new values.
- Parameters:
theta - the current parameter estimates.
public static void updateZMatrix(double[] x, double[] sd, EMResult theta)
Updates the Z matrix.
- Parameters:
x - the observed values.
sd - the standard deviations of the observed values.
theta - the current parameter estimates.
public static void updateLoglikelihood(double[] x, double[] sd, EMResult theta)
Updates the log-likelihood value.
- Parameters:
x - the observed values.
sd - the standard deviations of the observed values.
theta - the current parameter estimates.
public static void computeDM(double[] x, double[] sd, double[] mu, EMResult theta)
Computes the matrix of derivatives of the update function.
- Parameters:
x - the data points.
sd - the standard deviations of the data points.
mu - the previous estimate of the means.
theta - the current parameter estimates.
public static void computeCOV(double[] sd, EMResult theta)
Computes the variance-covariance matrix of the mixture component estimates. If DO_SEM has been set to false, the observed information won't be computed and the variance-covariance matrix will be obtained by using only the complete information.
- Parameters:
sd - the standard deviations.
theta - the maximum likelihood estimate.
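One plausible reading of "using only the complete information" is sketched below for the component means. With known standard deviations, the complete-data information for the mean of component j is the sum over observations of z[i][j] / sd[i]^2, and its inverse gives an approximate variance. This is an assumption about what computeCOV does, not a description of it: the sketch ignores the mixing proportions and the correction that the DM matrix would supply under SEM, and all names are hypothetical.

```java
/** Hypothetical stand-alone helper; not the org.metaqtl implementation. */
final class CompleteInfoVarianceSketch {
    /**
     * Approximate variance of each component mean from the complete-data
     * information only: var[j] = 1 / sum_i (z[i][j] / sd[i]^2).
     * Off-diagonal terms between means are zero in the complete-data model.
     */
    static double[] meanVariances(double[] sd, double[][] z) {
        int n = sd.length, k = z[0].length;
        double[] var = new double[k];
        for (int j = 0; j < k; j++) {
            double info = 0.0;
            for (int i = 0; i < n; i++) info += z[i][j] / (sd[i] * sd[i]);
            var[j] = 1.0 / info;  // infinite when the component is empty
        }
        return var;
    }
}
```

Because the uncertainty about cluster membership is left out, variances obtained this way tend to be smaller than those based on the observed information.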