org.apache.spark.mllib.clustering

GaussianMixture

class GaussianMixture extends Serializable

This class performs expectation maximization for multivariate Gaussian Mixture Models (GMMs). A GMM represents a composite distribution of independent Gaussian distributions, each with an associated "mixing" weight that specifies its contribution to the composite.

Given a set of sample points, this class will maximize the log-likelihood for a mixture of k Gaussians, iterating until the log-likelihood changes by less than convergenceTol or until the maximum number of iterations is reached. While this process is generally guaranteed to converge, it is not guaranteed to find a global optimum.
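
The stopping rule described above can be illustrated without Spark. The following is a minimal sketch of expectation maximization for a one-dimensional mixture; it is not Spark's implementation. The names EmSketch, Mixture, step, fit, and demoData are hypothetical, as are the synthetic data and the variance floor.

```scala
import scala.util.Random

object EmSketch {
  // Hypothetical container for a 1-D mixture: one weight, mean, and variance per component.
  final case class Mixture(weights: Array[Double], means: Array[Double], variances: Array[Double])

  // Density of a one-dimensional Gaussian.
  def pdf(x: Double, mean: Double, variance: Double): Double =
    math.exp(-0.5 * (x - mean) * (x - mean) / variance) / math.sqrt(2.0 * math.Pi * variance)

  // Log-likelihood of the samples under the mixture.
  def logLikelihood(xs: Array[Double], m: Mixture): Double =
    xs.map { x =>
      math.log(m.weights.indices.map(j => m.weights(j) * pdf(x, m.means(j), m.variances(j))).sum)
    }.sum

  // One EM iteration: the E-step computes responsibilities, the M-step re-estimates parameters.
  def step(xs: Array[Double], m: Mixture): Mixture = {
    val k = m.weights.length
    val resp: Array[Array[Double]] = xs.map { x =>
      val p = Array.tabulate(k)(j => m.weights(j) * pdf(x, m.means(j), m.variances(j)))
      val total = p.sum
      p.map(_ / total)
    }
    val nk = Array.tabulate(k)(j => resp.map(_(j)).sum)
    val means = Array.tabulate(k)(j => xs.indices.map(i => resp(i)(j) * xs(i)).sum / nk(j))
    val variances = Array.tabulate(k) { j =>
      val v = xs.indices.map(i => resp(i)(j) * math.pow(xs(i) - means(j), 2)).sum / nk(j)
      math.max(v, 1e-6) // assumed floor to keep the variance strictly positive
    }
    Mixture(nk.map(_ / xs.length), means, variances)
  }

  // Iterate until the log-likelihood changes by less than convergenceTol
  // or maxIterations is reached; returns the model and the iterations used.
  def fit(xs: Array[Double], init: Mixture, convergenceTol: Double, maxIterations: Int): (Mixture, Int) = {
    var model = init
    var ll = logLikelihood(xs, model)
    var iter = 0
    var converged = false
    while (iter < maxIterations && !converged) {
      model = step(xs, model)
      val next = logLikelihood(xs, model)
      converged = math.abs(next - ll) < convergenceTol
      ll = next
      iter += 1
    }
    (model, iter)
  }

  // Synthetic samples drawn around 0 and around 5 (illustrative only).
  def demoData(seed: Long): Array[Double] = {
    val rng = new Random(seed)
    Array.fill(200)(rng.nextGaussian()) ++ Array.fill(200)(5.0 + rng.nextGaussian())
  }

  def main(args: Array[String]): Unit = {
    val xs = demoData(1L)
    val init = Mixture(Array(0.5, 0.5), Array(xs.min, xs.max), Array(1.0, 1.0))
    val (model, iterations) = fit(xs, init, convergenceTol = 0.01, maxIterations = 100)
    println(s"iterations = $iterations")
    println(s"weights = ${model.weights.mkString(", ")}")
    println(s"means = ${model.means.mkString(", ")}")
  }
}
```

Because EM only climbs to a nearby local optimum, a different starting Mixture may yield a different result on less well-separated data.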

Annotations
@Since( "1.3.0" )
Note

This algorithm is limited in its number of features because it must store a covariance matrix whose size is quadratic in the number of features. Even when the number of features is within this limit, the algorithm may perform poorly on high-dimensional data, for two reasons: (a) high-dimensional data is difficult to cluster at all (based on statistical/theoretical arguments), and (b) Gaussian distributions are prone to numerical issues in high dimensions.
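
To make the quadratic growth concrete, here is a rough back-of-the-envelope sketch. It assumes each of the k components holds one dense d x d covariance matrix of 8-byte doubles and ignores any other overhead; the object and method names are hypothetical and the figures are not measurements of Spark's actual memory use.

```scala
object CovarianceFootprint {
  // Bytes for k dense (numFeatures x numFeatures) matrices of 8-byte doubles.
  def covarianceBytes(numFeatures: Long, k: Long): Long =
    k * numFeatures * numFeatures * 8L

  def main(args: Array[String]): Unit = {
    for (d <- Seq(10L, 100L, 1000L, 10000L)) {
      val megabytes = covarianceBytes(d, 2L) / 1e6
      println(s"d = $d, k = 2: about $megabytes MB of covariance entries")
    }
  }
}
```

Under these assumptions, 10,000 features with k = 2 already require 1.6 GB for the covariance entries alone, since 2 x 10,000 x 10,000 x 8 bytes = 1,600,000,000 bytes.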

Linear Supertypes
Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new GaussianMixture()

    Constructs a default instance. The default parameters are {k: 2, convergenceTol: 0.01, maxIterations: 100, seed: random}.

    Annotations
    @Since( "1.3.0" )
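
As a quick check of the documented defaults, the getters can be read from a freshly constructed instance. This is a small sketch that assumes the Spark MLlib artifact is on the classpath; the object name GaussianMixtureDefaults is hypothetical.

```scala
import org.apache.spark.mllib.clustering.GaussianMixture

object GaussianMixtureDefaults {
  def main(args: Array[String]): Unit = {
    // No SparkContext is needed just to construct and inspect the estimator.
    val gm = new GaussianMixture()
    println(s"k = ${gm.getK}")
    println(s"convergenceTol = ${gm.getConvergenceTol}")
    println(s"maxIterations = ${gm.getMaxIterations}")
    println(s"seed = ${gm.getSeed}") // random by default, so it differs between instances
    println(s"initialModel = ${gm.getInitialModel}")
  }
}
```

Since the default seed is random, call setSeed explicitly when reproducible results matter.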

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  10. def getConvergenceTol: Double

    Return the largest change in log-likelihood at which convergence is considered to have occurred.

    Annotations
    @Since( "1.3.0" )
  11. def getInitialModel: Option[GaussianMixtureModel]

    Return the user-supplied initial GMM, if one was provided.

    Annotations
    @Since( "1.3.0" )
  12. def getK: Int

    Return the number of Gaussians in the mixture model.

    Annotations
    @Since( "1.3.0" )
  13. def getMaxIterations: Int

    Return the maximum number of iterations allowed.

    Annotations
    @Since( "1.3.0" )
  14. def getSeed: Long

    Return the random seed.

    Annotations
    @Since( "1.3.0" )
  15. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  16. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  17. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  18. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  19. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  20. def run(data: JavaRDD[Vector]): GaussianMixtureModel

    Java-friendly version of run().

    Annotations
    @Since( "1.3.0" )
  21. def run(data: RDD[Vector]): GaussianMixtureModel

    Perform expectation maximization.

    Annotations
    @Since( "1.3.0" )
  22. def setConvergenceTol(convergenceTol: Double): GaussianMixture.this.type

    Set the largest change in log-likelihood at which convergence is considered to have occurred.

    Annotations
    @Since( "1.3.0" )
  23. def setInitialModel(model: GaussianMixtureModel): GaussianMixture.this.type

    Set the initial GMM starting point, bypassing the random initialization. You must call setK() before calling this method, and the condition (model.k == this.k) must hold; otherwise an IllegalArgumentException is thrown.

    Annotations
    @Since( "1.3.0" )
  24. def setK(k: Int): GaussianMixture.this.type

    Set the number of Gaussians in the mixture model. Default: 2

    Annotations
    @Since( "1.3.0" )
  25. def setMaxIterations(maxIterations: Int): GaussianMixture.this.type

    Set the maximum number of iterations allowed. Default: 100

    Annotations
    @Since( "1.3.0" )
  26. def setSeed(seed: Long): GaussianMixture.this.type

    Set the random seed.

    Annotations
    @Since( "1.3.0" )
  27. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  28. def toString(): String
    Definition Classes
    AnyRef → Any
  29. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  30. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
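
The setters and run() listed above can be combined as in the following sketch. It assumes a local Spark installation with MLlib on the classpath; the object name, the helpers syntheticPoints, initialGuess, and fit, and the toy data are all hypothetical and chosen only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.{GaussianMixture, GaussianMixtureModel}
import org.apache.spark.mllib.linalg.{Matrices, Vector, Vectors}
import org.apache.spark.mllib.stat.distribution.MultivariateGaussian
import org.apache.spark.rdd.RDD
import scala.util.Random

object GaussianMixtureUsage {
  // Hypothetical helper: n synthetic 2-D points around (0, 0) and n around (5, 5).
  def syntheticPoints(n: Int, seed: Long): Seq[Vector] = {
    val rng = new Random(seed)
    Seq.fill(n)(Vectors.dense(rng.nextGaussian(), rng.nextGaussian())) ++
      Seq.fill(n)(Vectors.dense(5.0 + rng.nextGaussian(), 5.0 + rng.nextGaussian()))
  }

  // Hypothetical starting point: two equally weighted components with identity covariance.
  def initialGuess(): GaussianMixtureModel = {
    val identity = Matrices.dense(2, 2, Array(1.0, 0.0, 0.0, 1.0))
    new GaussianMixtureModel(
      Array(0.5, 0.5),
      Array(
        new MultivariateGaussian(Vectors.dense(0.0, 0.0), identity),
        new MultivariateGaussian(Vectors.dense(4.0, 4.0), identity)))
  }

  // setK() comes before setInitialModel() so that model.k == this.k holds.
  def fit(data: RDD[Vector]): GaussianMixtureModel =
    new GaussianMixture()
      .setK(2)
      .setInitialModel(initialGuess())
      .setConvergenceTol(0.01)
      .setMaxIterations(100)
      .setSeed(42L)
      .run(data)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("GaussianMixtureUsage").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val data = sc.parallelize(syntheticPoints(200, seed = 7L)).cache()
      val model = fit(data)
      for (i <- 0 until model.k) {
        println(s"weight = ${model.weights(i)}, mu = ${model.gaussians(i).mu}")
      }
      // A model with 2 components is rejected when k is 3.
      try {
        new GaussianMixture().setK(3).setInitialModel(initialGuess())
      } catch {
        case e: IllegalArgumentException => println(s"rejected: ${e.getMessage}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

Omitting the setInitialModel() call falls back to the random initialization, in which case the seed determines the starting point.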
