API Reference

Utility functions

Functions shared by all algorithms.

Cluster.init_centroids - Method
init_centroids(X::Matrix{Float64}, K::Int64, mode::Symbol)

Initializes centroids for the clustering algorithm based on the specified mode.

Input

  • X::Matrix{Float64}: Data matrix where rows are data points and columns are features.
  • K::Int64: Number of clusters.
  • mode::Symbol: Initialization mode, either :random or :kmeanspp.

Output

  • Returns a matrix of initialized centroid coordinates.

Algorithm

  1. If mode is :random:
    • Randomly select K data points from X as initial centroids.
  2. If mode is :kmeanspp:
    • Initialize the first centroid randomly.
    • For each subsequent centroid:
      a. Compute the distance from each data point to its nearest already-chosen centroid.
      b. Select the next centroid from the data points with probability proportional to the squared distance.
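The :kmeanspp steps above can be sketched in plain Julia as follows. This is a minimal illustration, not the package's implementation: the name kmeanspp_sketch and the rng keyword are hypothetical, and squared Euclidean distance is assumed.

```julia
using Random

# Hypothetical helper (not part of Cluster): k-means++ seeding on the rows of X.
function kmeanspp_sketch(X::Matrix{Float64}, K::Int; rng=Random.default_rng())
    n = size(X, 1)
    chosen = [rand(rng, 1:n)]                  # first centroid: a uniformly random row
    # d2[i] = squared distance from row i to its nearest chosen centroid
    d2 = [sum(abs2, X[i, :] .- X[chosen[1], :]) for i in 1:n]
    while length(chosen) < K
        # draw the next row index with probability proportional to d2
        r = rand(rng) * sum(d2)
        idx = findfirst(>=(r), cumsum(d2))
        idx === nothing && (idx = n)           # guard against rounding at the upper end
        push!(chosen, idx)
        # refresh the nearest-centroid distances against the new centroid
        for i in 1:n
            d2[i] = min(d2[i], sum(abs2, X[i, :] .- X[idx, :]))
        end
    end
    return X[chosen, :]
end

X = rand(Random.MersenneTwister(1), 100, 2)
C = kmeanspp_sketch(X, 3; rng=Random.MersenneTwister(2))
size(C)  # (3, 2)
```

Every returned centroid is a row of X, which matches the :random mode in that both modes pick actual data points as starting centroids.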

Examples

julia> X = rand(100, 2)
julia> centroids = init_centroids(X, 3, :kmeanspp)
3×2 Matrix{Float64}:
 0.386814  0.619566
 0.170768  0.0176449
 0.38688   0.398064

KMeans / KMeans++ Clustering Algorithm

  • Initializes centroids using either random selection or KMeans++.
  • Iteratively assigns points to the nearest centroid.
  • Updates centroids based on the mean of assigned points.
  • Stops when centroids converge or after a maximum number of iterations.

Cluster.KMeans - Method
KMeans(; k::Int=3, mode::Symbol=:kmeanspp, max_try::Int=100, tol::Float64=1e-4)

Constructor for the KMeans struct.

Input

  • k::Int: Number of clusters (default: 3).
  • mode::Symbol: Initialization mode, either :random or :kmeanspp (default: :kmeanspp).
  • max_try::Int: Maximum number of iterations (default: 100).
  • tol::Float64: Tolerance for convergence (default: 1e-4).

Output

  • Returns an instance of KMeans.

Examples

julia> model = KMeans(k=3, mode=:kmeanspp, max_try=100, tol=1e-4)
KMeans(3, :kmeanspp, 100, 0.0001, Matrix{Float64}(undef, 0, 0), Int64[])

Cluster.KMeans - Type
mutable struct KMeans

A mutable struct for the KMeans clustering algorithm.

Fields

  • k::Int: Number of clusters.
  • mode::Symbol: Initialization mode, either :random or :kmeanspp.
  • max_try::Int: Maximum number of iterations.
  • tol::Float64: Tolerance for convergence.
  • centroids::Array{Float64,2}: Matrix of centroid coordinates.
  • labels::Array{Int,1}: Vector of labels for each data point.

Examples

julia> model = KMeans(k=3, mode=:kmeanspp, max_try=100, tol=1e-4)
KMeans(3, :kmeanspp, 100, 0.0001, Matrix{Float64}(undef, 0, 0), Int64[])

Cluster.assign_center - Method
assign_center(D::Matrix{Float64})

Assigns each data point to the nearest centroid based on the distance matrix D.

Input

  • D::Matrix{Float64}: Distance matrix where element (i, j) is the distance between the i-th data point and the j-th centroid.

Output

  • Returns a vector of labels where each element is the index of the nearest centroid for the corresponding data point.
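The assignment rule can be illustrated with a row-wise argmin. This is a sketch only; assign_sketch is a hypothetical name, not the package's function.

```julia
# Hypothetical helper (not part of Cluster): label each row of D with the
# column index of its smallest entry, i.e. the nearest centroid.
assign_sketch(D::Matrix{Float64}) = [argmin(D[i, :]) for i in 1:size(D, 1)]

D = [0.2 0.9 0.5;
     0.7 0.1 0.4;
     0.6 0.8 0.3]
labels = assign_sketch(D)  # [1, 2, 3]
```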

Examples

julia> D = rand(100, 3)
julia> labels = assign_center(D)

Cluster.compute_distance - Method
compute_distance(X::Matrix{Float64}, centroids::Matrix{Float64})

Computes the distance between each data point in X and each centroid.

Input

  • X::Matrix{Float64}: Data matrix where rows are data points and columns are features.
  • centroids::Matrix{Float64}: Matrix of centroid coordinates.

Output

  • Returns a distance matrix where element (i, j) is the distance between the i-th data point and the j-th centroid.
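The docstring does not say which metric is used. Assuming plain Euclidean distance, the (i, j) layout of the result can be sketched as follows (distance_sketch is a hypothetical name):

```julia
# Hypothetical helper (not part of Cluster); Euclidean distance is an assumption.
function distance_sketch(X::Matrix{Float64}, centroids::Matrix{Float64})
    n, k = size(X, 1), size(centroids, 1)
    D = Matrix{Float64}(undef, n, k)
    for j in 1:k, i in 1:n
        # row i of X against row j of centroids
        D[i, j] = sqrt(sum(abs2, X[i, :] .- centroids[j, :]))
    end
    return D
end

X = [0.0 0.0; 3.0 4.0]
centroids = [0.0 0.0; 3.0 0.0]
D = distance_sketch(X, centroids)  # [0.0 3.0; 5.0 4.0]
```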

Examples

julia> X = rand(100, 2)
julia> centroids = rand(3, 2)
julia> D = compute_distance(X, centroids)
100×3 Matrix{Float64}:
 0.181333  0.539578  0.306867
 0.754863  0.48797   0.562147
 0.205116  0.360735  0.127107
 0.154926  0.552747  0.323433
 ⋮
 0.434321  0.321914  0.261909
 0.773258  0.291669  0.513668
 0.607547  0.310411  0.38714

Cluster.fit! - Method
fit!(model::KMeans, X::Matrix{Float64})

Fits the KMeans model to the data matrix X.

Input

  • model::KMeans: An instance of KMeans.
  • X::Matrix{Float64}: Data matrix where rows are data points and columns are features.

Output

  • Modifies the model in-place to fit the data.

Algorithm

  1. Initialize centroids.
  2. Iterate up to max_try times:
    a. Compute distances between data points and centroids.
    b. Assign each data point to the nearest centroid.
    c. Update centroids based on the mean of assigned data points.
    d. Check for convergence based on tol.
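The loop above can be sketched as a self-contained function. This is an illustration under stated assumptions, not the package's fit!: lloyd_sketch is a hypothetical name, squared Euclidean distance is assumed, and the convergence test (largest coordinate shift below tol) is a guess at what "based on tol" means.

```julia
using Statistics

# Hypothetical, self-contained Lloyd iteration (not part of Cluster).
function lloyd_sketch(X::Matrix{Float64}, centroids::Matrix{Float64};
                      max_try::Int=100, tol::Float64=1e-4)
    C = copy(centroids)
    labels = zeros(Int, size(X, 1))
    for _ in 1:max_try
        # a./b. assign each row to its nearest centroid
        for i in axes(X, 1)
            labels[i] = argmin([sum(abs2, X[i, :] .- C[j, :]) for j in axes(C, 1)])
        end
        # c. move each centroid to the mean of its points; empty clusters stay put
        newC = copy(C)
        for j in axes(C, 1)
            members = findall(==(j), labels)
            isempty(members) || (newC[j, :] = vec(mean(X[members, :], dims=1)))
        end
        # d. assumed convergence test: largest coordinate shift below tol
        shift = maximum(abs.(newC .- C))
        C = newC
        shift < tol && break
    end
    return C, labels
end

X = [0.0 0.0; 0.0 1.0; 10.0 10.0; 10.0 11.0]
C, labels = lloyd_sketch(X, [0.0 0.0; 10.0 10.0])
# C == [0.0 0.5; 10.0 10.5], labels == [1, 1, 2, 2]
```

Unlike fit!, the sketch returns its results instead of storing them in a model.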

Examples

julia> model = KMeans(k=3)
julia> X = rand(100, 2)
julia> fit!(model, X)

Cluster.predict - Method
predict(model::KMeans, X::Matrix{Float64})

Predicts the cluster labels for new data points based on the fitted model.

Input

  • model::KMeans: An instance of KMeans.
  • X::Matrix{Float64}: Data matrix where rows are data points and columns are features.

Output

  • Returns a vector of predicted labels for each data point.

Examples

julia> model = KMeans(k=3)
julia> X_train = rand(100, 2)
julia> fit!(model, X_train)
julia> X_test = rand(10, 2)
julia> labels = predict(model, X_test)

Cluster.update_centroids - Method
update_centroids(X::Matrix{Float64}, label_vector::Vector{Int64}, model::KMeans)

Updates the centroids based on the current assignment of data points to centroids.

Input

  • X::Matrix{Float64}: Data matrix where rows are data points and columns are features.
  • label_vector::Vector{Int64}: Vector of labels for each data point.
  • model::KMeans: An instance of KMeans.

Output

  • Returns a matrix of updated centroid coordinates.
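For KMeans, the fit! algorithm describes the update as the mean of the assigned points. A minimal sketch of that rule follows; update_sketch is a hypothetical name that takes k directly instead of a model, and it assumes every label in 1..k occurs at least once.

```julia
using Statistics

# Hypothetical helper (not part of Cluster): row j of the result is the
# column-wise mean of the rows of X labelled j.
function update_sketch(X::Matrix{Float64}, labels::Vector{Int}, k::Int)
    return reduce(vcat, [mean(X[labels .== j, :], dims=1) for j in 1:k])
end

X = [1.0 1.0; 3.0 3.0; 10.0 0.0]
centroids = update_sketch(X, [1, 1, 2], 2)  # [2.0 2.0; 10.0 0.0]
```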

Examples

julia> X = rand(100, 2)
julia> labels = rand(1:3, 100)
julia> model = KMeans(k=3)
julia> centroids = update_centroids(X, labels, model)

Bisecting KMeans Clustering Algorithm

  • Starts with a single cluster containing all data points.
  • Repeatedly splits the cluster with the highest sum of squared errors (SSE) until k clusters are obtained.
  • Uses standard KMeans for cluster splitting.

Cluster.BKMeans - Method
BKMeans(; k::Int=3, kmeans::KMeans=KMeans(k=2, mode=:kmeanspp))

Constructor for the BKMeans struct.

Input

  • k::Int: Number of clusters (default: 3).
  • kmeans::KMeans: An instance of the KMeans struct used for bisecting (default: KMeans(k=2, mode=:kmeanspp)).

Output

  • Returns an instance of BKMeans.

Examples

julia> kmeans_model = KMeans(k=2, mode=:kmeanspp)
julia> model = BKMeans(k=3, kmeans=kmeans_model)
BKMeans(3, KMeans(2, :kmeanspp, 100, 0.0001, Matrix{Float64}(undef, 0, 0), Int64[]), Int64[], Matrix{Float64}(undef, 0, 0))

Cluster.BKMeans - Type
mutable struct BKMeans

A mutable struct for the Bisecting KMeans clustering algorithm.

Fields

  • k::Int: Number of clusters.
  • kmeans::KMeans: An instance of the KMeans struct used for bisecting.
  • labels::Array{Int,1}: Vector of labels for each data point.
  • centroids::Matrix{Float64}: Matrix of centroid coordinates.

Examples

julia> kmeans_model = KMeans(k=2, mode=:kmeanspp)
julia> model = BKMeans(k=3, kmeans=kmeans_model)
BKMeans(3, KMeans(2, :kmeanspp, 100, 0.0001, Matrix{Float64}(undef, 0, 0), Int64[]), Int64[], Matrix{Float64}(undef, 0, 0))

Cluster.fit! - Method
fit!(model::BKMeans, X::Matrix{Float64})

Fits the BKMeans model to the data matrix X.

Input

  • model::BKMeans: An instance of BKMeans.
  • X::Matrix{Float64}: Data matrix where rows are data points and columns are features.

Output

  • Modifies the model in-place to fit the data.

Algorithm

  1. Initialize clusters with the entire dataset.
  2. While the number of clusters is less than k:
    a. Compute the sum of squared errors (SSE) for each cluster.
    b. Select the cluster with the highest SSE.
    c. Apply KMeans to bisect the selected cluster.
    d. Replace the selected cluster with the two resulting clusters.
  3. Assign labels and centroids based on the final clusters.
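The bisecting loop can be sketched without the package as shown below. All names are hypothetical. In particular, two_means is a simplified stand-in for the KMeans(k=2) model, seeded deterministically with the first and last rows of the cluster so that the example is reproducible, and SSE is taken around each cluster's mean.

```julia
using Statistics

# Sum of squared errors of the rows of X around their mean.
sse(X::Matrix{Float64}) = sum(abs2, X .- mean(X, dims=1))

# Hypothetical stand-in for KMeans(k=2): a few Lloyd steps seeded with
# the first and last rows of the cluster.
function two_means(X::Matrix{Float64}; iters::Int=20)
    C = X[[1, size(X, 1)], :]
    labels = ones(Int, size(X, 1))
    for _ in 1:iters
        labels = [argmin([sum(abs2, X[i, :] .- C[j, :]) for j in 1:2]) for i in axes(X, 1)]
        for j in 1:2
            any(labels .== j) && (C[j, :] = vec(mean(X[labels .== j, :], dims=1)))
        end
    end
    return labels
end

# Bisecting loop: each cluster is stored as a vector of row indices into X.
function bisect_sketch(X::Matrix{Float64}, k::Int)
    clusters = [collect(axes(X, 1))]
    while length(clusters) < k
        worst = argmax([sse(X[idx, :]) for idx in clusters])  # highest-SSE cluster
        idx = clusters[worst]
        halves = two_means(X[idx, :])
        deleteat!(clusters, worst)                            # replace it by its two halves
        push!(clusters, idx[halves .== 1], idx[halves .== 2])
    end
    labels = zeros(Int, size(X, 1))
    for (c, idx) in enumerate(clusters)
        labels[idx] .= c
    end
    return labels
end

X = [0.0 0.0; 0.0 1.0; 5.0 5.0; 5.0 6.0; 20.0 20.0; 20.0 21.0]
labels = bisect_sketch(X, 3)  # rows 1-2, 3-4 and 5-6 end up in three separate clusters
```

Label numbers depend on the order in which clusters are split, so only the grouping is meaningful here.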

Examples

julia> kmeans_model = KMeans(k=2, mode=:kmeanspp)
julia> model = BKMeans(k=3, kmeans=kmeans_model)
julia> X = rand(100, 2)
julia> fit!(model, X)

Cluster.predict - Method
predict(model::BKMeans, X::Matrix{Float64})

Predicts the cluster labels for new data points based on the fitted BKMeans model.

Input

  • model::BKMeans: An instance of BKMeans.
  • X::Matrix{Float64}: Data matrix where rows are data points and columns are features.

Output

  • Returns a vector of predicted labels for each data point.

Examples

julia> kmeans_model = KMeans(k=2, mode=:kmeanspp)
julia> model = BKMeans(k=3, kmeans=kmeans_model)
julia> X_train = rand(100, 2)
julia> fit!(model, X_train)
julia> X_test = rand(10, 2)
julia> labels = predict(model, X_test)

Distributional Clustering Method

Cluster.DC - Method
DC(; k::Int=3, mode::Symbol=:kmeanspp, max_try::Int=100, tol::Float64=1e-4)

Constructor for the DC struct.

Input

  • k::Int: Number of clusters (default: 3).
  • mode::Symbol: Initialization mode, either :random or :kmeanspp (default: :kmeanspp).
  • max_try::Int: Maximum number of iterations (default: 100).
  • tol::Float64: Tolerance for convergence (default: 1e-4).

Output

  • Returns an instance of DC.

Examples

julia> model = DC(k=3, mode=:kmeanspp, max_try=100, tol=1e-4)
DC(3, :kmeanspp, 100, 0.0001, Matrix{Float64}(undef, 0, 0), Int64[])

Cluster.DC - Type
mutable struct DC

A mutable struct for the Distributional Clustering (DC) algorithm.

Fields

  • k::Int: Number of clusters.
  • mode::Symbol: Initialization mode, either :random or :kmeanspp.
  • max_try::Int: Maximum number of iterations.
  • tol::Float64: Tolerance for convergence.
  • centroids::Array{Float64,2}: Matrix of centroid coordinates.
  • labels::Array{Int,1}: Vector of labels for each data point.

Examples

julia> model = DC(k=3, mode=:random, max_try=100, tol=1e-4)
DC(3, :random, 100, 0.0001, Matrix{Float64}(undef, 0, 0), Int64[])

Cluster.compute_objective_function - Method
compute_objective_function(X::Matrix{Float64}, centroids::Matrix{Float64}; p=2, delta=0.0001)

Computes the objective function for the DC algorithm.

Input

  • X::Matrix{Float64}: Data matrix where rows are data points and columns are features.
  • centroids::Matrix{Float64}: Matrix of centroid coordinates.
  • p: Power parameter for the distance metric (default: 2).
  • delta: Small constant to avoid division by zero (default: 0.0001).

Output

  • Returns a distance matrix where element (i, j) is the distance between the i-th data point and the j-th centroid.

Examples

julia> X = rand(100, 2)
julia> centroids = rand(3, 2)
julia> D = compute_objective_function(X, centroids)

Cluster.fit! - Method
fit!(model::DC, X::Matrix{Float64})

Fits the DC model to the data matrix X.

Input

  • model::DC: An instance of DC.
  • X::Matrix{Float64}: Data matrix where rows are data points and columns are features.

Output

  • Modifies the model in-place to fit the data.

Algorithm

  1. Initialize centroids.
  2. Iterate up to max_try times:
    a. Compute the objective function between data points and centroids.
    b. Assign each data point to the nearest centroid.
    c. Update centroids based on the current assignment.
    d. Check for convergence based on tol.

Examples

julia> model = DC(k=3)
julia> X = rand(100, 2)
julia> fit!(model, X)

Cluster.predict - Method
predict(model::DC, X::Matrix{Float64})

Predicts the cluster labels for new data points based on the fitted DC model.

Input

  • model::DC: An instance of DC.
  • X::Matrix{Float64}: Data matrix where rows are data points and columns are features.

Output

  • Returns a vector of predicted labels for each data point.

Examples

julia> model = DC(k=3)
julia> X_train = rand(100, 2)
julia> fit!(model, X_train)
julia> X_test = rand(10, 2)
julia> labels = predict(model, X_test)

Cluster.update_centroids - Method
update_centroids(X::Matrix{Float64}, label_vector::Vector{Int64}, model::DC; delta=0.0001)

Updates the centroids based on the current assignment of data points to centroids.

Input

  • X::Matrix{Float64}: Data matrix where rows are data points and columns are features.
  • label_vector::Vector{Int64}: Vector of labels for each data point.
  • model::DC: An instance of DC.
  • delta: Small constant to avoid division by zero (default: 0.0001).

Output

  • Returns a matrix of updated centroid coordinates.

Examples

julia> X = rand(100, 2)
julia> labels = rand(1:3, 100)
julia> model = DC(k=3)
julia> centroids = update_centroids(X, labels, model)
