fcm
Using the Fuzzy C-Means algorithm, calculate and return the soft partition of a set of unlabeled data points.
Also, if display_intermediate_results is true, display intermediate results after each iteration. Note that because the initial cluster prototypes are randomly selected locations in the ranges determined by the input data, the results of this function are nondeterministic.
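The nondeterminism described above can be illustrated with a minimal sketch of one way to pick random initial prototypes inside the per-feature ranges of the data. This is not fcm's source; the variable names lo, hi, and V0 are hypothetical, and the data is the six-point set from the first demo on this page.

```octave
## Hypothetical sketch, not fcm's actual code: draw k initial prototypes
## uniformly inside the range spanned by each feature of the data.
X = [2 12; 4 9; 7 13; 11 5; 12 7; 14 4];  # six points, two features
k = 2;                                    # number of clusters

lo = min (X, [], 1);   # smallest value of each feature (1 x #features)
hi = max (X, [], 1);   # largest value of each feature  (1 x #features)

## Each row of V0 is one random starting point; broadcasting stretches
## lo and (hi - lo) across the k rows.
V0 = lo + rand (k, columns (X)) .* (hi - lo);
disp (V0);
```

Calling rand ("state", 42) before the sketch makes V0 repeatable. Whether the same call also fixes the output of fcm depends on fcm drawing its prototypes from rand, which this page does not state.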
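The required arguments to fcm are: input_data, a matrix in which each row is one point to be clustered, and the number of clusters to form.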
The optional arguments to fcm are:
The default values are used if any of the optional arguments are missing or evaluate to NaN.
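The NaN rule above can be sketched as follows. The second demo on this page passes the options vector [NaN NaN NaN 0], so the sketch uses a four-element vector; the default values shown are placeholders assumed for illustration and are not taken from this page.

```octave
## Hypothetical sketch of the "missing or NaN means default" rule.
## The numbers in defaults are assumed placeholders, not documented values.
defaults     = [2 100 1e-5 1];
user_options = [NaN NaN NaN 0];   # as passed in the second demo

options = user_options;

## Pad a short (or empty) options vector with NaN up to the full length.
options(end + 1 : numel (defaults)) = NaN;

## Replace every NaN entry with the default in the same position.
missing = isnan (options);
options(missing) = defaults(missing);

disp (options);
```

Only the fourth entry keeps the caller's value here; the first three fall back to the placeholders. Run help fcm on the installed version to see the actual defaults.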
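The return values are: cluster_centers, a matrix in which each row is one cluster center; soft_partition, the matrix of membership degrees; and obj_fcn_history, the value of the objective function after each iteration.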
Three important matrices used in the calculation are X (the input points to be clustered), V (the cluster centers), and Mu (the membership of each data point in each cluster). Each row of X and V denotes a single point, and Mu(i, j) denotes the membership degree of input point X(j, :) in the cluster having center V(i, :).
X is identical to the required argument input_data; V is identical to the output cluster_centers; and Mu is identical to the output soft_partition.
If n denotes the number of input points and k denotes the number of clusters to be formed, then X, V, and Mu have the dimensions:
                               1    2   ...  #features
                          1 [[                        ]
    X  =  input_data  =   2  [                        ]
                        ...  [                        ]
                          n  [                        ]]

                                    1    2   ...  #features
                               1 [[                        ]
    V  =  cluster_centers  =   2  [                        ]
                             ...  [                        ]
                               k  [                        ]]

                                    1    2   ...   n
                               1 [[                  ]
    Mu  =  soft_partition  =   2  [                  ]
                             ...  [                  ]
                               k  [                  ]]
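To make the roles and shapes of X, V, and Mu concrete, here is a minimal sketch of the standard Fuzzy C-Means membership formula, with the exponent m = 2 assumed. It takes X from the first demo on this page and V from that demo's printed cluster_centers; it is not fcm's source, and the names D2 and W are hypothetical.

```octave
## Sketch of the textbook FCM membership formula (m = 2 is an assumption).
X = [2 12; 4 9; 7 13; 11 5; 12 7; 14 4];   # n x #features
V = [4.2023 11.2805; 12.2859 5.3691];      # k x #features
m = 2;

n = rows (X);
k = rows (V);

## D2(i, j) is the squared Euclidean distance from center V(i, :)
## to point X(j, :).
D2 = zeros (k, n);
for i = 1 : k
  D2(i, :) = sum ((X - V(i, :)) .^ 2, 2)';
endfor

## Closer centers get larger weights; normalizing each column makes the
## memberships of one point sum to 1. A point that coincides with a
## center would give a zero distance and needs separate handling.
W  = D2 .^ (-1 / (m - 1));
Mu = W ./ sum (W, 1);

disp (size (Mu));   # k rows, n columns
disp (Mu);
```

Mu comes out with k rows and n columns, and its entries are close to the soft_partition printed by the first demo, because the centers used here are the converged ones.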
See also: gustafson_kessel, partition_coeff, partition_entropy, xie_beni_index
## This demo:
##    - classifies a small set of unlabeled data points using
##      the Fuzzy C-Means algorithm into two fuzzy clusters
##    - plots the input points together with the cluster centers
##    - evaluates the quality of the resulting clusters using
##      three validity measures: the partition coefficient, the
##      partition entropy, and the Xie-Beni validity index
##
## Note: The input_data is taken from Chapter 13, Example 17 in
##       Fuzzy Logic: Intelligence, Control and Information, by
##       J. Yen and R. Langari, Prentice Hall, 1999, page 381
##       (International Edition).

## Use fcm to classify the input_data.
input_data = [2 12; 4 9; 7 13; 11 5; 12 7; 14 4];
number_of_clusters = 2;
[cluster_centers, soft_partition, obj_fcn_history] = ...
  fcm (input_data, number_of_clusters)

## Plot the data points as small blue x's.
figure ('NumberTitle', 'off', 'Name', 'FCM Demo 1');
for i = 1 : rows (input_data)
  plot (input_data(i, 1), input_data(i, 2), 'LineWidth', 2, ...
        'marker', 'x', 'color', 'b');
  hold on;
endfor

## Plot the cluster centers as larger red *'s.
for i = 1 : number_of_clusters
  plot (cluster_centers(i, 1), cluster_centers(i, 2), ...
        'LineWidth', 4, 'marker', '*', 'color', 'r');
  hold on;
endfor

## Make the figure look a little better:
##    - scale and label the axes
##    - show gridlines
xlim ([0 15]);
ylim ([0 15]);
xlabel ('Feature 1');
ylabel ('Feature 2');
grid
hold

## Calculate and print the three validity measures.
printf ("Partition Coefficient: %f\n", ...
        partition_coeff (soft_partition));
printf ("Partition Entropy (with a = 2): %f\n", ...
        partition_entropy (soft_partition, 2));
printf ("Xie-Beni Index: %f\n\n", ...
        xie_beni_index (input_data, cluster_centers, ...
        soft_partition));

Output:

Iteration count = 1,  Objective fcn = 48.836786
Iteration count = 2,  Objective fcn = 28.958626
Iteration count = 3,  Objective fcn = 28.758695
Iteration count = 4,  Objective fcn = 28.757469
Iteration count = 5,  Objective fcn = 28.757461
Iteration count = 6,  Objective fcn = 28.757460
Iteration count = 7,  Objective fcn = 28.757460
Iteration count = 8,  Objective fcn = 28.757460
cluster_centers =

    4.2023   11.2805
   12.2859    5.3691

soft_partition =

   0.965400   0.939806   0.888774   0.020467   0.033486   0.031290
   0.034600   0.060194   0.111226   0.979533   0.966514   0.968710

obj_fcn_history =

   48.837   28.959   28.759   28.757   28.757   28.757   28.757   28.757

hold is now off for current axes
Partition Coefficient: 0.909483
Partition Entropy (with a = 2): 0.267539
Xie-Beni Index: 0.095582
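The partition coefficient printed by this demo can be reproduced by hand. The sketch below assumes the usual definition, the mean over all points of the summed squared memberships; this page does not confirm that partition_coeff is implemented exactly this way, although the numbers agree.

```octave
## Sketch: partition coefficient computed directly from the printed
## soft_partition, assuming PC = sum of squared memberships / n.
soft_partition = [0.965400 0.939806 0.888774 0.020467 0.033486 0.031290;
                  0.034600 0.060194 0.111226 0.979533 0.966514 0.968710];

n  = columns (soft_partition);          # number of input points
pc = sum (soft_partition(:) .^ 2) / n;  # between 1/k (fuzziest) and 1 (crisp)

printf ("Partition Coefficient (by hand): %f\n", pc);
```

The result is close to the 0.909483 reported in the output, which indicates a nearly crisp two-cluster partition for this data set.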
## This demo:
##    - classifies three-dimensional unlabeled data points using
##      the Fuzzy C-Means algorithm into three fuzzy clusters
##    - plots the input points together with the cluster centers
##    - evaluates the quality of the resulting clusters using
##      three validity measures: the partition coefficient, the
##      partition entropy, and the Xie-Beni validity index
##
## Note: The input_data was selected to form three areas of
##       different shapes.

## Use fcm to classify the input_data.
input_data = [1 11 5; 1 12 6; 1 13 5; 2 11 7; 2 12 6; 2 13 7; 3 11 6;
              3 12 5; 3 13 7; 1 1 10; 1 3 9; 2 2 11; 3 1 9; 3 3 10;
              3 5 11; 4 4 9; 4 6 8; 5 5 8; 5 7 9; 6 6 10; 9 10 12;
              9 12 13; 9 13 14; 10 9 13; 10 13 12; 11 10 14; 11 12 13;
              12 6 12; 12 7 15; 12 9 15; 14 6 14; 14 8 13];
number_of_clusters = 3;
[cluster_centers, soft_partition, obj_fcn_history] = ...
  fcm (input_data, number_of_clusters, [NaN NaN NaN 0])

## Plot the data points in two dimensions (using features 1 & 2)
## as small blue x's.
figure ('NumberTitle', 'off', 'Name', 'FCM Demo 2');
for i = 1 : rows (input_data)
  plot (input_data(i, 1), input_data(i, 2), 'LineWidth', 2, ...
        'marker', 'x', 'color', 'b');
  hold on;
endfor

## Plot the cluster centers in two dimensions
## (using features 1 & 2) as larger red *'s.
for i = 1 : number_of_clusters
  plot (cluster_centers(i, 1), cluster_centers(i, 2), ...
        'LineWidth', 4, 'marker', '*', 'color', 'r');
  hold on;
endfor

## Make the figure look a little better:
##    - scale and label the axes
##    - show gridlines
xlim ([0 15]);
ylim ([0 15]);
xlabel ('Feature 1');
ylabel ('Feature 2');
grid
hold

## Plot the data points in two dimensions
## (using features 1 & 3) as small blue x's.
figure ('NumberTitle', 'off', 'Name', 'FCM Demo 2');
for i = 1 : rows (input_data)
  plot (input_data(i, 1), input_data(i, 3), 'LineWidth', 2, ...
        'marker', 'x', 'color', 'b');
  hold on;
endfor

## Plot the cluster centers in two dimensions
## (using features 1 & 3) as larger red *'s.
for i = 1 : number_of_clusters
  plot (cluster_centers(i, 1), cluster_centers(i, 3), ...
        'LineWidth', 4, 'marker', '*', 'color', 'r');
  hold on;
endfor

## Make the figure look a little better:
##    - scale and label the axes
##    - show gridlines
xlim ([0 15]);
ylim ([0 15]);
xlabel ('Feature 1');
ylabel ('Feature 3');
grid
hold

## Calculate and print the three validity measures.
printf ("Partition Coefficient: %f\n", ...
        partition_coeff (soft_partition));
printf ("Partition Entropy (with a = 2): %f\n", ...
        partition_entropy (soft_partition, 2));
printf ("Xie-Beni Index: %f\n\n", ...
        xie_beni_index (input_data, cluster_centers, ...
        soft_partition));

Output:

cluster_centers =

    3.1989    3.6232    9.5521
    2.0937   11.9016    6.0942
   11.0424    9.5332   13.3569

soft_partition =

 Columns 1 through 6:

   3.7871e-02   1.3572e-02   3.0172e-02   2.5327e-02   3.2448e-04   2.0488e-02
   9.4461e-01   9.7904e-01   9.5109e-01   9.6197e-01   9.9948e-01   9.6487e-01
   1.7523e-02   7.3841e-03   1.8740e-02   1.2705e-02   1.9250e-04   1.4638e-02

 Columns 7 through 12:

   2.3598e-02   2.1516e-02   2.8591e-02   8.6766e-01   9.1223e-01   9.1464e-01
   9.6332e-01   9.6457e-01   9.4834e-01   7.6424e-02   5.6743e-02   4.6202e-02
   1.3086e-02   1.3915e-02   2.3066e-02   5.5911e-02   3.1032e-02   3.9162e-02

 Columns 13 through 18:

   9.0697e-01   9.8825e-01   9.0909e-01   9.7506e-01   7.6774e-01   8.2343e-01
   5.1152e-02   6.5170e-03   5.0542e-02   1.4244e-02   1.5868e-01   1.0410e-01
   4.1879e-02   5.2361e-03   4.0373e-02   1.0700e-02   7.3582e-02   7.2479e-02

 Columns 19 through 24:

   6.2230e-01   6.7200e-01   6.7469e-02   7.4870e-02   9.2741e-02   1.6712e-02
   2.2741e-01   1.4085e-01   6.2864e-02   9.0815e-02   1.1768e-01   1.2265e-02
   1.5029e-01   1.8715e-01   8.6967e-01   8.3432e-01   7.8958e-01   9.7102e-01

 Columns 25 through 30:

   8.4745e-02   5.1715e-03   3.9804e-02   1.3556e-01   7.4614e-02   2.7318e-02
   1.2048e-01   4.3134e-03   4.4784e-02   7.1964e-02   4.3901e-02   1.9996e-02
   7.9477e-01   9.9052e-01   9.1541e-01   7.9248e-01   8.8149e-01   9.5269e-01

 Columns 31 and 32:

   1.2256e-01   6.7204e-02
   7.2840e-02   4.8500e-02
   8.0460e-01   8.8430e-01

obj_fcn_history =

 Columns 1 through 10:

   408.08   240.46   184.85   181.00   180.66   180.61   180.61   180.61   180.61   180.61

 Columns 11 through 16:

   180.61   180.61   180.61   180.61   180.61   180.61

hold is now off for current axes
hold is now off for current axes
Partition Coefficient: 0.813224
Partition Entropy (with a = 2): 0.541401
Xie-Beni Index: 0.207218
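A soft partition like the one above is often reduced to hard labels by assigning each point to the cluster in which its membership is largest. The sketch below does this for columns 1, 10, and 21 of the printed soft_partition; the variable names are hypothetical and the step is not part of fcm.

```octave
## Sketch: harden three columns of the printed soft_partition by taking,
## for each point, the row (cluster) with the largest membership.
Mu_subset = [3.7871e-02 8.6766e-01 6.7469e-02;   # cluster 1 memberships
             9.4461e-01 7.6424e-02 6.2864e-02;   # cluster 2 memberships
             1.7523e-02 5.5911e-02 8.6967e-01];  # cluster 3 memberships

## max along dimension 1 returns, per column, the winning membership
## and the index of the cluster that holds it.
[best_membership, labels] = max (Mu_subset, [], 1);

disp (labels);
```

The three sampled points fall into clusters 2, 1, and 3 respectively. Because the initial prototypes are random, the numbering of the clusters can differ from run to run even when the grouping is the same.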