Stronger Together
Implement k-means clustering from scratch.
You are given a set of 2D points and a target number of clusters k. Your task is to partition the points into k clusters by iteratively assigning points to their nearest centroid and recomputing centroids until convergence.
Input Format
n k
x_1 y_1
x_2 y_2
...
x_n y_n
nis the number of points.kis the number of clusters.- Each subsequent line is a 2D point with floating-point coordinates.
Output Format
c_1
c_2
...
c_n
- One line per point, in input order.
c_iis the cluster label (0-indexed) assigned to pointiafter convergence.
Initialization
Centroids must be initialized using the first k points in the input, in order.
That is, centroid j is initialized to point j for j = 0, 1, ..., k-1.
Algorithm
Repeat the following until no point changes its assignment:
Assign — For each point, compute the Euclidean distance to every centroid. Assign the point to the cluster of the nearest centroid. If a point is equidistant to two or more centroids, assign it to the one with the smallest index.
Update — For each cluster, recompute its centroid as the arithmetic mean of all points currently assigned to it.
If a cluster has no points assigned to it after an assignment step, its centroid remains unchanged.
Behavioral Requirements
- You must not use any clustering library or built-in function.
- Distance must be computed as standard Euclidean distance (L2 norm). Do not use squared distance for assignment decisions — the ordering is the same, but be precise.
- The algorithm must terminate. It is guaranteed to converge for the given test cases.
- Output cluster labels must reflect the final stable assignment.
Notes
- Floating-point precision: use
double. Output labels are integers, so precision loss in label assignment is not a concern — only in centroid computation. - You do not need to output centroids, only labels.
3 3
0.0 0.0
10.0 0.0
0.0 10.00
1
2