Upper and lower bounds for complete linkage in general metric spaces

In a hierarchical clustering problem the task is to compute a series of mutually compatible clusterings of a finite metric space ($P$, dist). Starting with the clustering where every point forms its own cluster, one iteratively merges two clusters until only one cluster remains. Complete linkage is a well-known and popular algorithm to compute such clusterings: in every step it merges the two clusters whose union has the smallest radius (or diameter) among all currently possible merges. We prove that the radius (or diameter) of every k-clustering computed by complete linkage is at most by factor $O(k)$ (or $O(k^{ln (3)/ln(2)})=O(k^{1.59}$) worse than an optimal k-clustering minimizing the radius (or diameter). Furthermore we give a negative answer to the question proposed by Dasgupta and Long (J Comput Syst Sci 70(4):555–569, 2005. https://doi.org/10.1016/j.jcss.2004.10.006), who show a lower bound of $Omega (log(k))$ and ask if the approximation guarantee is in fact $Theta (log (k))$. We present instances where complete linkage performs poorly in the sense that the k-clustering computed by complete linkage is off by a factor of $Omega (k)$ from an optimal solution for radius and diameter. We conclude that in general metric spaces complete linkage does not perform asymptotically better than single linkage, merging the two clusters with smallest inter-cluster distance, for which we prove an approximation guarantee of $O(k)$.

  • Published in:
    Machine Learning
  • Type:
    Article
  • Authors:
    Arutyunova, Anna; Großwendt, Anna; Röglin, Heiko; Schmidt, Melanie; Wargalla, Julian
  • Year:
    2023

Citation information

Arutyunova, Anna; Großwendt, Anna; Röglin, Heiko; Schmidt, Melanie; Wargalla, Julian: Upper and lower bounds for complete linkage in general metric spaces, Machine Learning, 2023, 113, 489--518, November, https://link.springer.com/article/10.1007/s10994-023-06486-8, Arutyunova.etal.2023a,

Associated Lamarr Researchers

lamarr institute person Roeglin Heiko - Lamarr Institute for Machine Learning (ML) and Artificial Intelligence (AI)

Prof. Dr. Heiko Röglin

Principal Investigator Resource-aware ML to the profile