应无所住而生其心: correlation-based distance in clustering

Correlation-based distance is suitable for clustering when what you care about most is how strongly two variables are correlated, rather than how large their values are.
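As a minimal sketch of the idea, the snippet below assumes the common definition of correlation-based distance as 1 minus the Pearson correlation coefficient. The helper names (`pearson`, `correlation_distance`, `euclidean_distance`) and the three toy profiles are made up for illustration and are not from the original note.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_distance(x, y):
    """Assumed definition: 1 - r (0 when perfectly correlated, 2 when perfectly anti-correlated)."""
    return 1.0 - pearson(x, y)

def euclidean_distance(x, y):
    """Ordinary Euclidean distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Toy profiles (hypothetical data).
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]  # same shape as a, much larger scale
c = [4.0, 1.0, 3.0, 2.0]      # same scale as a, different shape

d_ab = correlation_distance(a, b)  # about 0: a and b rise and fall together
d_ac = correlation_distance(a, c)  # about 1.4: a and c are negatively correlated

print(d_ab, d_ac)
print(euclidean_distance(a, b), euclidean_distance(a, c))
```

With these numbers Euclidean distance puts `a` closer to `c`, while correlation-based distance puts `a` closer to `b`. That is the situation in which the choice of distance changes the clustering.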

》》》》Interpreting the tree structure (dendrogram) of hierarchical clustering:

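A dendrogram with n samples involves exactly n-1 fusions, because every fusion merges two branches into one and the tree ends with a single branch. At each of these n-1 fusions, the two fused branches can swap their left and right positions without changing the meaning of the tree, so there are 2^(n-1) possible reorderings of the dendrogram, where n is the number of leaves (that is, the number of samples).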
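The count can be checked by brute force on a small tree. In this sketch a dendrogram is represented as nested 2-tuples, which is an assumption made for illustration. The function names and the five-leaf tree are hypothetical.

```python
def count_fusions(tree):
    """Number of internal nodes (fusions) in a binary tree of nested 2-tuples."""
    if not isinstance(tree, tuple):
        return 0  # a leaf involves no fusion
    left, right = tree
    return 1 + count_fusions(left) + count_fusions(right)

def leaf_orders(tree):
    """All left-to-right leaf orders reachable by swapping the two children at any fusion."""
    if not isinstance(tree, tuple):
        return [[tree]]
    left, right = tree
    orders = []
    for lo in leaf_orders(left):
        for ro in leaf_orders(right):
            orders.append(lo + ro)  # keep the two branches as drawn
            orders.append(ro + lo)  # swap the two branches
    return orders

# Hypothetical dendrogram with n = 5 leaves.
tree = ((("A", "B"), "C"), ("D", "E"))
n = 5

fusions = count_fusions(tree)
orders = leaf_orders(tree)
distinct = {tuple(o) for o in orders}

print(fusions)        # n - 1 fusions
print(len(distinct))  # 2 ** (n - 1) distinct leaf orders
```

Every ordering produced this way describes the same clustering. Only the horizontal layout of the leaves changes.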

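Each leaf of the dendrogram represents one sample. As we move up the tree, leaves begin to fuse into branches (a branch can hold several leaves). The earlier a fusion happens (the lower it is in the tree), the closer the samples in that group are to each other, so every fusion forms a group. To see how far apart two samples are, look for the point in the tree where the branches containing those two samples are first fused, and read its height off the vertical axis, not from how close the two leaves sit along the horizontal axis.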
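Reading a distance off the tree can be sketched in code as follows. This is a naive illustration that assumes complete linkage. The note does not say which linkage it has in mind, and the 4x4 distance matrix and the function names are hypothetical.

```python
def complete_linkage(dist):
    """Naive agglomerative clustering with complete linkage.

    dist: symmetric n x n list of lists of pairwise distances.
    Returns the n - 1 fusions as (members_a, members_b, height), lowest first.
    """
    clusters = [frozenset([i]) for i in range(len(dist))]
    fusions = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest pairwise distance.
                h = max(dist[p][q] for p in clusters[i] for q in clusters[j])
                if best is None or h < best[0]:
                    best = (h, i, j)
        h, i, j = best
        a, b = clusters[i], clusters[j]
        fusions.append((a, b, h))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [a | b]
    return fusions

def fuse_height(fusions, p, q):
    """Height at which the branches containing samples p and q are first fused."""
    for a, b, h in fusions:
        if (p in a and q in b) or (p in b and q in a):
            return h
    raise ValueError("samples never fuse")

# Hypothetical pairwise distances (these could be correlation-based distances).
dist = [
    [0.0, 0.1, 0.9, 1.2],
    [0.1, 0.0, 1.0, 1.1],
    [0.9, 1.0, 0.0, 0.3],
    [1.2, 1.1, 0.3, 0.0],
]

fusions = complete_linkage(dist)
print(len(fusions))                # 3 fusions for 4 samples
print(fuse_height(fusions, 0, 1))  # 0.1: fused first, lowest in the tree
print(fuse_height(fusions, 2, 3))  # 0.3
print(fuse_height(fusions, 0, 2))  # 1.2: fused only at the top
```

Samples 0 and 2 fuse at height 1.2 even though their raw distance is 0.9. The height on the vertical axis is the linkage distance between the two branches, so it depends on the linkage rule as well as on the two samples themselves.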

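》》》》How do you judge whether a dendrogram is good or bad? In the figure below, bold entries mark pairs of variables that are linearly correlated, while the other pairs of variables are unrelated. The correlation matrix is as follows: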