Metric learning
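Introduction
Definition
Distance metric learning aims to measure how similar samples are to one another, which is one of the core problems in pattern recognition. The performance of many machine learning methods depends largely on the choice of similarity measure between samples. This includes classifiers such as K-nearest neighbors, support vector machines, and radial basis function networks, as well as K-means clustering and some graph-based methods.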
Origin
Proposed by Eric Xing at NIPS 2002.
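Advantages
The usual goal of metric learning is to make the distances between samples of the same class as small as possible and the distances between samples of different classes as large as possible.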
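The goal stated above can be checked numerically. The following is a minimal sketch, assuming toy data and hypothetical helper names (`mean_pair_distances`, `scale`): it compares the mean same-class and different-class distances before and after a hand-picked diagonal linear map. A real metric learning method would learn such a map from the labels instead of fixing it by hand.

```python
from itertools import combinations
from math import dist


def mean_pair_distances(points, labels):
    """Return (mean same-class distance, mean different-class distance)."""
    same, diff = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        (same if lp == lq else diff).append(dist(p, q))
    return sum(same) / len(same), sum(diff) / len(diff)


def scale(points, weights):
    """Apply a diagonal linear map: multiply each feature by its weight."""
    return [tuple(w * v for w, v in zip(weights, p)) for p in points]


# Toy data (assumed): feature 0 separates the classes, feature 1 is noise.
points = [(0.0, 0.0), (0.0, 4.0), (1.0, 0.0), (1.0, 4.0)]
labels = [0, 0, 1, 1]

before = mean_pair_distances(points, labels)
# Hand-picked weights (assumed) that shrink the noisy feature.
after = mean_pair_distances(scale(points, (1.0, 0.1)), labels)

print("before:", before)  # same-class pairs are farther apart than cross-class pairs
print("after: ", after)   # same-class pairs are now closer than cross-class pairs
```

Under the plain Euclidean distance the noisy feature dominates, so same-class pairs look farther apart than different-class pairs; after down-weighting that feature the ordering flips, which is the effect a learned metric is meant to produce.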
Disadvantages
TODO
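Application areas
Face recognition, object recognition, music similarity, human pose estimation, information retrieval, speech recognition, handwriting recognition, and other areas.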
Related
- Euclidean distance and Mahalanobis distance
- Image features: color histograms, GIST, SIFT
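The relationship between the two distances listed above can be sketched in NumPy. In the classical Mahalanobis distance the matrix M is the inverse covariance of the data; metric learning methods typically learn M instead. The function name `mahalanobis` and the matrix `L` below are assumptions chosen for illustration.

```python
import numpy as np


def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(d @ M @ d))


x = np.array([1.0, 2.0])
y = np.array([4.0, 6.0])

# With M = I the Mahalanobis distance reduces to the Euclidean distance.
euclidean = mahalanobis(x, y, np.eye(2))

# Hypothetical linear map L; M = L^T L is positive semidefinite by construction.
L = np.array([[2.0, 0.0], [0.0, 0.5]])
M = L.T @ L
learned = mahalanobis(x, y, M)

# The same value is the Euclidean distance after mapping both points with L.
mapped = float(np.linalg.norm(L @ x - L @ y))

print(euclidean, learned, mapped)
```

Because M = L^T L, learning a Mahalanobis metric is equivalent to learning a linear transform L and then using the ordinary Euclidean distance in the transformed space.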
Approaches
Reference 2 contains "An Overview of Distance Metric Learning" and "Distance Metric Learning: A Comprehensive Survey".
- Supervised Distance Metric Learning
Methods | Locality | Linearity | Learning Strategies | Code Download |
---|---|---|---|---|
Probabilistic Global Distance Metric Learning (PGDM) | global | linear | constrained convex programming | by Eric P. Xing |
Relevant Components Analysis (RCA) | global | linear | capture global structure; use equivalence constraints | by Aharon Bar-Hillel and Tomer Hertz |
Discriminative Component Analysis (DCA) | global | linear | improve RCA by exploring negative constraints | by Steven C.H. Hoi |
Local Fisher Discriminant Analysis (LFDA) | local | linear | extend LDA by assigning greater weights to closer connecting examples | by Masashi Sugiyama |
Neighborhood Component Analysis (NCA) | local | linear | extend the nearest neighbor classifier toward metric learning | by Charless C. Fowlkes |
Large Margin NN Classifier (LMNN) | local | linear | extend NCA through a maximum margin framework | by Kilian Q. Weinberger |
Localized Distance Metric Learning (LDM) | local | linear | optimize local compactness and local separability in a probabilistic framework | by Liu Yang |
DistBoost | global | linear | learn distance functions by training binary classifiers with margins in a boosting framework | by Tomer Hertz and Aharon Bar-Hillel; notes on calling its kernel version |
Active Distance Metric Learning (BAYES+VAR) | global | linear | select example pairs with the greatest uncertainty, posterior estimation with a full Bayesian treatment | by Liu Yang |
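
As one concrete instance of the learning strategies in the table, the core step of RCA can be sketched in a few lines of NumPy. The sketch assumes toy data and hypothetical names (`rca_whitening`, `chunklets`). It pools the covariance of points around their own chunklet means and whitens with its inverse square root, leaving out the optional dimensionality reduction step. It is an illustration under those assumptions, not the code linked in the table.

```python
import numpy as np


def rca_whitening(chunklets):
    """Return (W, C): pooled within-chunklet covariance C and W = C^(-1/2)."""
    # Center every chunklet on its own mean, so only within-class spread remains.
    centered = np.vstack([c - c.mean(axis=0) for c in chunklets])
    C = centered.T @ centered / len(centered)
    # Inverse square root via the eigendecomposition of the symmetric matrix C.
    # Assumes C is full rank; otherwise reduce the dimensionality first.
    eigvals, eigvecs = np.linalg.eigh(C)
    W = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return W, C


# Toy chunklets (assumed): each array holds points known to share a class.
chunklets = [
    np.array([[0.0, 0.0], [2.0, 1.0], [4.0, -1.0]]),
    np.array([[1.0, 5.0], [3.0, 4.0], [5.0, 6.0]]),
]
W, C = rca_whitening(chunklets)
print(W)  # feature 0 varies most inside chunklets, so it gets the smaller weight
```

Mapping the data with W gives the within-chunklet covariance the identity form, so directions with large within-class variability count less in the resulting distance.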
- Unsupervised Distance Metric Learning
Methods | Locality | Linearity | Learning Strategies | Code Download |
---|---|---|---|---|
Principal Component Analysis (PCA) | global structure preserved | linear | best preserve the variance of the data | by Deng Cai |
Multidimensional Scaling (MDS) | global structure preserved | linear | best preserve inter-point distance in low-rank | included in Matlab Toolbox for Dimensionality Reduction |
ISOMAP | global structure preserved | nonlinear | preserve the geodesic distances | by J. B. Tenenbaum, V. de Silva and J. C. Langford |
Laplacian Eigenmap (LE) | local structure preserved | nonlinear | preserve local neighbor | by Mikhail Belkin |
Locality Preserving Projections (LPP) | local structure preserved | linear | linear approximation to LE | LPP by Deng Cai; Kernel LPP by Deng Cai |
Locally Linear Embedding (LLE) | local structure preserved | nonlinear | nonlinear preserve local neighbor | by Sam T. Roweis and Lawrence K. Saul; Hessian LLE can be found at MANIfold Learning Matlab Demo, by Todd Wittman |
Neighborhood Preserving Embedding (NPE) | local structure preserved | linear | linear approximation to LLE | by Deng Cai |
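
To make the "linear" and "best preserve the variance" entries concrete, here is a minimal NumPy sketch of PCA as an unsupervised linear map. The function name `pca_project` and the toy data are assumptions for illustration; this is not the code linked in the table.

```python
import numpy as np


def pca_project(X, n_components):
    """Project X onto its top principal directions; return (Z, components)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, sorted by decreasing singular value.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    return X_centered @ components.T, components


# Toy data (assumed): six 2-D points spread mostly along the diagonal.
X = np.array([[2.0, 0.0], [0.0, 2.0], [3.0, 1.0],
              [1.0, 3.0], [4.0, 4.0], [0.0, 0.0]])
Z, components = pca_project(X, 1)

print("direction:", components[0])
print("variance kept:", Z.var(axis=0).sum(), "of", X.var(axis=0).sum())
```

Euclidean distances between rows of `Z` then act as the distance in the reduced space. No labels are involved, which is what distinguishes this family from the supervised methods.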
Implementations
Python
metric-learn
https://pypi.python.org/pypi/metric-learn/
LMNN
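from metric_learn import LMNN
import numpy as np

# Six 3-D samples in three classes (labels 1, 2, 0).
X = np.array([[0., 0., 1.], [0., 0., 2.], [1., 0., 0.], [2., 0., 0.], [2., 2., 2.], [2., 5., 4.]])
Y = np.array([1, 1, 2, 2, 0, 0])

# Older metric-learn API, as in the original note. Recent releases instead use
# LMNN(n_neighbors=..., verbose=...) and fit(X, y).
lmnn = LMNN(k=2, learn_rate=1e-6)
lmnn.fit(X, Y, verbose=False)
# Map X into the learned metric space.
Y_c = lmnn.transform(X)
Output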
>>> Y_c
array([[ 0. , -0.07987306, 0.11081795],
[ 0. , -0.15974612, 0.22163591],
[ 0.07113444, 0. , 0. ],
[ 0.14226889, 0. , 0. ],
[ 0.14226889, -0.04460763, 0.06188978],
[ 0.14226889, -0.03164602, 0.04390651]])
Matlab
DistLearnKit
R
Supervised Distance Metric Learning
Applications
TODO