Non-negative Matrix Factorization 非负矩阵分解

介绍

定义

　　非负矩阵分解(non-negative matrix factorization)¹，或非负矩阵近似(non-negative matrix approximation)，是多变量分析和线性代数的算法。

　　输入：非负矩阵\(V\)

　　输出：两个非负矩阵\(W\)和\(H\)

　　目标：\(V=WH\)

起源

　　著名的科学杂志《Nature》于1999年刊登了两位科学家D.D.Lee和H.S.Seung对数学中非负矩阵研究的突出成果。该文² ³ ⁴提出了一种新的矩阵分解思想——非负矩阵分解(Non-negative Matrix Factorization，NMF)算法，即NMF是在矩阵中所有元素均为非负数约束条件之下的矩阵分解方法。

优点

处理大规模数据更快更便捷；
实现简便性、分解形式和分解结果上的可解释性，占用存储空间少。

缺点

NMF中只用一层表示隐变量，无法处理复杂学习问题；
NMF只约束了\(W\)和\(H\)的非负性（这是唯一先验，只要求满足这个），而没有考虑到对于该先验，\(H\)内部元素间的相关性。（如图1中NMF部分的第1列第5行与第4列第7行类似，这二者的线性组合表示眼睛，而NMF假设他们不相关。）

应用领域

　　计算机视觉，文档聚类，化学统计学，音频信号处理，推荐系统

解法

　　有很多方法可以求\(W\)和\(H\)，其中Lee和Seung的倍增更新法则因为实现简单，是最流行的方法。

　　一些成功的算法是基于交替非负最小二乘法：在每一步中，首先固定\(H\)并通过非负最小二乘法求解法得到\(W\)，然后固定\(W\)同理求出\(H\)。求解\(W\)或\(H\)的方法可以相同或不同，因为可以对\(W\)或\(H\)进行规范化（以防止过拟合）。具体求解方法包括the projected gradient descent methods（投影梯度下降方法），the active set method 和 the block principal pivoting method。

Lee和Seung的倍增更新法则

分解的公式：

\[ V_{i\mu} \approx (WH)_{i\mu} = \sum_{a=1}^r W_{ia}H_{a\mu} \tag{1} \]

其中\(r\)是分解矩阵的秩，\(V\)是原矩阵的一个近似，\(W\)与\(H\)就是分解而成的两个矩阵。

优化目标函数为

\[ F=\sum_{i=1}^n \sum_{\mu=1}^m [V_{i\mu} log(WH)_{i\mu} - (WH)_{i\mu}] \tag{2} \]

其中\(V_{i\mu}\)表示一个像素，由泊松噪声的\((WH)_{i\mu}\)而得。

因此目标函数实际上表示由基\(W\)和编码\(H\)产生图像\(V\)的概率。

用迭代算法求\(W\)和\(H\)，初始时是随机矩阵，然后不断迭代更新，使得\(F\)达到一个局部极大值：

\[ W_{ia} \leftarrow W_{ia} \sum_\mu \frac{ V_{i\mu} }{ (WH)_{i\mu} } H_{a\mu} \tag{3} \]

\[ W_{ia} \leftarrow \frac{W_{ia}}{ \sum_j W{ja} } \tag{4} \]

\[ H_{a\mu} \leftarrow H_{a\mu} \sum_i W_{ia} \frac{V_{i\mu}}{ \sum_j (WH)_{i\mu} } \tag{5} \]

实现

Matlab

http://cn.mathworks.com/help/stats/nnmf.html ⁵

给定矩阵 \(A \in R^{n \times m}\) 和整数 \(k\)，

返回分解的非负矩阵 \(W \in R^{n \times k}\), \(H \in R^{k \times m}\)。

还可以返回均方根残差 \(D = norm(A - W \times H, \text{'fro'}) / sqrt(N \times M)\)

1
2
3

[W,H] = nnmf(A,k)
[W,H] = nnmf(A,k,param1,val1,param2,val2,...)
[W,H,D] = nnmf(...)

Python

nimfa - A Python Library for Nonnegative Matrix Factorization Techniques
http://nimfa.biolab.si/

nmf_example.py

import scipy.sparse as spr
import nimfa

V = spr.csr_matrix([[1, 0, 2, 4], [0, 0, 6, 3], [4, 0, 5, 6]])

lsnmf = nimfa.Lsnmf(V, seed='random_vcol', max_iter=10000, rank=2)
lsnmf_fit = lsnmf()

W = lsnmf_fit.basis()
print('Basis matrix:\n%s' % W.todense())

H = lsnmf_fit.coef()
print('Mixture matrix:\n%s' % H.todense())

import numpy as np
print('Target estimate:\n%s' % np.dot(W.todense(), H.todense()))
print('Iterations: %d' % lsnmf_fit.n_iter)
print('Euclidean distance: %5.3f' % lsnmf_fit.distance(metric='euclidean'))
print('K-L divergence: %5.3f' % lsnmf_fit.distance(metric='kl'))

http://www.csie.ntu.edu.tw/~cjlin/nmf/ ⁶
给定\(W\)和\(H\)的初始矩阵，收敛时的容忍值（tolerance），时间限制，迭代次数限制，返回分解的非负矩阵\(W\)和\(H\)。

nmf.py

# NMF by alternative non-negative least squares using projected gradients
# Author: Chih-Jen Lin, National Taiwan University
# Python/numpy translation: Anthony Di Franco

from numpy import *
from numpy.linalg import norm
from time import time
from sys import stdout

def nmf(V,Winit,Hinit,tol,timelimit,maxiter):
 """
 (W,H) = nmf(V,Winit,Hinit,tol,timelimit,maxiter)
 W,H: output solution
 Winit,Hinit: initial solution
 tol: tolerance for a relative stopping condition
 timelimit, maxiter: limit of time and iterations
 """

 W = Winit; H = Hinit; initt = time();

 gradW = dot(W, dot(H, H.T)) - dot(V, H.T)
 gradH = dot(dot(W.T, W), H) - dot(W.T, V)
 initgrad = norm(r_[gradW, gradH.T])
 print 'Init gradient norm %f' % initgrad
 tolW = max(0.001,tol)*initgrad
 tolH = tolW

 for iter in xrange(1,maxiter):
  # stopping condition
  projnorm = norm(r_[gradW[logical_or(gradW<0, W>0)],
                                 gradH[logical_or(gradH<0, H>0)]])
  if projnorm < tol*initgrad or time() - initt > timelimit: break

  (W, gradW, iterW) = nlssubprob(V.T,H.T,W.T,tolW,1000)
  W = W.T
  gradW = gradW.T

  if iterW==1: tolW = 0.1 * tolW

  (H,gradH,iterH) = nlssubprob(V,W,H,tolH,1000)
  if iterH==1: tolH = 0.1 * tolH

  if iter % 10 == 0: stdout.write('.')

 print '\nIter = %d Final proj-grad norm %f' % (iter, projnorm)
 return (W,H)

def nlssubprob(V,W,Hinit,tol,maxiter):
 """
 H, grad: output solution and gradient
 iter: #iterations used
 V, W: constant matrices
 Hinit: initial solution
 tol: stopping tolerance
 maxiter: limit of iterations
 """

 H = Hinit
 WtV = dot(W.T, V)
 WtW = dot(W.T, W)

 alpha = 1; beta = 0.1;
 for iter in xrange(1, maxiter):
  grad = dot(WtW, H) - WtV
  projgrad = norm(grad[logical_or(grad < 0, H >0)])
  if projgrad < tol: break

  # search step size
  for inner_iter in xrange(1,20):
   Hn = H - alpha*grad
   Hn = where(Hn > 0, Hn, 0)
   d = Hn-H
   gradd = sum(grad * d)
   dQd = sum(dot(WtW,d) * d)
   suff_decr = 0.99*gradd + 0.5*dQd < 0;
   if inner_iter == 1:
    decr_alpha = not suff_decr; Hp = H;
   if decr_alpha:
    if suff_decr:
     H = Hn; break;
    else:
     alpha = alpha * beta;
   else:
      if not suff_decr or (Hp == Hn).all():
       H = Hp; break;
      else:
       alpha = alpha/beta; Hp = Hn;

  if iter == maxiter:
   print 'Max iter in nlssubprob'
 return (H, grad, iter)

nmf_example.py

import numpy, nmf

w1 = array([[1,2,3],[4,5,6]])
h1 = array([[1,2],[3,4],[5,6]])
w2 = array([[1,1,3],[4,5,6]])
h2 = array([[1,1],[3,4],[5,6]])

v = dot(w1,h1)

(wo,ho) = nmf(v, w2, h2, 0.001, 10, 10)
print wo
print ho

应用

图像分析

　　NMF最成功的一类应用是在图像的分析和处理领域。图像本身包含大量的数据，计算机一般将图像的信息按照矩阵的形式进行存放，针对图像的识别、分析和处理也是在矩阵的基础上进行的。这些特点使得NMF方法能很好地与图像分析处理相结合。人们已经利用NMF算法，对卫星发回的图像进行处理，以自动辨别太空中的垃圾碎片；使用NMF算法对天文望远镜拍摄到的图像进行分析，有助于天文学家识别星体；美国还尝试在机场安装由NMF算法驱动的识别系统，根据事先输入计算机的恐怖分子的特征图像库来自动识别进出机场的可疑恐怖分子。

文本聚类/数据挖掘

　　文本在人类日常接触的信息中占有很大分量，为了更快更精确地从大量的文本数据中取得所需要的信息，针对文本信息处理的研究一直没有停止过。文本数据不光信息量大，而且一般是无结构的。此外，典型的文本数据通常以矩阵的形式被计算机处理，此时的数据矩阵具有高维稀疏的特征，因此，对大规模文本信息进行处理分析的另一个障碍便是如何削减原始数据的维数。NMF算法正是解决这方面难题的一种新手段。NMF在挖掘用户所需数据和进行文本聚类研究中都有着成功的应用例子。由于NMF算法在处理文本数据方面的高效性，著名的商业数据库软件Oracle在其第10版中专门利用NMF算法来进行文本特征的提取和分类。为什么NMF对于文本信息提取得很好呢？原因在于智能文本处理的核心问题是以一种能捕获语义或相关信息的方式来表示文本，但是传统的常用分析方法仅仅是对词进行统计，而不考虑其他的信息。而NMF不同，它往往能达到表示信息的局部之间相关关系的效果，从而获得更好的处理结果。

语音处理

　　语音的自动识别一直是计算机科学家努力的方向，也是未来智能应用实现的基础技术。语音同样包含大量的数据信息，识别语音的过程也是对这些信息处理的过程。NMF算法在这方面也为我们提供了一种新方法，在已有的应用中，NMF算法成功实现了有效的语音特征提取，并且由于NMF算法的快速性，对实现机器的实时语音识别有着促进意义。也有使用NMF方法进行音乐分析的应用。复调音乐的识别是个很困难的问题，三菱研究所和MIT(麻省理工学院)的科学家合作，利用NMF从演奏中的复调音乐中识别出各个调子，并将它们分别记录下来。实验结果表明，这种采用NMF算法的方法不光简单，而且无须基于知识库。

机器人控制

　　如何快速准确地让机器人识别周围的物体对于机器人研究具有重要的意义，因为这是机器人能迅速作出相应反应和动作的基础。机器人通过传感器获得周围环境的图像信息，这些图像信息也是以矩阵的形式存储的。已经有研究人员采用NMF算法实现了机器人对周围对象的快速识别，根据现有的研究资料显示，识别的准确率达到了80%以上。

生物医学工程和化学工程

　　生物医学和化学研究中，也常常需要借助计算机来分析处理试验的数据，往往一些烦杂的数据会耗费研究人员的过多精力。NMF算法也为这些数据的处理提供了一种新的高效快速的途径。科学家将NMF方法用于处理核医学中的电子发射过程的动态连续图像，有效地从这些动态图像中提取所需要的特征。NMF还可以应用到遗传学和药物发现中。因为NMF的分解不出现负值，因此采用NMF分析基因DNA的分子序列可使分析结果更加可靠。同样，用NMF来选择药物成分还可以获得最有效的且负作用最小的新药物。

　　此外，NMF算法在环境数据处理、信号分析与复杂对象的识别方面都有着很好的应用。近年来采用NMF思想的应用才刚展开，相信以后会有更多的成功应用。这些成功的应用反过来也将促进NMF的进一步研究。

参考

Read List

http://blog.sciencenet.cn/blog-795427-629574.html

1994_Environmeteics_Positive matrix factorization A non-negative factor model with optimal utilization of error estimates of data values.pdf

1997_JM_A fast non-negativity-constrained least squares algorithm.pdf

1999_nature_Learning the parts of objects by non-negative matrix factorization.pdf

2000_IEEE_Normalized Cuts and Image Segmentation.pdf

2001_NIPS_Algorithms for non-negative matrix factorization.pdf

2001_CVPR_Learning Spatially Localized, Parts-B ased Representation.pdf

2001_PSUTR_Spectral Relaxation Models And Structure Analysis For K-Way Graph Clustering And Bi-Clustering .pdf

2001_WI_ICA.pdf

2002_IEEE_Algorithms for Non-Negative Independent Component Analysis .pdf

2002_NIPS_Spectral relaxation for K-means clustering.pdf

2002_NNSP_non-negative sparse coding.pdf

2003_ICASSP_NON-NEGATIVE MATRIX FACTORIZATION FOR VISUAL CODING.pdf

2004_NIPS_When Does Non-Negative Matrix Factorization give a correct Decomposition into parts.pdf

2004_Non-negative Matrix Factorization with Sparseness Constraints.pdf

2004_On reduced rank nonnegative matrix factorization for symmetric nonnegative matrices.pdf

2005-IEEE-Sparse image coding using a 3D non-negative tensor factorization.pdf

2005_ECML_Nonnegative Lagrangian Relaxation of K-Means and Spectral Clustering.pdf

2005_On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering.pdf

2005_ISSTECH_ Projected Gradient Methods for Non-negative Matrix Factorization.pdf

2005_SCIA_Projective Nonnegative Matrix Factorization for Image Compression and Feature Extraction.pdf

2006-IEEE-SOUND SOURCE SEPARATION USING SHIFTED NON-NEGATIVE TENSOR.pdf

2006_ACM_Orthogonal Nonnegative Matrix Tri-Factorizations for Clustering.pdf

2006_ICAISCP_Non-negative matrix factorization with quasi-Newton optimization.pdf

2006_ICDM_The Relationships Among Various Nonnegative Matrix Factorization Methods for Clustering.pdf

2006_IEEE_Convex and Semi-Nonnegative Matrix Factorizations.pdf

2006_IEEE_Nonsmooth nonnegative matrix factorization.pdf

2006_LAA_Nonnegative matrix factorization for spectraldata analysis.pdf

2006_Nonnegative matrix factorization for spectral data analysis.pdf

2007-ISSN-Regularized Alternating Least Squares Algorithms for Non-negative MatrixTensor Factorization.pdf

2007_CSDA_Algorithms and applications for approximate nonnegative matrix factorization.pdf

2007_ICDM_Solving Consensus and Semi-supervised Clustering Problems Using Nonnegative Matrix Factorization.pdf

2007_NC_Projected Gradient Methods for Nonnegative Matrix Factorization.pdf

2007_VLSI_Non-negative Matrix Factorization with Orthogonality Constraints and its Application to Raman Spectroscopy.pdf

2008_ICDM_Non-negative Matrix Factorization on Manifold.pdf

2008_ICDM_Nonnegative Matrix Factorization for Combinatorial Optimization Spectral Clustering Graph Matching and Clique Finding.pdf

2008_IJCNN_Algorithms for orthogonal nonnegative matrix factorization.pdf

2008_PR_SVD based initialization A head start for nonnegative matrix factorization.pdf

2008_SIAM_ Nonnegative Matrix FactorizationBased on Alternating Nonnegativity Constrained Least Squares and Active Set Method.pdf

2008_Non-negative matrix factorization for semi-supervised data clustering.pdf

2008_SIAM_ Fast Newton-type methods for the least squares nonneg-ative matrix approximation problem.pdf

2008_SIAM_Semi-Supervised Clustering via Matrix Factorization.pdf

2010_AAAI_Non-Negative Matrix Factorization with Constraints.pdf

2010_ECML_Improved MinMax Cut Graph Clustering with Nonnegative Relaxation.pdf

2010_Extended Semi-supervised Matrix Factorization for Clustering.pdf

2010_ICDM_Integrating Symmetric Nonnegative Matrix Factorization and Normalized Cut Spectral Clustering.pdf

2010_Semi-Supervised Nonnegative Matrix Factorization.pdf

2011-JCAM-A multilevel approach for nonnegative matrix factorization.pdf

2011-SADM-Non-Negative Residual Matrix Factorization.pdf

2011_AMC_An alternating projected gradient algorithm for nonnegative matrix factorization.pdf

2011_CIKM_Simultaneous Clustering of Multi-Type Relational Data via symmetric nonnegative matrix tri-factorization.pdf

2011_ICDM_Learning Spectral Embedding for Semi-supervised Clustering.pdf

2011_IEEE_Symmetric Nonnegative Matrix Factorization：Algorithms and Applications to Probabilistic.pdf

2012-ICAISC-Initialization of Nonnegative Matrix.pdf

2012-IEEE-Coupled Nonnegative Matrix Factorization Unmixing.pdf

2012-IWSSIP-AUTOMATIC CYMBAL CLASSIFICATION.pdf

2012-JMLR-MahNMF：Manhattan Non-negative Matrix Factorization.pdf

2012-LSJ-Image Denoising based on Sparse Representation and Non-Negative Matrix Factorization.pdf

2012_ICNIP_Adaptive Multiplicative Updates for Projective Nonnegative Matrix Factorization.pdf

2012_ICONIP_Adaptive Multiplicative Updates for Projective Nonnegative Matrix Factorization.pdf

2012-CORR-Two Algorithms for Orthogonal Nonnegative Matrix Factorization with Application to Clustering.pdf

Nonnegative Matrix Factorization for Clustering（概述）.pdf

https://en.wikipedia.org/wiki/Non-negative_matrix_factorization ↩︎
Daniel D. Lee and H. Sebastian Seung (1999). "Learning the parts of objects by non-negative matrix factorization". Nature 401 (6755): 788–791. doi:10.1038/44565. PMID 10548103.↩︎
Daniel D. Lee and H. Sebastian Seung (2001). Algorithms for Non-negative Matrix Factorization. Advances in Neural Information Processing Systems 13: Proceedings of the 2000 Conference. MIT Press. pp. 556–562.↩︎
汪鹏.非负矩阵分解：数学的奇妙力量.《计算机教育》2004年第10期.http://blog.sciencenet.cn/blog-248606-466811.html↩︎
Matlab文档nnmf.http://cn.mathworks.com/help/stats/nnmf.html↩︎
Chih-Jen Lin（libSVM的作者）.NMF工具.http://www.csie.ntu.edu.tw/~cjlin/nmf/↩︎

Non-negative Matrix Factorization 非负矩阵分解

介绍

定义

起源

优点

缺点

应用领域

相关

整体比较

具体比较

解法

Lee和Seung的倍增更新法则

实现

Matlab

Python

应用

图像分析

文本聚类/数据挖掘

语音处理

机器人控制

生物医学工程和化学工程

参考

Read List

Jiancheng Li

Non-negative Matrix Factorization 非负矩阵分解

介绍

定义

起源

优点

缺点

应用领域

相关

整体比较

具体比较

解法

Lee和Seung的倍增更新法则

实现

Matlab

Python

应用

图像分析

文本聚类/数据挖掘

语音处理

机器人控制

生物医学工程和化学工程

参考

Read List

Jiancheng Li

ICPC2013南京邀请赛 E. Eloh 解题报告

Jekyll

教务系统课程评价自动填写

Android Handler的处理方式

Android客户端使用socket通信

Java服务器使用socket通信

C#进制间的转换

Code Hunt SECTOR 02 LOOPS

Python使用socket通信

Android客户端使用HTTP通信