Image Annotation Result
Experimental results for automatic image annotation, as reported in the cited papers.
Result
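Datasets: Corel-5K, ESP Game, IAPRTC-12 (four columns each, in that order)
Method | Year | Corel-5K P | Corel-5K R | Corel-5K F1 | Corel-5K N+ | ESP Game P | ESP Game R | ESP Game F1 | ESP Game N+ | IAPRTC-12 P | IAPRTC-12 R | IAPRTC-12 F1 | IAPRTC-12 N+ |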
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
MBRM | 2004 | 24 | 25 | 24.5 | 122 | 18 | 19 | 18.5 | 209 | 24 | 23 | 23.5 | 223 |
JEC | 2008 | 27 | 32 | 29.3 | 139 | 22 | 25 | 23.4 | 224 | 28 | 29 | 28.5 | 250 |
Group Sparsity | 2010 | 30 | 33 | 31.4 | 146 | - | - | - | - | 32 | 29 | 30.4 | 252 |
CNN-R | 2015 | 32 | 41.3 | 36.1 | 166 | 44.5 | 28.5 | 34.7 | 248 | 49 | 31 | 38.0 | 272 |
FastTag | 2013 | 32 | 43 | 36.7 | 166 | 46 | 22 | 29.8 | 247 | 47 | 26 | 33.5 | 280 |
TagProp(σML) | 2009 | 33 | 42 | 37.0 | 160 | 39 | 27 | 31.9 | 239 | 46 | 35 | 39.8 | 266 |
2PKNN | 2012 | 39 | 40 | 39.5 | 177 | 51 | 23 | 31.7 | 245 | 49 | 32 | 38.7 | 274 |
GLKNN | 2015 | 36 | 47 | 40.8 | 184 | 41 | 36 | 38.3 | 282 | 34 | 31 | 32.4 | 255 |
SVM-DMBRM | 2014 | 36 | 48 | 41.1 | 197 | 55 | 25 | 34.4 | 259 | 56 | 29 | 38.2 | 283 |
SKL-CRM | 2014 | 39 | 46 | 42.2 | 184 | 41 | 26 | 31.8 | 248 | 47 | 32 | 38.1 | 274 |
KCCA-2PKNN | 2014 | 42 | 46 | 43.9 | 179 | - | - | - | - | 59 | 30 | 39.8 | 259 |
KCCA | 2015 | 39 | 53 | 44.9 | 184 | 30 | 36 | 32.7 | 252 | 38 | 39 | 38.5 | 273 |
2PKNN+ML | 2012 | 44 | 46 | 45.0 | 191 | 53 | 27 | 35.8 | 252 | 54 | 37 | 43.9 | 278 |
NMF-KNN | 2014 | 38 | 56 | 45.3 | 150 | 33 | 26 | 29.1 | 238 | - | - | - | - |
CCA-KNN | 2015 | 42 | 52 | 46.5 | 201 | 46 | 36 | 40.4 | 260 | 45 | 38 | 41.2 | 278 |
context-RM-B | 2015 | - | - | - | - | 61 | 24 | 34.4 | 242 | 61 | 20 | 30.1 | 234 |
SLED | 2015 | 35 | 51 | 41.5 | - | - | - | - | - | 49.82 | 47.36 | 48.6 | - |
NSIDML | 2016 | 44.12 | 51.76 | 47.76 | 194 | 49.8 | 29.5 | 37.05 | 253 | 56.9 | 36.5 | 45.21 | 282 |
AWD-IKNN | 2016 | 42 | 55 | 47.7 | 198 | 48 | 34 | 40.2 | 257 | 50 | 40 | 44.5 | 282 |
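The table does not define its columns. In this literature, P and R usually denote per-label precision and recall averaged over all labels (in percent), F1 is their harmonic mean, and N+ is the number of labels with non-zero recall. The MATLAB sketch below computes the four values under that assumption. The matrices `gt` and `pred` are made-up toy data, and treating a label that is never predicted as having precision 0 is one convention among several; individual papers may differ.

```matlab
% Hypothetical toy data: rows are test images, columns are labels.
gt   = logical([1 0 1; 0 1 0; 1 1 0; 0 0 0]);   % ground-truth annotations
pred = logical([1 0 0; 0 1 0; 1 0 0; 0 1 0]);   % predicted annotations

tp        = sum(gt & pred, 1);   % correct predictions per label
predicted = sum(pred, 1);        % how often each label was predicted
relevant  = sum(gt, 1);          % how often each label occurs in the ground truth

% Per-label precision and recall; max(..., 1) turns 0/0 into 0 (assumed convention).
prec = tp ./ max(predicted, 1);
rec  = tp ./ max(relevant, 1);

P     = 100 * mean(prec);        % mean per-label precision, in percent
R     = 100 * mean(rec);         % mean per-label recall, in percent
F1    = 2 * P * R / (P + R);     % harmonic mean of P and R
Nplus = sum(rec > 0);            % number of labels recalled at least once

fprintf('P=%.1f R=%.1f F1=%.1f N+=%d\n', P, R, F1, Nplus);
```

The F1 column of the table is consistent with this formula applied to the averaged P and R; for MBRM on Corel-5K, 2 * 24 * 25 / (24 + 25) ≈ 24.5.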
Dataset
Corel 5K
Paper: P. Duygulu, K. Barnard, J. F. de Freitas, and D. A. Forsyth. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In ECCV, 2002.
ESP Game
Paper: L. Von Ahn and L. Dabbish. Labeling images with a computer game. In SIGCHI Conference on Human Factors in Computing Systems, 2004.
http://hunch.net/~learning/ESP-ImageSet.tar.gz
IAPRTC-12
http://www.imageclef.org/photodata
Large Scale Dataset
- NUS-WIDE
Paper: Chua T S, Tang J, Hong R, et al. NUS-WIDE: a real-world web image database from National University of Singapore[C]//Proceedings of the ACM international conference on image and video retrieval. ACM, 2009: 48.
http://lms.comp.nus.edu.sg/research/NUS-WIDE.htm
Dataset | Corel 5K | ESP Game | IAPR TC-12 | NUS-WIDE |
---|---|---|---|---|
No. of images | 5000 | 20770 | 19627 | 269648 (209347 annotated) |
No. of labels | 260 | 268 | 291 | 81 |
Train images | 4500 | 18689 | 17665 | 110K (not fixed) |
Test images | 500 | 2081 | 1962 | 4K (not fixed) |
Labels per image | 3.4, 4, 5 | 4.7, 5, 15 | 5.7, 5, 23 | 2.4, 2 |
Images per label | 58.6, 22, 1004 | 326.7, 172, 4553 | 347.7, 153, 4999 | 5701.3, 1682 |
No. of labels < mean-freq | 195 (75.0%) | 201 (75.0%) | 217 (74.6%) | - |
(Entry format for labels per image and images per label: mean, median, maximum; the NUS-WIDE column lists mean and median only.)
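The statistics in the table above can be recomputed from a binary annotation matrix with one row per image and one column per label. The MATLAB sketch below uses a made-up toy matrix `annot`, and it assumes that "mean-freq" means the mean number of images per label; the source does not spell this out.

```matlab
% Hypothetical toy annotation matrix: 5 images (rows) x 4 labels (columns).
annot = logical([1 1 0 0;
                 1 0 0 0;
                 1 1 1 0;
                 1 0 0 1;
                 1 0 0 0]);

labels_per_image = sum(annot, 2);   % row sums: labels attached to each image
images_per_label = sum(annot, 1);   % column sums: images carrying each label

% Same entry format as the table: mean, median, maximum.
lpi = [mean(labels_per_image), median(labels_per_image), max(labels_per_image)];
ipl = [mean(images_per_label), median(images_per_label), max(images_per_label)];

% Labels that occur less often than the mean label frequency (assumed meaning).
n_rare   = sum(images_per_label < mean(images_per_label));
pct_rare = 100 * n_rare / size(annot, 2);

fprintf('labels per image: %.1f, %g, %d\n', lpi(1), lpi(2), lpi(3));
fprintf('images per label: %.1f, %g, %d\n', ipl(1), ipl(2), ipl(3));
fprintf('labels < mean-freq: %d (%.1f%%)\n', n_rare, pct_rare);
```

With real data, `annot` would be the matrix loaded from the train and test annotation files of one dataset, stacked row-wise.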
Features and annotations from INRIA
http://lear.inrialpes.fr/people/guillaumin/data.php
gen_annotation.m
input: files provided on the website above
output: train_annot.txt and test_annot.txt in each dataset folder
datasets = { 'corel5k', 'iaprtc12', 'espgame' };
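sets = { 'test' , 'train' };
for db=1:length(datasets),
    ds = datasets{db};
    for s=1:length(sets),
        str = sets{s};
        % Image list and binary annotation matrix (one row per image).
        % vec_read is not a MATLAB built-in; it is assumed to be on the path.
        list = textread([ds '/' ds '_' str '_list.txt'],'%s');
        annot = logical(vec_read([ds '/' ds '_' str '_annot.hvecs']));
        % Write to <dataset>/<split>_annot.txt, the output file named above.
        fid = fopen([ds '/' str '_annot.txt'], 'w');
        for i=1:length(list),
            % One line per image: tab-separated 0/1 flags, one per label.
            annotation = annot(i,:);
            fprintf(fid, '%d', annotation(1));
            for j=2:length(annotation),
                fprintf(fid, '\t%d', annotation(j));
            end
            fprintf(fid, '\n');
        end
        fclose(fid);
    end
end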
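To check that a generated annotation file can be read back unchanged, a round trip on a small matrix is enough. The sketch below is only an illustration: `annot` is made-up toy data, the file goes to a temporary path rather than a dataset folder, and it assumes the layout the script above is meant to produce (one image per line, tab-separated 0/1 flags).

```matlab
% Toy annotation matrix standing in for one dataset split (hypothetical data).
annot = logical([1 0 1; 0 1 0]);

% Write it with one image per line and tab-separated 0/1 flags.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
for i = 1:size(annot, 1)
    fprintf(fid, '%d', annot(i, 1));
    fprintf(fid, '\t%d', annot(i, 2:end));   % format is reused for each remaining flag
    fprintf(fid, '\n');
end
fclose(fid);

% Read the file back as a numeric matrix and compare with the original.
loaded = logical(dlmread(fname, '\t'));
if isequal(loaded, annot)
    disp('round trip OK');
end
delete(fname);
```

The same `dlmread` call can load a real `train_annot.txt` or `test_annot.txt`; the number of rows should then match the length of the corresponding image list.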
Reference
Method | Year | Conference | Reference Paper |
---|---|---|---|
MBRM | 2004 | CVPR | Multiple bernoulli relevance models for image and video annotation |
JEC | 2008 | ECCV | A new baseline for image annotation |
TagProp(σML) | 2009 | ICCV | Tagprop: Discriminative metric learning in nearest neighbor models for image auto-annotation |
Group Sparsity | 2010 | CVPR | Automatic image annotation using group sparsity |
2PKNN | 2012 | ECCV | Image annotation using metric learning in semantic neighbourhoods |
2PKNN+ML | 2012 | ECCV | Image annotation using metric learning in semantic neighbourhoods |
FastTag | 2013 | ICML | Fast image tagging |
NMF-KNN | 2014 | CVPR | NMF-KNN: image annotation using weighted multi-view non-negative matrix factorization |
SVM-DMBRM | 2014 | ICMR | A Hybrid Model for Automatic Image Annotation |
KCCA-2PKNN | 2014 | ICMR | A cross-media model for automatic image annotation |
SKL-CRM | 2014 | MIR | A sparse kernel relevance model for automatic image annotation |
context-RM-B | 2015 | CVPR | Feature-Independent Context Estimation for Automatic Image Annotation |
CNN-R | 2015 | ICMR | Automatic Image Annotation using Deep Learning Representations |
KCCA | 2015 | ICMR | Automatic Image Annotation using Deep Learning Representations |
CCA-KNN | 2015 | ICMR | Automatic Image Annotation using Deep Learning Representations |
GLKNN | 2015 | ICMR | Graph Learning on K Nearest Neighbours for Automatic Image Annotation |
SLED | 2015 | J. TIP | SLED: Semantic Label Embedding Dictionary Representation for Multilabel Image Annotation |
NSIDML | 2016 | J. VCIR | Image distance metric learning based on neighborhood sets for automatic image annotation |
AWD-IKNN | 2016 | PCM | Automatic Image Annotation using Adaptive Weighted Distance in Improved K Nearest Neighbors Framework |
Large Scale Dataset
Year | Conference | Reference Paper |
---|---|---|
2014 | ICLR | Deep Convolutional Ranking for Multilabel Image Annotation |
2015 | ICCV | Love Thy Neighbors: Image Annotation by Exploiting Image Metadata |
2015 | ICMR | Large Scale Image Annotation via Deep Representation Learning and Tag Embedding Learning |
Some reference links
Method | Link |
---|---|
TagProp(σML) | http://lear.inrialpes.fr/people/guillaumin/code.php#tagprop |
Group Sparsity | http://ranger.uta.edu/~huang/codes/annotation_corel.zip |
2PKNN(+ML) | http://cvit.iiit.ac.in/projects/imageAnnotation/ |
FastTag | http://www.cse.wustl.edu/~mchen/ |
NMF-KNN | http://crcv.ucf.edu/people/phd_students/mahdi/ |
SKL-CRM | https://github.com/sjmoran/sklcrm |