Keywords:
Computer engineering
Computer science
Artificial intelligence
Abstract:
Deep neural networks (DNNs) have proved effective and have dramatically improved performance in a wide range of computer vision tasks. End-to-end training consistently demonstrates strong modeling ability and consequently reduces the effort devoted to expert feature engineering. On the other hand, it raises the question of how to improve a black-box network through better representation (feature) learning, especially when the learned representations and classifiers are tied together under supervised learning. In this work, representation learning is studied from four perspectives in different fields: diversity in ensemble learning, aspect ratio in image aesthetics assessment, invariance in identification tasks, and composition in color attribute recognition. By analyzing the bottlenecks of black-box networks and designing better representation learning for the target tasks, we present the following: (a) Ensemble learning relies on the diversity of complementary neural networks, in both feature representations and classifier representations. A diverse representation learning method, namely learning-difficulty-aware embedding, is proposed to adaptively reconcile the learning attention paid to different categories by sequentially training a series of networks with diversified representations; (b) Widely adopted data augmentation in image recognition distorts aspect ratios, which are an important factor in image aesthetics assessment. An aspect-ratio representation learning method, namely adaptive fractional dilated convolution, is proposed to explicitly preserve the learned representation related to aspect ratios by adjusting the receptive fields adaptively and natively; (c) Identification tasks, e.g. person re-identification, aim at learning representations that are robust to interfering variances, e.g. lighting, view, and pose variances. An invariance representation learning method, namely anchor loss, is prop