Keywords:
Computer engineering
Computer science
Artificial intelligence
Deep learning
Deep neural network
Processing-step efficiency
Robust model efficiency
Versatile model efficiency
Abstract:
Breakthroughs in deep learning (DL) have greatly advanced machine learning across numerous academic disciplines and industries in recent years. A concern frequently raised by multidisciplinary researchers, software developers, and machine learning end users is the inefficiency of DL methods: intolerable training and inference time, exhausted computing resources, and unsustainable power consumption. To tackle these inefficiency issues, many DL efficiency methods have been proposed to improve efficiency without sacrificing the prediction accuracy of a specified application, such as image classification or visual object detection. However, we argue that traditional DL efficiency methods are not sufficiently flexible or adaptive to meet the requirements of practical usage scenarios, based on two observations. First, most traditional methods adopt the objective of "no accuracy loss for a specified application", but this objective fails to cover a considerable number of scenarios. For example, to meet diverse user needs, a public cloud platform should provide an efficient and multipurpose DL method rather than focusing on a single application. Second, most traditional methods adopt model compression and quantization as efficiency enhancement strategies, but these two strategies degrade severely in a number of scenarios. For example, for embedded deep neural networks (DNNs), significant architecture changes and quantization may severely undermine customized hardware accelerators designed for predefined DNN operators and precision. In this dissertation, we investigate three popular usage scenarios and correspondingly propose our DL efficiency methods: versatile model efficiency, robust model efficiency, and processing-step efficiency. The first scenario requires a DL method to achieve both model efficiency and versatility. Model efficiency is to design a compact deep neural network, while versatility is to get satisfactory predict