关键词:
Operations research
Artificial intelligence
Computer science
摘要:
For many real-world automated planning problems, it is difficult to obtain a transition model that governs state evolution with complex dynamics. Fortunately, the availability of sensors allows the collection of large quantities of data which can be input to efficient machine learning algorithms to accurately approximate the unknown transition model as a neural network. The central thesis of this dissertation is that automated planning problems with unknown transition models can be efficiently learned from data using neural networks, and the resulting learned planning problems can be efficiently compiled and solved as mathematical optimization problems. In this dissertation, we formally define the deterministic factored planning problem and learn its unknown state transition function with two distinct neural network structures, namely: (i) densely-connected neural networks with Rectified Linear Units (ReLUs), and (ii) Binarized Neural Networks (BNNs). For these learned neural network structures, we provide efficient compilations based on Mixed-Integer Linear Programming (MILP), and Binary Linear Programming (BLP) and Weighted Partial Maximum Satisfiability (WP-MaxSAT), respectively. For the MILP model, we investigate different preprocessing methodologies based on reachability, network sparsification and potential functions to speed up the experimental runtime performance of the underlying MILP solver. In order to find potential functions that support efficient planning with learned neural networks, we formalize the problem of finding potential functions for learned neural networks as a Bilevel Programming model, and solve it efficiently using a finite-time constraint generation algorithm. For the WP-MaxSAT model, we introduce a novel neuron activation compilation that is asymptotically the most compact encoding in the literature that maintains generalized arc-consistency property through unit propagation -- an important property that facilitates efficiency in WP-Max