Date of Award
2020
Document Type
Campus Access Master's Thesis
Degree Name
Master of Science in Mechanical Engineering (MS)
Administrative Home Department
Department of Mechanical Engineering-Engineering Mechanics
Advisor 1
Zequn Wang
Committee Member 1
Bo Chen
Committee Member 2
Ye Sun
Abstract
Model-free reinforcement learning methods such as Proximal Policy Optimization or Q-learning typically require thousands of interactions with the environment to approximate the optimal controller, which is not always feasible in robotics due to safety concerns and time consumption. Model-based methods such as PILCO or Black-DROPS, while data-efficient, provide solutions with limited robustness and complexity. To address this tradeoff, we introduce two distinct methods for creating virtual environments, which are built from a limited number of trials conducted in the original environment. We provide an efficient method for uncertainty management, which serves as a metric for self-improvement by identifying the points with maximum expected improvement through adaptive sampling. Capturing the uncertainty also allows better modeling of the reward responses of the original system. Reliance on an absolute or deterministic reward as the metric for the optimization process renders reinforcement learning highly susceptible to changes in problem dynamics. We introduce a novel framework that quantifies the aleatory variability in the design space and induces robustness in controllers by switching to a reliability-based optimization routine. Our approaches enable the use of complex policy structures and reward functions through a unique combination of model-based and model-free methods, while retaining data efficiency. We demonstrate the validity of our methods on several reinforcement learning problems in OpenAI Gym and show that our approach offers better modeling capacity for complex system dynamics than established methods.
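The abstract's mention of adaptive sampling driven by maximum expected improvement corresponds to a standard acquisition rule from surrogate-based optimization. The sketch below is not taken from the thesis; it is a minimal illustration of an expected-improvement acquisition over a Gaussian-process surrogate of a reward landscape, assuming scikit-learn and SciPy. The function name expected_improvement, the toy reward, and all parameter values are illustrative assumptions, not the author's implementation.

# Hedged sketch: expected-improvement (EI) acquisition over a Gaussian-process
# surrogate of the reward landscape. Names and values are illustrative only.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(candidates, gp, best_reward, xi=0.01):
    # Score candidates by how much reward gain the surrogate expects
    # beyond the best reward observed so far.
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)      # guard against zero predictive std
    imp = mu - best_reward - xi          # expected gain over the incumbent
    z = imp / sigma
    return imp * norm.cdf(z) + sigma * norm.pdf(z)

# Toy usage: fit the surrogate on a handful of real-environment trials,
# then pick the next trial point where EI is largest (adaptive sampling).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(8, 2))      # eight trial parameter vectors
y = -np.sum(X**2, axis=1)                # stand-in reward signal
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
candidates = rng.uniform(-1, 1, size=(256, 2))
next_trial = candidates[np.argmax(expected_improvement(candidates, gp, y.max()))]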
Recommended Citation
Patwardhan, Narendra, "Proximal Reliability Optimization for Reinforcement Learning", Campus Access Master's Thesis, Michigan Technological University, 2020.