Date of Award


Document Type

Campus Access Master's Thesis

Degree Name

Master of Science in Mechanical Engineering (MS)

Administrative Home Department

Department of Mechanical Engineering-Engineering Mechanics

Advisor 1

Zequn Wang

Committee Member 1

Bo Chen

Committee Member 2

Ye Sun


Model-free reinforcement learning methods such as Proximal Policy Optimization or Q-learning typically require thousands of interactions with the environment to approximate the optimal controller, which is not always feasible in robotics because of safety and time constraints. Model-based methods such as PILCO or Black-DROPS, while data-efficient, provide solutions with limited robustness and complexity. To address this tradeoff, we introduce two distinct methods for creating virtual environments, which are built from a limited number of trials conducted in the original environment. We provide an efficient method for uncertainty management, in which the uncertainty serves as a metric for self-improvement: adaptive sampling identifies the points with maximum expected improvement. Capturing the uncertainty also allows better modeling of the reward responses of the original system. Reliance on an absolute or deterministic reward as the optimization metric renders reinforcement learning highly susceptible to changes in problem dynamics. We introduce a novel framework that effectively quantifies the aleatory variability in the design space and induces robustness in controllers by switching to a reliability-based optimization routine. Our approaches enable the use of complex policy structures and reward functions through a unique combination of model-based and model-free methods, while retaining data efficiency. We demonstrate the validity of our methods on several reinforcement learning problems in OpenAI Gym. We prove that our approach offers better modeling capacity for complex system dynamics than established methods.