Date of Award

2013

Document Type

Master's report

Degree Name

Master of Science in Computer Science (MS)

College, School or Department Name

Department of Computer Science

Advisor

Zhenlin Wang

Abstract

This project examines the current available work on the explicit and implicit parallelization of the R scripting language and reports on experimental findings for the development of a model for predicting effective points for automatic parallelization to be performed, based upon input data sizes and function complexity. After finding or creating a series of custom benchmarks, an interval based on data size and time complexity where replacement becomes a viable option was found; specifically between O(N) and O(N3) exclusive. As data size increases, the benefits of parallel processing become more apparent and a point is reached where those benefits outweigh the costs in memory transfer time. Based on our observations, this point can be predicted with a fair amount of accuracy using regression on a sample of approximately ten data sizes spread evenly between a system determined minimum and maximum size.

Share

COinS