Date of Award

2016

Document Type

Open Access Dissertation

Degree Name

Doctor of Philosophy in Computer Engineering (PhD)

Administrative Home Department

Department of Electrical and Computer Engineering

Advisor 1

Laura E. Brown

Advisor 2

Timothy C. Havens

Committee Member 1

Nilufer Onder

Committee Member 2

Jinshan Tang

Abstract

Representing nonuniform, multi-modal, time-limited time series data is complex. This work explores the problem through discrete representation, dimensionality reduction with segmentation-based techniques, and behavioral representation approaches. The explorations focus on an outpatient oncology setting, where classification and regression analyses are used for length-of-survival prognosis. The decisions about representation and analysis are not independent: how the data is represented has implications for which analysis technique is used. One unique aspect of the work is the use of outpatient clinical data, which was explored initially through discrete sampling and behavioral representation. The length of survival was initially evaluated with both classification and regression methods. The first conclusion was that including more discrete samples in the model showed no statistical benefit, whereas adding behavioral approaches did improve prognostic accuracy.

Following this result, Piecewise Aggregate Approximation was adapted to accommodate the multi-modal time series of the outpatient clinical data and was evaluated with the regression methodologies. This representation showed promise because of its simplicity, but its length-of-survival prognosis performance was lower than that of the behavioral representation and discrete sampling approaches. To address this, a new representation approach was developed that incorporates a genetic algorithm to select the window boundaries of the Piecewise Aggregate Approximation method. The selection is based on the fraction of Piecewise Aggregate Approximation windows that contain values other than zero. In some cases, the new representation improved performance, reducing median relative error by 20%.
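
As a rough illustration of the representation described above, the following minimal sketch computes a Piecewise Aggregate Approximation over arbitrary window boundaries and the fraction of windows that contain a nonzero value. The function names (`paa`, `equal_width_boundaries`, `nonzero_window_fraction`), the use of zero to mark the absence of a measurement, and the sample series are assumptions made for illustration; they are not taken from the dissertation. The genetic algorithm itself is not shown, and the abstract does not state exactly how the search uses this fraction.

```python
from typing import List, Sequence


def equal_width_boundaries(n: int, windows: int) -> List[int]:
    """Boundaries for standard PAA: `windows` roughly equal-width segments."""
    if not 0 < windows <= n:
        raise ValueError("need 0 < windows <= n")
    return [round(i * n / windows) for i in range(windows + 1)]


def _check(series: Sequence[float], boundaries: Sequence[int]) -> None:
    # Boundaries must be strictly increasing, start at 0, end at len(series).
    ok = (
        len(boundaries) >= 2
        and boundaries[0] == 0
        and boundaries[-1] == len(series)
        and all(a < b for a, b in zip(boundaries, boundaries[1:]))
    )
    if not ok:
        raise ValueError("invalid window boundaries")


def paa(series: Sequence[float], boundaries: Sequence[int]) -> List[float]:
    """Replace each window [a, b) of the series with its mean value."""
    _check(series, boundaries)
    return [sum(series[a:b]) / (b - a) for a, b in zip(boundaries, boundaries[1:])]


def nonzero_window_fraction(series: Sequence[float], boundaries: Sequence[int]) -> float:
    """Fraction of windows holding at least one value other than zero."""
    _check(series, boundaries)
    windows = list(zip(boundaries, boundaries[1:]))
    hits = sum(1 for a, b in windows if any(v != 0 for v in series[a:b]))
    return hits / len(windows)


# Hypothetical sparse series: zeros stand for time steps with no measurement.
series = [0, 0, 0, 0, 0, 0, 2, 4, 0, 0, 6, 0]

fixed = equal_width_boundaries(len(series), 4)  # [0, 3, 6, 9, 12]
candidate = [0, 6, 7, 8, 12]                    # one candidate a search might propose

print(paa(series, fixed), nonzero_window_fraction(series, fixed))
print(paa(series, candidate), nonzero_window_fraction(series, candidate))
```

With equal-width windows, half of the windows in this hypothetical series are empty. The alternative boundaries place more windows over the observed values, which is the kind of difference a boundary search could score.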