The time until the final zero crossing of random sums with application to nonparametric bandit theory
Motivated by problems in machine learning and, more fundamentally, by non-Bayesian, nonparametric problems in the sequential design of experiments, this work contributes to the task of obtaining probability bounds for the number of times suboptimal bandits are chosen in a nonterminating sequence of experiments. To our knowledge, only the growth of the expected number of incorrect choices has been examined previously. The derivation is founded, in part, on new contributions to the theory of zero crossings for sums of biased independent, identically distributed (i.i.d.) random variables. © 1994.
Applied Mathematics and Computation
Retrieved from: https://digitalcommons.mtu.edu/michigantech-p/5575