Asymptotically Efficient Distributed Experimentation
Abstract: Sequential decision making by a large set of myopic agents has gained significant attention over the past decade. In such settings, even a small amount of experimentation by a few agents would benefit all the others, but obtaining such experimentation can be challenging for a central planner. The academic literature has focused on mechanisms that promote experimentation through monetary incentives and on persuasion through careful information disclosure. In this paper, we study a simple control that the central planner can use to coordinate experimentation. We consider a set of myopic agents that observe their own histories but not the histories of other agents. In a continuous-time stochastic multi-armed bandit model, the agents pick arms myopically and receive instantaneous rewards. Meanwhile, the central planner can observe the histories of all agents. We consider a class of policies in which the central planner is allowed to irrevocably remove arms. We show that an appropriately chosen policy within this class can generate the needed experimentation and match the regret bounds for the centralized problem, thus mitigating the cost of decentralization. We also quantify the minimum number of agents needed for such a policy to be asymptotically optimal, and the impact of the number of agents on the speed of learning.
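To make the setting concrete, here is a minimal toy simulation. It is only a sketch under stated assumptions and is not the policy analyzed in the talk: time is discrete rather than continuous, rewards are Bernoulli, each agent treats an untried arm as having a guessed mean of 0.5, and the planner uses a standard successive-elimination rule on pooled confidence intervals as a stand-in for "an appropriately chosen policy". The function name `simulate` and all of its parameters are hypothetical.

```python
import math
import random


def simulate(means, n_agents=20, horizon=500, seed=0):
    """Toy discrete-time model: myopic agents plus a planner that removes arms.

    Assumptions (not from the talk): Bernoulli rewards, a neutral guess of 0.5
    for arms an agent has not tried, and successive elimination as the
    planner's removal rule.
    """
    rng = random.Random(seed)
    k = len(means)
    best = max(means)
    active = set(range(k))  # arms the planner has not removed yet
    # Private statistics: each agent only sees its own pulls and rewards.
    pulls = [[0] * k for _ in range(n_agents)]
    wins = [[0.0] * k for _ in range(n_agents)]
    regret = 0.0

    for t in range(1, horizon + 1):
        for agent in range(n_agents):
            def own_estimate(arm):
                n = pulls[agent][arm]
                return wins[agent][arm] / n if n else 0.5

            # Myopic choice: best own empirical mean among available arms.
            arm = max(sorted(active), key=own_estimate)
            reward = 1.0 if rng.random() < means[arm] else 0.0
            pulls[agent][arm] += 1
            wins[agent][arm] += reward
            regret += best - means[arm]

        # Planner pools every agent's history and builds confidence intervals.
        log_term = math.log(1 + t * n_agents)
        lower, upper = {}, {}
        for arm in active:
            n = sum(p[arm] for p in pulls)
            if n == 0:
                lower[arm], upper[arm] = -math.inf, math.inf
                continue
            mean = sum(w[arm] for w in wins) / n
            radius = math.sqrt(2.0 * log_term / n)
            lower[arm], upper[arm] = mean - radius, mean + radius

        # Irrevocable removal: drop arms whose upper bound is below the
        # best lower bound. The arm attaining that bound always survives.
        best_lower = max(lower.values())
        active = {arm for arm in active if upper[arm] >= best_lower}

    return active, regret


if __name__ == "__main__":
    active, regret = simulate([0.3, 0.5, 0.8], n_agents=5, horizon=200, seed=1)
    print("arms still available:", sorted(active))
    print("cumulative regret:", round(regret, 2))
```

Varying `n_agents` in this sketch gives an informal feel for how the number of agents affects the pooled information available to the planner; it does not reproduce the regret bounds or the minimum-agent threshold stated in the abstract.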
Biography: Ankur Mani is a Levenick Scholar at the Institute for Sustainability, Energy, and Environment and an affiliate faculty member in the Industrial and Systems Engineering department at the University of Minnesota, Twin Cities. He received his Ph.D. in Media Arts and Sciences from the Massachusetts Institute of Technology and spent a year at the New York University Stern School of Business and Microsoft Research. His research takes an interdisciplinary approach to the efficient design of infrastructure networks and collective decision making, with applications in sustainable production and consumption, and is rooted in the social and economic sciences, operations research, and computer science. His research has appeared in several prominent venues (Management Science, Production and Operations Management, Nature Human Behaviour, ACM Economics and Computation, Association for the Advancement of Artificial Intelligence, Proceedings of the IEEE, IEEE Transactions on Signal Processing, and others) and has received accolades within these disciplines (INFORMS, POMS, Aviation Applications Society, Net Institute).