Diversity has been shown to be key to collective intelligence in natural systems. Despite this, current Multi-Agent Reinforcement Learning (MARL) approaches either enforce behavioral homogeneity to boost efficiency or blindly promote behavioral diversity via intrinsic rewards or auxiliary loss functions, effectively changing the learning objective and lacking a principled measure of diversity. In this context, the present work addresses the question of how to control the diversity of a multi-agent system. We introduce Diversity Control (DiCo), a method that constrains behavioral diversity to an exact value of a given metric by representing policies as the sum of a parameter-shared component and dynamically scaled per-agent components. Because the constraints are applied directly to the policy architecture, DiCo leaves the learning objective unchanged and is applicable to any actor-critic MARL algorithm. We theoretically prove that DiCo achieves the desired diversity, and we present experiments in both cooperative and competitive tasks showing that DiCo offers a novel paradigm for increasing performance and sample efficiency in MARL, as well as for fostering the emergence of novel diverse policies. Multimedia results are available on the "project’s website":https://sites.google.com/view/dico-marl.
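The policy decomposition described above can be sketched in a minimal form. The snippet below is an illustrative assumption, not the paper's implementation: it uses simple deterministic linear policies and a toy diversity metric (mean L2 norm of the per-agent deviations from the shared component), and all names (`W_shared`, `W_agents`, `dico_actions`) are hypothetical. It shows the core idea of dynamically rescaling the per-agent components so the measured diversity equals the requested value.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, act_dim, n_agents = 4, 2, 3

# Parameter-shared (homogeneous) linear policy and per-agent deviation policies.
# These stand in for neural networks in a real actor-critic setup.
W_shared = rng.normal(size=(act_dim, obs_dim))
W_agents = [rng.normal(size=(act_dim, obs_dim)) for _ in range(n_agents)]

def dico_actions(obs, target_diversity):
    """Each agent's action is the shared component plus a dynamically
    scaled per-agent component; the scale is chosen so that a simple
    diversity metric (mean L2 norm of the deviations) hits the target."""
    shared = W_shared @ obs
    deviations = [W @ obs for W in W_agents]
    current = np.mean([np.linalg.norm(d) for d in deviations])
    scale = target_diversity / current if current > 0 else 0.0
    return [shared + scale * d for d in deviations]

obs = rng.normal(size=obs_dim)
acts = dico_actions(obs, target_diversity=0.5)
measured = np.mean([np.linalg.norm(a - W_shared @ obs) for a in acts])
```

Because the constraint is enforced architecturally (by rescaling the deviation outputs) rather than through an extra loss term, `measured` equals the requested diversity by construction, mirroring the property proved in the paper for its actual metric.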