
Combining Dense and Sparse Rewards to Improve Deep Reinforcement Learning Policies in Reach-Avoid Games with Faster Evaders in Two vs. One Scenarios
Jefferson Silveira, Kalena McCloskey, Camille-Alain Rabbath, Craig Williams, Sidney Givigi
Abstract
This paper investigates a variation of the reach-avoid game, a multi-agent pursuit and evasion scenario applicable to aerial defense, with faster evaders. Using Deep Reinforcement Learning (DRL) techniques, the study proposes a different reward function that combines dense (distance-based) and sparse (outcome-based) rewards. Focused on the defender’s perspective in aerial defense, this new reward function resulted in effective learned policies against faster evaders, outperforming traditional differential game and DRL strategies with dense-only rewards. Moreover, the learned policy remained effective across different instances of the problem, including changes in pursuer speeds and winning radii, illustrating its versatility in situations not seen during training.
Published in the 2024 10th International Conference on Control, Decision and Information Technologies (CoDIT)
Published by IEEE
DOI: 10.1109/CoDIT62066.2024.10708505
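As a rough illustration of the idea summarized in the abstract above, the sketch below combines a dense, distance-based term with a sparse, outcome-based term for a single defender. It is not the paper's reward function: the name `combined_reward`, the weights `dense_weight` and `sparse_bonus`, the radii, and the 2D point representation are all hypothetical choices made only for this example.

```python
import math


def combined_reward(pursuer, evader, target,
                    win_radius=0.5, target_radius=0.5,
                    dense_weight=0.01, sparse_bonus=1.0):
    """Return (reward, done) for one defender at one time step.

    All parameters and default values are illustrative assumptions,
    not values taken from the paper.
    """
    dist_pe = math.dist(pursuer, evader)  # pursuer-to-evader distance
    dist_et = math.dist(evader, target)   # evader-to-target distance

    # Dense (distance-based) term: a small penalty that grows with the
    # pursuer-evader distance, giving a learning signal at every step.
    reward = -dense_weight * dist_pe

    # Sparse (outcome-based) term: only added when the episode ends.
    if dist_pe <= win_radius:
        return reward + sparse_bonus, True   # evader captured
    if dist_et <= target_radius:
        return reward - sparse_bonus, True   # evader reached the target
    return reward, False                     # episode continues


# Example: pursuer at the origin, evader 5 units away, target far off.
r, done = combined_reward((0.0, 0.0), (3.0, 4.0), (10.0, 10.0))
```

In this sketch the dense term is kept small relative to the sparse bonus so that the episode outcome dominates the return; how the two terms should be balanced in practice is a design choice the abstract does not specify.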
Effects of temperature and age on stress relaxation in straight and modified asphalt binders from a northern Ontario pavement trial
Kalena McCloskey, M. R. Nivitha, Jianmin Ma, Simon A. M. Hesp, J. Murali Krishnan
Abstract
The effects of temperature and age on stress relaxation were investigated for a set of seven asphalt binders from a northern Ontario pavement trial. It was found that between binders, there were rather wide variations for different modes of relaxation. The first peak in the relaxation spectrum, originating from the rapid relaxation within the mobile saturates domains, was relatively insensitive to ageing. The second and third peaks at higher relaxation times changed more with both laboratory ageing and temperature changes. The straight-run Cold Lake asphalt binder showed the least amount of change, and this can likely explain its superior field performance. High styrene–butadiene (SB and SBS) polymer loadings, the presence of polyphosphoric acid (PPA), or re-refined engine oil bottoms (REOB) all resulted in solid-like behaviour, delayed stress relaxation, and associated reduced lifecycles.
Published in Road Materials and Pavement Design, Volume 24, 2023
Presented at the 10th International Conference of the European Asphalt Technology Association
Published by Taylor & Francis
DOI: 10.1080/14680629.2023.2180995
