Combining Dense and Sparse Rewards to Improve Deep Reinforcement Learning Policies in Reach-Avoid Games with Faster Evaders in Two vs. One Scenarios

Jefferson Silveira, Kalena McCloskey, Camille-Alain Rabbath, Craig Williams, Sidney Givigi

Abstract

This paper investigates a variation of the reach-avoid game, a multi-agent pursuit and evasion scenario applicable to aerial defense, in which the evaders are faster. Using Deep Reinforcement Learning (DRL) techniques, the study proposes a reward function that combines dense (distance-based) and sparse (outcome-based) rewards. Focused on the defender’s perspective in aerial defense, this reward function produced effective learned policies against faster evaders, outperforming both traditional differential game strategies and DRL strategies trained with dense-only rewards. Moreover, the learned policy generalized across different instances of the problem, including changes in pursuer speeds and winning radii, showing that it can handle situations not seen during training.

Published in the 2024 10th International Conference on Control, Decision and Information Technologies (CoDIT)

Published by IEEE

DOI: 10.1109/CoDIT62066.2024.10708505

Effects of temperature and age on stress relaxation in straight and modified asphalt binders from a northern Ontario pavement trial

Kalena McCloskey, M. R. Nivitha, Jianmin Ma, Simon A. M. Hesp, J. Murali Krishnan

Abstract

The effects of temperature and age on stress relaxation were investigated for a set of seven asphalt binders from a northern Ontario pavement trial. The binders showed rather wide variations in their different modes of relaxation. The first peak in the relaxation spectrum, originating from the rapid relaxation within the mobile saturates domains, was relatively insensitive to ageing. The second and third peaks at higher relaxation times changed more with both laboratory ageing and temperature changes. The straight-run Cold Lake asphalt binder showed the least amount of change, which likely explains its superior field performance. High styrene–butadiene (SB and SBS) polymer loadings, the presence of polyphosphoric acid (PPA), or re-refined engine oil bottoms (REOB) all resulted in solid-like behaviour, delayed stress relaxation, and associated reduced lifecycles.

Published in Road Materials and Pavement Design Volume 24, 2023

Presented at the 10th International Conference of the European Asphalt Technology Association

Published by Taylor & Francis

DOI: 10.1080/14680629.2023.2180995
