An agent that learns optimal actions through trial and error, guided by reward signals, in dynamic environments.
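As a rough illustration of that definition, the sketch below trains a tabular Q-learning agent. Everything in it is an assumption made for the example: the choice of Q-learning, the five-state corridor environment, the function names `train_q_agent` and `greedy_policy`, and the hyperparameters. The environment here is also fixed rather than dynamic, to keep the example small.

```python
import random


def train_q_agent(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1. Action 0 moves left, action 1 moves right.
    Entering the rightmost state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Trial: explore at random with probability epsilon (or when the
            # two estimates are tied), otherwise take the greedy action.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Environment step: the left wall at state 0 blocks movement.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Error: nudge the estimate toward reward + discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for every non-terminal state."""
    return [0 if left > right else 1 for left, right in q[:-1]]


if __name__ == "__main__":
    q_table = train_q_agent()
    print(greedy_policy(q_table))
```

After training, the greedy policy should choose "right" (action 1) in every non-terminal state, since that is the shortest route to the reward. Raising `epsilon` makes the agent explore more, at the cost of taking longer to settle on that route.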