AI Questions & Answers Part of the Q&A Network

How do I implement a custom reward function in a reinforcement learning environment?

Asked on Oct 28, 2025

Answer

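To implement a custom reward function in a reinforcement learning environment, define how each state and action maps to a scalar reward; that signal is what the agent learns to maximize. In practice this means computing the reward inside the environment's step method, using criteria specific to your task. The example uses placeholder string states and actions ("desired_action" is a hypothetical name), so replace the transition and termination rules with your environment's own.

    class CustomEnvironment:
        """Toy environment with string states; the dynamics are placeholders."""

        def __init__(self):
            # Initialize environment state
            self.state = self.reset()

        def reset(self):
            # Reset the environment to its initial state and return it
            self.state = "initial_state"
            return self.state

        def step(self, action):
            # Update state based on the action
            self.state = self._take_action(action)
            # Calculate reward based on custom criteria
            reward = self._calculate_reward(self.state, action)
            # Check whether the episode is done
            done = self._check_done(self.state)
            return self.state, reward, done

        def _take_action(self, action):
            # Placeholder transition (hypothetical): replace with real dynamics
            if action == "desired_action":
                return "desired_state"
            return self.state

        def _calculate_reward(self, state, action):
            # Define custom reward logic
            if state == "desired_state":
                return 10  # Reward for reaching the desired state
            elif action == "undesired_action":
                return -5  # Penalty for an undesired action
            else:
                return 0  # Neutral reward

        def _check_done(self, state):
            # Placeholder termination rule: the episode ends at the desired state
            return state == "desired_state"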
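
One way to make the reward easy to swap out is to pass it to the environment as a plain function instead of hard-coding it in a method. The following is a minimal sketch under that assumption, not a definitive design. `LineWorld`, `sparse_reward`, `step_cost_reward`, `rollout`, and `GOAL` are all hypothetical names invented for this illustration and do not come from any library.

```python
from typing import Callable, Sequence, Tuple

GOAL = 4  # rightmost position on the line; an arbitrary choice for this sketch

# A reward function maps (state, action, next_state) to a scalar reward.
RewardFn = Callable[[int, int, int], float]


def sparse_reward(state: int, action: int, next_state: int) -> float:
    """+10 when the step lands on the goal, otherwise 0."""
    return 10.0 if next_state == GOAL else 0.0


def step_cost_reward(state: int, action: int, next_state: int) -> float:
    """Same goal bonus, minus a small cost on every step."""
    return sparse_reward(state, action, next_state) - 0.1


class LineWorld:
    """Positions 0..GOAL on a line; action -1 moves left, +1 moves right."""

    def __init__(self, reward_fn: RewardFn):
        self.reward_fn = reward_fn
        self.state = 0

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool]:
        # Clamp the move so the agent stays inside [0, GOAL].
        next_state = min(max(self.state + action, 0), GOAL)
        reward = self.reward_fn(self.state, action, next_state)
        self.state = next_state
        return next_state, reward, next_state == GOAL


def rollout(env: LineWorld, actions: Sequence[int]) -> float:
    """Run a fixed action sequence and return the total reward collected."""
    env.reset()
    total = 0.0
    for action in actions:
        _, reward, done = env.step(action)
        total += reward
        if done:
            break
    return total
```

With this layout, comparing two reward designs only requires constructing the environment twice, for example `LineWorld(sparse_reward)` and `LineWorld(step_cost_reward)`, and running the same action sequence through `rollout` for each.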


Additional Comments:
  • Ensure your reward function aligns with the goals of the task and encourages the desired behavior from the agent.
  • Test the reward function thoroughly, for example by running short rollouts and inspecting episode returns, to confirm the agent cannot exploit it in unintended ways (often called reward hacking).
  • Consider the balance between positive rewards for desired actions and penalties for undesired actions to guide effective learning.
  • Use domain knowledge to craft a reward function that reflects the real-world objectives of the task.
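
The points about testing and balance can be checked numerically before any training. The sketch below compares the discounted return of heading straight to a goal against loitering first, under a reward that pays a fixed amount on every non-terminal step and a bonus at the goal. The function names, the 50-step loiter, the 4-step path, and the discount factor of 0.99 are all assumptions chosen for illustration.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of gamma**t * r_t over a sequence of per-step rewards."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


def episode_rewards(loiter_steps, steps_to_goal, step_reward, goal_reward=10.0):
    """Per-step rewards for loitering, then walking to the goal.

    Every non-terminal step earns step_reward; the final step earns goal_reward.
    """
    non_terminal = loiter_steps + steps_to_goal - 1
    return [step_reward] * non_terminal + [goal_reward]


def prefers_direct_path(step_reward, gamma=0.99, loiter_steps=50, steps_to_goal=4):
    """True if going straight to the goal beats loitering first."""
    direct = discounted_return(
        episode_rewards(0, steps_to_goal, step_reward), gamma
    )
    loiter = discounted_return(
        episode_rewards(loiter_steps, steps_to_goal, step_reward), gamma
    )
    return direct > loiter


if __name__ == "__main__":
    # A per-step bonus can make loitering more profitable than finishing.
    print("step reward +1.0 favors direct path:", prefers_direct_path(1.0))
    # A small per-step cost keeps the goal bonus dominant.
    print("step reward -0.1 favors direct path:", prefers_direct_path(-0.1))
```

A check like this is cheap to run whenever the reward values change, and it flags designs where a per-step bonus outweighs the goal reward before the agent has a chance to discover the exploit.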
