Quick Start#
Installation of the Environments#
Create an environment (requires Conda installation):
Use the following command to create a new Conda environment named robustgymnasium with Python 3.11:
conda create -n robustgymnasium python=3.11
Activate the newly created environment:
conda activate robustgymnasium
Install dependency packages:
Install the necessary packages using pip. Make sure you are in the project directory where the setup.py file is located:
pip install -r requirements.txt
pip install -e .
Testing the Tasks#
To run the tests, navigate to the examples directory and execute the test script, e.g.,
cd examples/robust_action/mujoco/
chmod +x test.sh
./test.sh
Ensure you follow these steps to set up and test the environment properly. Adjust paths and versions as necessary based on your specific setup requirements.
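Before running the test script, it can help to confirm that the package is importable from the active environment. The following is a minimal sketch using only the standard library. `is_importable` is a hypothetical helper name, and the module name `robust_gymnasium` is taken from the import statements used later in this guide.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # The environment above was created with Python 3.11.
    print("Python version:", sys.version.split()[0])
    print("robust_gymnasium importable:", is_importable("robust_gymnasium"))
```

If the second line reports `False`, the most likely cause is that `pip install -e .` was run outside the project directory or in a different Conda environment.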
A Simple Example#
import robust_gymnasium
from robust_gymnasium.configs.robust_setting import get_config
# Parse the robust settings once, before the loop
args = get_config().parse_args()
env = robust_gymnasium.vector.make("Ant-v4", render_mode="human")
observation, info = env.reset(seed=0)
for _ in range(1000):
    action = env.action_space.sample()
    robust_input = {"action": action, "robust_config": args}
    observation, reward, terminated, truncated, info = env.step(robust_input)
    if terminated or truncated:
        observation, info = env.reset()
env.close()
Step-by-Step Introduction#
This introduction demonstrates how to set up and use the robust_gymnasium library for robust reinforcement learning environments. We’ll cover configuring tasks, selecting attack modes, managing attack frequencies, and running simulations.
Contents#
Importing Required Packages
Configuring the Environment and Robust Settings
Setting Up Experiment Data Recording
Running the Environment with Robust Attacks
Handling XML File Content Replacement
Wrapping Up and Closing the Environment
1. Importing Required Packages#
The following code imports all the necessary packages:
# Import packages
import robust_gymnasium as gym
from os import path
import json
import os
import time
from datetime import datetime
Explanation:
robust_gymnasium as gym: The robust_gymnasium library is used for robust RL simulations.
os, path, and json: Handle file operations and JSON configuration.
time and datetime: Manage timestamps for recording experiments.
2. Configuring the Environment and Robust Settings#
We define the environment, attack settings, and other parameters:
from robust_gymnasium.configs.robust_setting import get_config
args = get_config().parse_args()
# choose robust task
args.env_name = "Humanoid-v5"
# choose attack type
args.noise_factor = "state"
# choose attack mode
args.noise_type = "gauss"
# attack frequency
args.llm_disturb_interval = 500
Explanation:
get_config(): Loads default configuration for robust settings.
args.env_name: Specifies the simulation environment, e.g., Humanoid-v5.
args.noise_factor: Specifies the aspect of the environment to attack, e.g., state.
args.noise_type: Specifies the type of noise for the attack, e.g., gauss (Gaussian noise).
args.llm_disturb_interval: Sets the interval (in steps) for attacks.
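The `.parse_args()` call suggests that `get_config()` returns an `argparse`-style parser, although that is an inference from the usage above rather than something this guide states. Under that assumption, the following sketch uses a hypothetical stand-in, `build_demo_parser`, to show how defaults are defined and then overridden on the returned namespace. The default values here are illustrative only and are not the library's defaults.

```python
import argparse


def build_demo_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for get_config(); defaults are illustrative only."""
    parser = argparse.ArgumentParser(description="Demo robust settings")
    parser.add_argument("--env_name", type=str, default="Ant-v4")
    parser.add_argument("--noise_factor", type=str, default="state")
    parser.add_argument("--noise_type", type=str, default="gauss")
    parser.add_argument("--llm_disturb_interval", type=int, default=500)
    return parser


# Passing an empty list makes argparse ignore sys.argv,
# which is convenient in notebooks and test runners.
demo_args = build_demo_parser().parse_args([])

# Attributes on the returned Namespace can be overridden after parsing,
# which is the pattern used in the snippet above.
demo_args.env_name = "Humanoid-v5"
demo_args.llm_disturb_interval = 100

print(vars(demo_args))
```

Overriding attributes after parsing keeps every setting in one object, so the same namespace can later be passed along as `robust_config`.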
3. Setting Up Experiment Data Recording#
We set up a directory structure to save experiment logs:
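# Timestamp used to give each run its own folder
start_run_date_and_time = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
# os.getcwd()[:0] is an empty string, so the folder is relative to the working directory
folder = os.getcwd()[:0] + 'data/' + str(args.env_name) + '/' + str(args.noise_type) + '/' + str(
    start_run_date_and_time) + '/'
if not os.path.exists(folder):
    os.makedirs(folder)
# folder already ends with '/', so no extra separator is needed
json_path = folder + 'config.json'
argsDict = args.__dict__
with open(json_path, 'w') as f:
    f.writelines('------------------ start ------------------' + '\n')
    for eachArg, value in argsDict.items():
        f.writelines(eachArg + ' : ' + str(value) + '\n')
    f.writelines('------------------- end -------------------')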
Explanation:
Creates a folder structure to store data based on environment and attack settings.
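Saves the configuration to a config.json file for reproducibility. Note that the file holds plain "key : value" lines between two banner lines, so despite its extension it is not valid JSON.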
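The snippet above writes each setting as a plain "key : value" line. If a machine-readable file is preferred, one alternative is to serialize the namespace with the standard `json` module. This is a sketch of that alternative, not what the guide's code does. `save_config_as_json` is a hypothetical helper, and the `Namespace` below stands in for the object returned by `get_config().parse_args()`.

```python
import json
import os
import tempfile
from argparse import Namespace


def save_config_as_json(args: Namespace, folder: str) -> str:
    """Write the settings in `args` to <folder>/config.json and return the path."""
    os.makedirs(folder, exist_ok=True)  # no error if the folder already exists
    json_path = os.path.join(folder, "config.json")
    with open(json_path, "w", encoding="utf-8") as f:
        # default=str converts values that json cannot serialize natively.
        json.dump(vars(args), f, indent=2, default=str)
    return json_path


# Usage with a stand-in Namespace and a temporary directory.
demo_config = Namespace(env_name="Humanoid-v5", noise_type="gauss", llm_disturb_interval=500)
with tempfile.TemporaryDirectory() as tmp:
    run_folder = os.path.join(tmp, "data", demo_config.env_name, demo_config.noise_type)
    saved_path = save_config_as_json(demo_config, run_folder)
    with open(saved_path, encoding="utf-8") as f:
        loaded = json.load(f)

print(loaded)
```

Using `os.path.join` and `exist_ok=True` avoids manual separator handling and the separate existence check.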
4. Running the Environment with Robust Attacks#
Set up the environment and run it with the specified robust settings:
# Create and render the environment
env = gym.make(args.env_name, render_mode="human")
# Reset environment
observation, info = env.reset(seed=42)
# Simulation loop
try:
    for i in range(1000):
        action = env.action_space.sample()
        robust_input = {
            "action": action,
            "robust_type": "action",
            "robust_config": args,
        }
        observation, reward, terminated, truncated, info = env.step(robust_input)
        env.render() # Render environment
        if terminated or truncated:
            observation, info = env.reset()
finally:
    print('\033[0;31m "Program was terminated by user (Ctrl+C) or finished!" \033[0m')
Explanation:
env = gym.make(): Creates the environment with rendering enabled.
env.reset(): Initializes the environment.
env.step(robust_input): Executes an action with robust input parameters.
env.render(): Renders the environment for visualization.
The loop runs for 1000 steps. Whenever an episode terminates or is truncated, the environment is reset and the loop continues.
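To make the idea of a periodic action attack concrete, here is a library-independent sketch. It is not how robust_gymnasium implements its attacks. `perturb_action` and `sigma` are hypothetical names, and the rule that an attack fires when `step % interval == 0` is an assumption made for illustration.

```python
import random


def perturb_action(action, step, interval, sigma, rng):
    """Add Gaussian noise to every component of `action` on attack steps.

    Assumption: an attack fires on steps where step % interval == 0.
    """
    if step % interval != 0:
        return list(action)  # unperturbed copy
    return [a + rng.gauss(0.0, sigma) for a in action]


rng = random.Random(0)  # seeded so the run is reproducible
clean_action = [0.1, -0.2, 0.3]

attacked_steps = []
for step in range(1, 11):
    noisy = perturb_action(clean_action, step, interval=5, sigma=0.05, rng=rng)
    if noisy != clean_action:
        attacked_steps.append(step)

print(attacked_steps)
```

A smaller interval means the agent sees perturbed actions more often, which is the knob that the attack-frequency setting controls in this guide.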
5. Handling XML File Content Replacement#
If XML file updates are required during the simulation, use the following function:
def replace_xml_content(source_file_path, target_file_path):
    # read data from source file
    with open(source_file_path, 'r', encoding='utf-8') as file:
        source_content = file.read()
    # write the data into the target file
    with open(target_file_path, 'w', encoding='utf-8') as file:
        file.write(source_content)
Explanation:
Reads the content of a source XML file and writes it to a target XML file.
Useful for updating environment configurations dynamically.
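The complete example at the end of this guide calls `replace_xml_content` from a `finally` block so that the target file is rewritten even when the run is interrupted. The same guarantee can be packaged as a context manager. The sketch below uses `shutil.copyfile` from the standard library. `restore_file_on_exit` is a hypothetical helper, and the file names are stand-ins rather than paths used by the library.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def restore_file_on_exit(source_file_path, target_file_path):
    """Copy source over target when the block exits, even after an exception."""
    try:
        yield
    finally:
        shutil.copyfile(source_file_path, target_file_path)


# Usage with temporary stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "original.xml"
    target = Path(tmp) / "model.xml"
    source.write_text("<mujoco model='original'/>", encoding="utf-8")
    target.write_text("<mujoco model='original'/>", encoding="utf-8")

    with restore_file_on_exit(source, target):
        # Simulate a modification made during the run.
        target.write_text("<mujoco model='perturbed'/>", encoding="utf-8")

    restored = target.read_text(encoding="utf-8")

print(restored)
```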
6. Wrapping Up and Closing the Environment#
Ensure the environment is closed properly to release resources:
env.close()
Explanation:
Ensures all resources are released after the simulation.
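One way to guarantee that `close()` runs even when the simulation loop raises is `contextlib.closing`, which works with any object that has a `close()` method. The sketch below demonstrates the pattern with a hypothetical `DummyEnv` class. It does not use robust_gymnasium and makes no claim about whether the library's environments support the `with` statement directly.

```python
from contextlib import closing


class DummyEnv:
    """Hypothetical stand-in for an environment; only close() is modeled."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


demo_env = DummyEnv()
try:
    with closing(demo_env):
        # Simulate an error in the middle of a rollout.
        raise RuntimeError("simulated failure during the rollout")
except RuntimeError:
    pass

print(demo_env.closed)
```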
Conclusion:
This tutorial provides a step-by-step guide to using the robust_gymnasium library for robust RL tasks. By following these sections, you can configure, run, and customize robust simulations efficiently.
A Simple Complete Example#
# Import packages
import robust_gymnasium as gym
from os import path
import json
import os
import time
from datetime import datetime
currentDateAndTime = datetime.now()
start_run_date_and_time = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
from robust_gymnasium.configs.robust_setting import get_config
args = get_config().parse_args()
# choose robust task: choose any task listed in our benchmark, e.g., "InvertedDoublePendulum-v4",
# "Reacher-v4", "Pusher-v4", "Ant-v4", etc.
args.env_name = "Humanoid-v5"
# choose attack type: choose any robust type listed in our benchmark, such as state, reward, action,
# robust force (internal attack), or wind (external attack)
args.noise_factor = "state"
# choose attack mode: we provide diverse attack modes, such as Gaussian distribution attack, uniform
# distribution attack, LLM as adversary policy attack, etc.
args.noise_type = "gauss"
# attack frequency: different attack frequency settings are available. You can attack every 500 steps,
# every 100 steps, or customize it to any desired number of steps.
args.llm_disturb_interval = 500
# record experiment data
# os.getcwd()[:0] is an empty string, so the folder is relative to the working directory
folder = os.getcwd()[:0] + 'data/' + str(args.env_name) + '/' + str(args.noise_type) + '/' + str(
    start_run_date_and_time) + '/'
if not os.path.exists(folder):
    os.makedirs(folder)
# folder already ends with '/', so no extra separator is needed
json_path = folder + 'config.json'
argsDict = args.__dict__
with open(json_path, 'w') as f:
    f.writelines('------------------ start ------------------' + '\n')
    for eachArg, value in argsDict.items():
        f.writelines(eachArg + ' : ' + str(value) + '\n')
    f.writelines('------------------- end -------------------')
# env = gym.make("Ant-v4") # without render environments
env = gym.make(args.env_name, render_mode="human") # render environments: human, rgb_array, or depth_array.
def replace_xml_content(source_file_path, target_file_path):
    # read data from source file
    with open(source_file_path, 'r', encoding='utf-8') as file:
        source_content = file.read()
    # write the data into the target file
    with open(target_file_path, 'w', encoding='utf-8') as file:
        file.write(source_content)
observation, info = env.reset(seed=42)
try:
    for i in range(1000):
        action = env.action_space.sample()
        robust_input = {
            "action": action,
            "robust_type": "action",
            "robust_config": args,
        }
        observation, reward, terminated, truncated, info = env.step(robust_input)
        env.render() # render environments
        if terminated or truncated:
            observation, info = env.reset()
        # Note: i never exceeds 999 in range(1000), so this branch does not run;
        # the finally block below performs the XML restore instead.
        if i > 999:
            replace_xml_content(info["source_file_path"], info["target_file_path"])
finally: # except KeyboardInterrupt:
    replace_xml_content(info["source_file_path"], info["target_file_path"])
    print('\033[0;31m "Program was terminated by user (Ctrl+C) or finished!" \033[0m')
    env.close() # always release resources
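The config.json file written by the example above contains plain "key : value" lines between two banner lines rather than JSON. A minimal sketch of reading such a file back into a dictionary follows. `parse_config_lines` is a hypothetical helper, and all values are returned as strings because the original types are not recorded in the file.

```python
def parse_config_lines(text: str) -> dict:
    """Parse 'key : value' lines, skipping the start/end banner lines."""
    config = {}
    for line in text.splitlines():
        if " : " not in line:
            continue  # banner or blank line
        key, value = line.split(" : ", 1)
        config[key] = value  # values stay as strings
    return config


sample = (
    "------------------ start ------------------\n"
    "env_name : Humanoid-v5\n"
    "noise_type : gauss\n"
    "llm_disturb_interval : 500\n"
    "------------------- end -------------------"
)
parsed = parse_config_lines(sample)
print(parsed)
```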