alf.bin#

alf.bin.compare#

Compare two algorithms on a set of fixed task initializations.

Run:

python3 -m alf.bin.compare \
--root_dir1=~/tmp/ac_cart_pole \
--root_dir2=~/tmp/ddpg_cart_pole \
--alsologtostderr

When running remotely with virtual_gl, prefix the command with ``DISPLAY= vglrun -d :7``. Clearing the DISPLAY environment variable prevents gzclients from being created; they are not torn down after play and can occupy too many X server connections. Set DISPLAY to a proper value when recording video.

main(_)[source]#

main function.

alf.bin.play#

Play a trained model.

You can visualize playing of the trained model by running:

cd ${PROJECT}/alf/examples;
python -m alf.bin.play \
--root_dir=~/tmp/cart_pole \
--alsologtostderr

launch_snapshot_play()[source]#

This play function uses a historical ALF snapshot to play a trained model, so that the code used for playing is consistent with the code snapshot that trained the model.

In newer versions of train.py, an ALF snapshot is saved to root_dir right before training begins. This function therefore prepends root_dir to PYTHONPATH so that the snapshot of the ALF repo stored there is used.

Note that for any old training root_dir prior to snapshot being enabled, this function doesn’t have any effect and the most up-to-date ALF will be used by play.
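The PYTHONPATH mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not ALF's actual implementation: the helper name `build_snapshot_env` and the test for an `alf` subdirectory inside root_dir are hypothetical and chosen only for the example.

```python
import os


def build_snapshot_env(root_dir, environ=None):
    """Return a copy of the environment with root_dir prepended to PYTHONPATH.

    Hypothetical helper: the name and the `alf` subdirectory check are
    assumptions made for illustration, not part of ALF's API.
    """
    env = dict(os.environ if environ is None else environ)
    root_dir = os.path.abspath(os.path.expanduser(root_dir))
    # A root_dir created before snapshots were enabled holds no snapshot,
    # so the environment is left untouched and the installed ALF is used.
    if not os.path.isdir(os.path.join(root_dir, "alf")):
        return env
    existing = env.get("PYTHONPATH", "")
    # Prepending makes the snapshot take precedence over any installed ALF.
    env["PYTHONPATH"] = root_dir + (os.pathsep + existing if existing else "")
    return env
```

The returned dictionary could then be passed as the `env` argument of `subprocess.run` when relaunching the play script, so that the child process imports the snapshot first.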

main(_)[source]#
play()[source]#

alf.bin.train#

Train model.

To run actor-critic on gym CartPole:

cd ${PROJECT}/alf/examples;
python -m alf.bin.train \
--root_dir=~/tmp/cart_pole \
--gin_file=ac_cart_pole.gin \
--gin_param='create_environment.num_parallel_environments=8' \
--alsologtostderr

You can view various training curves using TensorBoard by running the following command in a different terminal:

tensorboard --logdir=~/tmp/cart_pole

You can visualize playing of the trained model by running:

cd ${PROJECT}/alf/examples;
python -m alf.bin.play \
--root_dir=~/tmp/cart_pole \
--gin_file=ac_cart_pole.gin \
--alsologtostderr

If your machine has multiple GPUs and you would like to train with all of them, specify --distributed multi-gpu. This will use PyTorch’s DistributedDataParallel for training.

If you want to use an ALF Python conf file instead of a Gin configuration file, replace the “--gin_file” option with “--conf”, and “--gin_param” with “--conf_param”.

main(_)[source]#
training_worker(rank, world_size, conf_file, root_dir, paras_queue=None)[source]#

An executable instance that trains and evaluates the algorithm.

Parameters
  • rank (int) – The ID of the process among all of the DDP processes.

  • world_size (int) – The number of processes in total. If set to 1, it is interpreted as “non distributed mode”.

  • conf_file (str) – Path to the training configuration.

  • root_dir (str) – Path to the directory for writing logs/summaries/checkpoints.

  • paras_queue (Optional[Queue]) – A shared Queue used to check that model parameters are consistent across worker processes when multi-gpu training is used.
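To make the roles of rank, world_size and paras_queue concrete, here is a toy sketch of a parameter consistency check. It is not ALF's implementation. Threads stand in for DDP processes, parameters are plain lists of floats, and the names `param_digest`, `worker` and `run` as well as the "rank 0 collects and compares" protocol are assumptions made for illustration.

```python
import hashlib
import queue
import threading


def param_digest(params):
    """Hash a list of floats so that workers can compare them cheaply."""
    return hashlib.sha256(repr(list(params)).encode()).hexdigest()


def worker(rank, world_size, params, paras_queue=None, results=None):
    """Toy stand-in for a training worker (threads replace DDP processes)."""
    if world_size == 1 or paras_queue is None:
        return  # non-distributed mode: nothing to compare
    if rank > 0:
        # Non-zero ranks report a digest of their parameters.
        paras_queue.put((rank, param_digest(params)))
    else:
        # Rank 0 collects one digest from every other rank and compares it
        # with its own.
        expected = param_digest(params)
        for _ in range(world_size - 1):
            other_rank, digest = paras_queue.get(timeout=5)
            results[other_rank] = digest == expected


def run(world_size, params_per_rank):
    """Start one worker per rank and return {rank: matches_rank_0}."""
    paras_queue = queue.Queue()
    results = {}
    threads = [
        threading.Thread(
            target=worker,
            args=(rank, world_size, params_per_rank[rank], paras_queue,
                  results))
        for rank in range(world_size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With `world_size` set to 1 the sketch does no comparison at all, mirroring the "non distributed mode" interpretation described for that parameter.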

alf.bin.verify_checkpoint#

Utility to check whether checkpointed algorithm can be restored correctly.

It works as follows:

  1. Save the config.

  2. Train the algorithm for a few iterations.

  3. Test the algorithm for a few steps and store the output of the algorithm and the environment time steps.

  4. Save checkpoint.

  5. Create the algorithm using the saved config.

  6. Load checkpoint.

  7. Run the algorithm using the stored time steps.

  8. Compare the output from step 7 with the output from step 3. They should be exactly the same.
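The eight steps above can be sketched with a toy algorithm. This is an illustration of the round-trip idea only: `ToyAlgorithm`, its `state_dict`/`load_state_dict` methods, the config keys and the use of `pickle` as the checkpoint format are all assumptions for the example and do not reflect how ALF stores configs or checkpoints.

```python
import copy
import pickle
import random


class ToyAlgorithm:
    """Hypothetical stand-in for an algorithm: one learnable weight."""

    def __init__(self, config):
        self.lr = config["lr"]
        self.weight = config["init_weight"]

    def train_iter(self, rng):
        # Fake "training": nudge the weight by a random amount.
        self.weight += self.lr * rng.uniform(-1.0, 1.0)

    def predict(self, time_step):
        return self.weight * time_step

    def state_dict(self):
        return {"weight": self.weight}

    def load_state_dict(self, state):
        self.weight = state["weight"]


def verify_checkpoint(config, num_train_iterations=3, num_test_steps=5):
    """Return True if a restored algorithm reproduces the stored outputs."""
    saved_config = copy.deepcopy(config)                         # step 1
    alg = ToyAlgorithm(config)
    rng = random.Random(0)
    for _ in range(num_train_iterations):                        # step 2
        alg.train_iter(rng)
    time_steps = [rng.random() for _ in range(num_test_steps)]   # step 3
    outputs = [alg.predict(t) for t in time_steps]
    checkpoint = pickle.dumps(alg.state_dict())                  # step 4
    restored = ToyAlgorithm(saved_config)                        # step 5
    restored.load_state_dict(pickle.loads(checkpoint))           # step 6
    replayed = [restored.predict(t) for t in time_steps]         # step 7
    return replayed == outputs                                   # step 8
```

The comparison uses exact equality rather than a tolerance, matching the requirement that the two outputs be exactly the same.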

The simplest way to use it is to invoke it in the following way:

python -m alf.bin.verify_checkpoint --conf [CONF_FILE_NAME]

You may want to set a different value of --num_train_iterations if your training does not start from the first iteration because of TrainerConfig.initial_collect_steps. You may also want to set a larger value of --num_test_steps to test more steps.

main(_)[source]#