If you need any of the features from the pre-release version listed under “Upcoming”, you can install coax from the main branch:
$ pip install git+https://github.com/coax-dev/coax.git@main
Switch from legacy
Add DeepMind Control Suite example (#29); see DeepMind Control Suite with SAC.
coax.utils.sync_shared_params() utility; example in A2C stub.
Improved performance for replay buffer (#25)
Bug fix: random_seed in _prioritized (#24)
Update to new JAX API (#27)
Add Update to
Bug fix: set logging level on TrainMonitor.logger itself (`550a965 <https://github.com/coax-dev/coax/commit/550a965d17002bf552ab2fbea49801c65b322c7b>`_).
Bug fix: fix affine transform for composite distributions (`48ca9ce <https://github.com/coax-dev/coax/commit/48ca9ced42123e906969076dff88540b98e6d0bb>`_).
Bug fix: #33
Bug fix: #21
Fix deprecation warnings from using
Bug fixes: #16
jax.ops.index* scatter operations with the new
Bumped version to drop hard dependence on
Implemented stochastic Q-learning using quantile regression in coax.StochasticQ; see example: IQN.
coax.utils.quantiles() for equally spaced quantile fractions as in QR-DQN.
coax.utils.quantiles_uniform() for uniformly sampled quantile fractions as in IQN.
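To make the difference between the two kinds of quantile fractions concrete, here is a minimal standard-library sketch. The helper names `equally_spaced_fractions` and `uniformly_sampled_fractions` are hypothetical and are not the coax functions; in particular, the bin-midpoint convention used for the equally spaced case is an assumption and may differ from what coax.utils.quantiles() returns.

```python
import random


def equally_spaced_fractions(num_quantiles):
    """Fixed fractions: midpoints of equal-width bins on [0, 1].

    Assumption: the midpoint convention (2i + 1) / (2N), as commonly
    used in QR-DQN-style setups.
    """
    return [(2 * i + 1) / (2 * num_quantiles) for i in range(num_quantiles)]


def uniformly_sampled_fractions(num_quantiles, rng):
    """Random fractions: independent draws from Uniform[0, 1), IQN-style."""
    return [rng.random() for _ in range(num_quantiles)]


fixed = equally_spaced_fractions(4)
sampled = uniformly_sampled_fractions(4, random.Random(0))
print(fixed)    # the same fractions on every call
print(sampled)  # fresh fractions for every draw from the generator
```

The fixed fractions tie each network output to one specific quantile, whereas the sampled fractions are drawn anew for each batch and fed to the network as an input.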
This is not much of a release. It’s only really the dependencies that were updated.
Added serialization utils:
Implemented Prioritized Experience Replay:
SegmentTree that allows for batched updating.
SumTree subclass that allows for batched weighted sampling.
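As a rough illustration of how a sum tree supports batched updates and batched weighted sampling, here is a small pure-Python sketch. The class `TinySumTree` and its methods are hypothetical and are not the coax implementation, which may be organized quite differently.

```python
class TinySumTree:
    """Binary tree stored in a flat list; each parent holds the sum of its children."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Index 0 is unused; leaves live at capacity .. 2 * capacity - 1.
        self.nodes = [0.0] * (2 * capacity)

    @property
    def total(self):
        return self.nodes[1]

    def update(self, indices, values):
        """Set several leaf weights, then refresh the sums above them."""
        for i, v in zip(indices, values):
            self.nodes[i + self.capacity] = float(v)
        for i in indices:
            pos = (i + self.capacity) // 2
            while pos >= 1:
                self.nodes[pos] = self.nodes[2 * pos] + self.nodes[2 * pos + 1]
                pos //= 2

    def sample(self, uniforms):
        """Map numbers in [0, 1) to leaf indices, proportionally to leaf weight."""
        out = []
        for u in uniforms:
            mass = u * self.total
            pos = 1
            while pos < self.capacity:
                left = 2 * pos
                if mass < self.nodes[left]:
                    pos = left
                else:
                    mass -= self.nodes[left]
                    pos = left + 1
            out.append(pos - self.capacity)
        return out


tree = TinySumTree(capacity=4)
tree.update([0, 2], [1.0, 3.0])       # batched update of two priorities
picks = tree.sample([0.0, 0.5, 0.9])  # batched weighted sampling
```

Each update and each sample touches only one root-to-leaf path, so both cost a number of steps proportional to the tree depth rather than to the buffer size.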
Drop TransitionSingle (only use TransitionBatch from now on).
TransitionBatch.idx field to identify specific transitions.
TransitionBatch.W field to collect sample weights.
policy_objectives updaters compatible with
Added scripts and notebooks: agent stub and pong.
FrameStacking wrapper that respects the gym.space API and is compatible with the
Added data summary (min, median, max) for arrays in
StepwiseLinearFunction utility, which is handy for hyperparameter schedules; see example usage here.
Implemented Distributional RL algorithm:
Added two new methods to all proba_dists:
Made TD-learning updaters compatible with
Made value-based policies compatible with
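For readers unfamiliar with the idea behind the StepwiseLinearFunction utility listed in this release, a stepwise linear hyperparameter schedule can be sketched as follows. The function `stepwise_linear` and its `(step, value)` breakpoint format are assumptions made for illustration; they do not reproduce the coax signature.

```python
def stepwise_linear(breakpoints):
    """Build a schedule from (step, value) pairs sorted by step.

    The value is held constant before the first and after the last
    breakpoint, and interpolated linearly in between.
    """
    def schedule(step):
        if step <= breakpoints[0][0]:
            return breakpoints[0][1]
        for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
            if step <= t1:
                return v0 + (v1 - v0) * (step - t0) / (t1 - t0)
        return breakpoints[-1][1]

    return schedule


# Hypothetical exploration schedule: fast decay first, then a slow tail.
epsilon = stepwise_linear([(0, 1.0), (1000, 0.1), (5000, 0.01)])
for step in (0, 500, 1000, 3000, 10000):
    print(step, epsilon(step))
```

A schedule like this is typically evaluated once per training step and its result assigned to the hyperparameter being annealed, for example an exploration rate.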
First version to go public.