alf.networks#
alf.networks.action_encoder#
A simple parameterless action encoder.
- class SimpleActionEncoder(action_spec)[source]#
Bases:
alf.networks.network.Network
A simple encoder for action.
It encodes discrete actions to a one-hot representation and uses the original continuous actions. The output is the concatenation of all of them after flattening.
- Parameters
action_spec (nested BoundedTensorSpec) – spec for actions
- forward(inputs, state=())[source]#
Generate encoded actions.
- Parameters
inputs (nested Tensor) – action tensors.
- Returns
nested Tensor with the same structure as inputs.
- training: bool#
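The encoding described above can be illustrated with a small pure-Python sketch. This is not ALF's implementation: `one_hot` and `encode_actions` are hypothetical helpers, plain lists stand in for tensors, and the ordering (discrete parts first, then continuous parts) is an assumption made only for the example.

```python
def one_hot(index, num_classes):
    """Return a one-hot list of length num_classes with a 1.0 at index."""
    if not 0 <= index < num_classes:
        raise ValueError("index out of range for one-hot encoding")
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def encode_actions(discrete, continuous):
    """Hypothetical stand-in for the encoder.

    discrete: sequence of (index, num_classes) pairs.
    continuous: sequence of already-flattened lists of floats.
    """
    encoded = []
    for index, num_classes in discrete:
        encoded.extend(one_hot(index, num_classes))  # discrete -> one-hot
    for values in continuous:
        encoded.extend(float(v) for v in values)     # continuous kept as-is
    return encoded


# One discrete action with 4 choices and one 2-dimensional continuous action.
encoded = encode_actions([(2, 4)], [[0.5, -1.0]])
print(encoded)
```

The encoded length is the sum of the number of classes of each discrete action plus the flattened size of each continuous action.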
alf.networks.actor_distribution_networks#
ActorDistributionNetwork and ActorDistributionRNNNetwork.
- class ActorDistributionNetwork(input_tensor_spec, action_spec, input_preprocessors=None, preprocessing_combiner=None, conv_layer_params=None, fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, use_fc_bn=False, discrete_projection_net_ctor=<class 'alf.networks.projection_networks.CategoricalProjectionNetwork'>, continuous_projection_net_ctor=<class 'alf.networks.projection_networks.NormalProjectionNetwork'>, name='ActorDistributionNetwork')[source]#
Bases:
alf.networks.actor_distribution_networks.ActorDistributionNetworkBase
Network which outputs temporally uncorrelated action distributions.
- Parameters
input_tensor_spec (TensorSpec) – the tensor spec of the input
action_spec (TensorSpec) – the action spec
input_preprocessors (nested InputPreprocessor) – a nest of InputPreprocessor, each of which will be applied to the corresponding input. If not None, then it must have the same structure as input_tensor_spec (after reshaping). If any element is None, then it will be treated as math_ops.identity. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional.
fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes.
activation (nn.functional) – activation used for hidden layers.
kernel_initializer (Callable) – initializer for all the layers excluding the projection net. If none is provided a default xavier_uniform will be used.
use_fc_bn (bool) – whether to use Batch Normalization for the internal FC layers (i.e. FC layers except the last one).
discrete_projection_net_ctor (ProjectionNetwork) – constructor that generates a discrete projection network that outputs discrete actions.
continuous_projection_net_ctor (ProjectionNetwork) – constructor that generates a continuous projection network that outputs continuous actions.
name (str) –
- training: bool#
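To make the conv_layer_params format concrete, the sketch below walks a stack of (filters, kernel_size, strides, padding) tuples and computes the resulting spatial size with the standard convolution arithmetic. `conv_output_size` is a hypothetical helper, not part of ALF, and it assumes square inputs and a padding of 0 when the fourth element is omitted.

```python
def conv_output_size(input_size, conv_layer_params):
    """Return (channels, spatial_size) after applying each conv layer.

    Assumes square inputs/kernels and padding=0 when padding is omitted
    (an assumption for this sketch, not a documented ALF default).
    """
    size, channels = input_size, None
    for params in conv_layer_params:
        filters, kernel_size, strides = params[:3]
        padding = params[3] if len(params) > 3 else 0
        # Standard conv arithmetic: floor((in + 2p - k) / s) + 1
        size = (size + 2 * padding - kernel_size) // strides + 1
        channels = filters
    return channels, size


# Three layers without padding applied to an 84x84 image.
channels, size = conv_output_size(84, ((32, 8, 4), (64, 4, 2), (64, 3, 1)))
flat_features = channels * size * size
print(channels, size, flat_features)

# A single layer that uses the optional padding element.
padded = conv_output_size(7, ((16, 3, 1, 1),))
print(padded)
```

The flattened feature count is what the first entry of fc_layer_params would receive as its input width in a typical conv-then-FC encoder.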
- class ActorDistributionNetworkBase(input_tensor_spec, action_spec, encoding_network_ctor, discrete_projection_net_ctor, continuous_projection_net_ctor, name='ActorDistributionNetworkBase', **encoder_kwargs)[source]#
Bases:
alf.networks.network.Network
A base class for ActorDistributionNetwork and ActorDistributionRNNNetwork.
Can also be used to create customized actor networks by providing different encoding network creators.
- Parameters
input_tensor_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – the tensor spec of the input.
action_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – the tensor spec of the action.
encoding_network_ctor (Callable) – the creator of the encoding network that does the heavy lifting of the actor.
discrete_projection_net_ctor (ProjectionNetwork) – constructor that generates a discrete projection network that outputs discrete actions.
continuous_projection_net_ctor (ProjectionNetwork) – constructor that generates a continuous projection network that outputs continuous actions.
name (str) – name of the network
encoder_kwargs – the extra keyword arguments to the encoding network
- forward(observation, state=())[source]#
Computes an action distribution given an observation.
- Parameters
observation (torch.Tensor) – consistent with input_tensor_spec
state – empty for API consistency with ActorDistributionRNNNetwork
- Returns
act_dist (torch.distributions): action distribution
state: empty
- make_parallel(n)[source]#
Create a ParallelActorDistributionNetwork using n replicas of self. The initialized network parameters will be different.
- property state_spec#
Return the state spec of the actor network. It is simply the state spec of the encoding network.
- training: bool#
- class ActorDistributionRNNNetwork(input_tensor_spec, action_spec, input_preprocessors=None, preprocessing_combiner=None, conv_layer_params=None, fc_layer_params=None, lstm_hidden_size=100, actor_fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, discrete_projection_net_ctor=<class 'alf.networks.projection_networks.CategoricalProjectionNetwork'>, continuous_projection_net_ctor=<class 'alf.networks.projection_networks.NormalProjectionNetwork'>, name='ActorRNNDistributionNetwork')[source]#
Bases:
alf.networks.actor_distribution_networks.ActorDistributionNetworkBase
Network which outputs temporally correlated action distributions.
- Parameters
input_tensor_spec (TensorSpec) – the tensor spec of the input
action_spec (TensorSpec) – the action spec
input_preprocessors (nested InputPreprocessor) – a nest of InputPreprocessor, each of which will be applied to the corresponding input. If not None, then it must have the same structure as input_tensor_spec (after reshaping). If any element is None, then it will be treated as math_ops.identity. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional.
fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layers for encoding the observation.
lstm_hidden_size (int or tuple[int]) – the hidden size(s) of the LSTM cell(s). Each size corresponds to a cell. If there are multiple sizes, then lstm cells are stacked.
actor_fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layers that are applied after the lstm cell’s output.
activation (nn.functional) – activation used for hidden layers.
kernel_initializer (Callable) – initializer for all the layers excluding the projection net. If none is provided a default xavier_uniform will be used.
discrete_projection_net_ctor (ProjectionNetwork) – constructor that generates a discrete projection network that outputs discrete actions.
continuous_projection_net_ctor (ProjectionNetwork) – constructor that generates a continuous projection network that outputs continuous actions.
name (str) –
- training: bool#
- class LatentActorDistributionNetwork(input_tensor_spec, action_spec, prior_actor_distribution_network_ctor=<class 'alf.networks.actor_distribution_networks.UnitNormalActorDistributionNetwork'>, normalizing_flow_network_ctor=<class 'alf.networks.normalizing_flow_networks.RealNVPNetwork'>, conditional_flow=True, scale_distribution=False, dist_squashing_transform=StableTanh(), name='LatentActorDistributionNetwork')[source]#
Bases:
alf.networks.network.Network
Generates an actor distribution by transforming a prior action distribution (e.g., standard Normal noise \(\mathcal{N}(0,1)\)) with a normalizing flow network. The resulting distribution might have an arbitrary shape.
Warning
As with some invertible transforms such as StableTanh, the inverse computation of a normalizing flow transform might cause numerical issues. For policy gradient methods like AC and PPO, transform caches are usually invalidated because actions are detached for the PG loss. So LatentActorDistributionNetwork is best suited to non-PG algorithms like DDPG and SAC. See alf/docs/notes/compute_probs_of_transformed_dist.rst for details.
- Parameters
input_tensor_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – the tensor spec of the input
action_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – the action spec
prior_actor_distribution_network_ctor (Callable) – a constructor that creates any actor distribution network. The only requirement is that this class returns an action distribution (could be transformed) for forward().
normalizing_flow_network_ctor (Callable) – a constructor that creates a normalizing flow network which is used to transform the prior action distribution.
conditional_flow (bool) – whether to make the normalizing flow network use inputs to condition its transformations. Only valid for normalizing flow nets that support this option.
scale_distribution (bool) – Whether or not to scale the output distribution to ensure that the output action fits within the action_spec.
dist_squashing_transform (Transform) – A distribution Transform which transforms values into \((-1, 1)\). Defaults to dist_utils.StableTanh().
name (str) – name of the network
- forward(inputs, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
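The numerical issue mentioned in the warning can be seen in a scalar sketch of a tanh-squashed standard Normal. This is only an illustration in pure Python, not ALF's StableTanh or its caching mechanism; the function names are hypothetical. It uses the change-of-variables rule log p_Y(tanh(x)) = log p_X(x) - log(1 - tanh(x)^2): when the pre-squash value x is still available (as with a transform cache) the log-probability is well defined, while recovering x from the squashed action alone can fail once the action saturates.

```python
import math


def normal_log_prob(x):
    """Log density of the standard Normal at x."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def squashed_log_prob_from_latent(x):
    """Log density of y = tanh(x) when the pre-squash value x is known."""
    # Stable form of log(1 - tanh(x)^2) = 2 * (log 2 - |x| - log(1 + e^{-2|x|}))
    ax = abs(x)
    log_det = 2.0 * (math.log(2.0) - ax - math.log1p(math.exp(-2.0 * ax)))
    return normal_log_prob(x) - log_det


def squashed_log_prob_from_action(y):
    """Same quantity, but x must first be recovered by inverting tanh."""
    x = math.atanh(y)  # raises ValueError when |y| == 1
    return squashed_log_prob_from_latent(x)


# For a moderate value the two routes agree.
x = 0.3
gap = abs(squashed_log_prob_from_latent(x)
          - squashed_log_prob_from_action(math.tanh(x)))

# For a large latent, tanh saturates to exactly 1.0 in float64 and the
# inverse is no longer computable from the action alone.
saturated = math.tanh(30.0)
try:
    squashed_log_prob_from_action(saturated)
    inverse_failed = False
except ValueError:
    inverse_failed = True

latent_route = squashed_log_prob_from_latent(30.0)  # still finite
print(gap, saturated, inverse_failed, latent_route)
```

Detaching an action and recomputing its probability corresponds to the second route, which is why algorithms that reuse the sampled latent are less exposed to this problem.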
- class ParallelActorDistributionNetwork(actor_network, n, name='ParallelActorDistributionNetwork')[source]#
Bases:
alf.networks.network.Network
Perform n actor distribution computations in parallel.
It creates a parallelized version of actor_network.
- Parameters
actor_network (ActorDistributionNetwork) – non-parallelized actor network
n (int) – make n replicas from actor_network with different initialization.
name (str) –
- forward(observation, state=())[source]#
Computes action distribution given a batch of observations.
- Parameters
inputs (tuple) – A tuple of Tensors consistent with input_tensor_spec.
state (tuple) – Empty for API consistency with ActorDistributionRNNNetwork.
- property state_spec#
Return the state spec of the actor network. It is simply the state spec of the encoding network.
- training: bool#
- class UnitNormalActorDistributionNetwork(input_tensor_spec, action_spec, name='UnitNormalActorDistributionNetwork')[source]#
Bases:
alf.networks.network.Network
Outputs a constant unit normal regardless of the inputs.
- Parameters
input_tensor_spec (nested TensorSpec) – the (nested) tensor spec of the input.
state_spec (nested TensorSpec) – the (nested) tensor spec of the state of the network.
name (str) –
- forward(inputs, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
alf.networks.actor_networks#
ActorNetworks
- class ActorNetwork(input_tensor_spec, action_spec, input_preprocessors=None, preprocessing_combiner=None, conv_layer_params=None, fc_layer_params=None, activation=<built-in method relu_ of type object>, squashing_func=<built-in method tanh of type object>, kernel_initializer=None, name='ActorNetwork')[source]#
Bases:
alf.networks.actor_networks.ActorNetworkBase
Creates an instance of ActorNetwork, which maps the inputs to actions (single or nested) through a sequence of deterministic layers.
- Parameters
input_tensor_spec (TensorSpec) – the tensor spec of the input.
action_spec (BoundedTensorSpec) – the tensor spec of the action.
input_preprocessors (nested Network|nn.Module|None) – a nest of input preprocessors, each of which will be applied to the corresponding input. If not None, then it must have the same structure as input_tensor_spec (after reshaping). If any element is None, then it will be treated as math_ops.identity. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional.
fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes.
activation (nn.functional) – activation used for hidden layers. The last layer will not be activated.
squashing_func (Callable) – the activation function used to squash the output to the range \((-1, 1)\). Defaults to tanh.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If none is provided a variance_scaling_initializer with uniform distribution will be used.
name (str) – name of the network
- training: bool#
- class ActorNetworkBase(input_tensor_spec, action_spec, encoding_network_ctor=<class 'alf.networks.encoding_networks.EncodingNetwork'>, squashing_func=<built-in method tanh of type object>, name='ActorNetworkBase', **encoder_kwargs)[source]#
Bases:
alf.networks.network.Network
A base class for ActorNetwork and ActorRNNNetwork.
Can also be used to create customized actor networks by providing different encoding network creators.
- Parameters
input_tensor_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – the tensor spec of the input.
action_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – the tensor spec of the action.
encoding_network_ctor (Callable) – the creator of the encoding network that does the heavy lifting of the actor.
squashing_func – the activation function used to squash the output to the range \((-1, 1)\). Defaults to tanh.
name – name of the network
encoder_kwargs – the extra keyword arguments to the encoding network
- forward(observation, state=())[source]#
Computes action given an observation.
- Parameters
inputs – A tensor consistent with input_tensor_spec
state – empty for API consistency with ActorRNNNetwork
- Returns
action (torch.Tensor): a tensor consistent with action_spec
state: empty
- Return type
tuple
- property state_spec#
Return the state spec of the actor network. It is simply the state spec of the encoding network.
- training: bool#
- class ActorRNNNetwork(input_tensor_spec, action_spec, input_preprocessors=None, preprocessing_combiner=None, conv_layer_params=None, fc_layer_params=None, lstm_hidden_size=100, actor_fc_layer_params=None, activation=<built-in method relu_ of type object>, squashing_func=<built-in method tanh of type object>, kernel_initializer=None, name='ActorRNNNetwork')[source]#
Bases:
alf.networks.actor_networks.ActorNetworkBase
Creates an instance of ActorRNNNetwork, which maps the inputs (observation and states) to actions (single or nested) through a sequence of deterministic layers.
- Parameters
input_tensor_spec (TensorSpec) – the tensor spec of the input.
action_spec (BoundedTensorSpec) – the tensor spec of the action.
input_preprocessors (nested Network|nn.Module|None) – a nest of input preprocessors, each of which will be applied to the corresponding input. If not None, then it must have the same structure as input_tensor_spec (after reshaping). If any element is None, then it will be treated as math_ops.identity. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional.
fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes.
lstm_hidden_size (int or tuple[int]) – the hidden size(s) of the LSTM cell(s). Each size corresponds to a cell. If there are multiple sizes, then lstm cells are stacked.
actor_fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layers that are applied after the lstm cell’s output.
activation (nn.functional) – activation used for hidden layers. The last layer will not be activated.
squashing_func (Callable) – the activation function used to squash the output to the range \((-1, 1)\). Defaults to tanh.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If none is provided a variance_scaling_initializer with uniform distribution will be used.
name (str) – name of the network
- training: bool#
alf.networks.containers#
Various Network containers.
- Branch(*modules, input_tensor_spec=None, name='Branch', **named_modules)[source]#
Apply multiple networks on the same input.
Example:
    net = Branch((module1, module2))
    y, new_state = net(x, state)
is equivalent to the following:
    y0, new_state0 = module1(x, state[0])
    y1, new_state1 = module2(x, state[1])
    y = (y0, y1)
    new_state = (new_state0, new_state1)
- Parameters
modules (nested nn.Module | Callable) – a nest of torch.nn.Module, alf.nn.Network or Callable. Note that Branch(module_a, module_b) is equivalent to Branch((module_a, module_b))
named_modules (nn.Module | Callable) – a simpler way of specifying a dict of modules. Branch(a=module_a, b=module_b) is equivalent to Branch(dict(a=module_a, b=module_b))
input_tensor_spec (nested TensorSpec) – must be provided if it cannot be inferred from any one of modules
name (str) –
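The Branch semantics can be mimicked with plain Python callables that follow the (input, state) -> (output, new_state) convention. The `branch` helper and the two toy modules below are hypothetical and only mirror the tuple case of the equivalence stated above; they are not the ALF implementation.

```python
def branch(*modules):
    """Apply every module to the same input, each with its own state slot."""
    def forward(x, state):
        results = [module(x, s) for module, s in zip(modules, state)]
        y = tuple(output for output, _ in results)
        new_state = tuple(s for _, s in results)
        return y, new_state
    return forward


def double(x, state):
    """Stateless toy module: state is passed through unchanged."""
    return 2 * x, state


def running_sum(x, state):
    """Stateful toy module: state holds the sum of all inputs so far."""
    total = state + x
    return total, total


net = branch(double, running_sum)
y, new_state = net(3, ((), 10))
print(y, new_state)
```

Both modules see the same input 3, while each reads and writes only its own element of the state tuple.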
- class Echo(block, input_tensor_spec=None)[source]#
Bases:
alf.networks.network.Network
Echo network.
Echo network uses part of the output of block of current step as part of the input of block for the next step. In particular, if the input of block is a dictionary, it should contain two keys ‘input’ and ‘echo’, and ‘echo’ will be taken from the output of the previous step. If the input of block is a tuple, the second input will be taken from the output of the previous step. If the output is a dictionary, it should contain two keys ‘output’ and ‘echo’, and ‘echo’ will be used as the input for the next step. If the output is a tuple, the second output will be used as the input for the next step.
Note that block itself can be a recurrent network with state.
Examples:
    echo = Echo(block)
    output, state = echo(real_input, state)
is equivalent to the following if the input and output of block are dicts:
    block_state, echo_input = state
    block_output, block_state = block(dict(input=real_input, echo=echo_input), block_state)
    output = block_output['output']
    echo_output = block_output['echo']
    state = (block_state, echo_output)
and is equivalent to the following if the input and output of block are tuples:
    block_state, echo_input = state
    block_output, block_state = block((real_input, echo_input), block_state)
    output, echo_output = block_output
    state = (block_state, echo_output)
- Parameters
block (Network) – the module for performing the actual computation
input_tensor_spec (nested TensorSpec) – If provided, it must match the block.input_tensor_spec[0] or block.input_tensor_spec['input']
- forward(input, state)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- make_parallel(n)[source]#
Create a parallelized version of this network.
- Parameters
n (int) – the number of copies
- Returns
the parallelized version of this network
- training: bool#
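A minimal pure-Python sketch of the tuple form of the Echo wiring is given below. The `echo` wrapper and the `accumulate` block are hypothetical stand-ins that follow the (input, state) -> (output, new_state) convention; they only illustrate how the second output of one step becomes the second input of the next.

```python
def echo(block):
    """Feed the second output of block back as its second input next step."""
    def forward(real_input, state):
        block_state, echo_input = state
        block_output, block_state = block((real_input, echo_input), block_state)
        output, echo_output = block_output
        return output, (block_state, echo_output)
    return forward


def accumulate(inputs, state):
    """Toy block: adds the echoed value to the new input and echoes the sum."""
    x, previous_total = inputs
    total = x + previous_total
    return (total, total), state  # (output, echo), unchanged block state


net = echo(accumulate)
state = ((), 0)  # (block state, initial echo)
outputs = []
for x in [1, 2, 3]:
    out, state = net(x, state)
    outputs.append(out)
print(outputs, state)
```

Even though `accumulate` keeps no state of its own, the wrapped network behaves like a running sum because the echo carries information between steps.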
- class Parallel(modules, input_tensor_spec=None, name='Parallel')[source]#
Bases:
alf.networks.network.Network
Apply each Network in the nest of Network to the corresponding input.
Example:
    net = Parallel((module1, module2))
    y, new_state = net(x, state)
is equivalent to the following:
    y0, new_state0 = module1(x[0], state[0])
    y1, new_state1 = module2(x[1], state[1])
    y = (y0, y1)
    new_state = (new_state0, new_state1)
- Parameters
modules (nested nn.Module) – a nest of torch.nn.Module or alf.nn.Network.
input_tensor_spec (nested TensorSpec) – must be provided if it cannot be inferred from modules.
name (str) –
- forward(inputs, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- make_parallel(n)[source]#
Create a parallelized version of this network.
- Parameters
n (int) – the number of copies
- Returns
the parallelized version of this network
- property networks#
- training: bool#
- Sequential(*modules, output='', input_tensor_spec=None, name='Sequential', **named_modules)[source]#
Network composed of a sequence of torch.nn.Module or alf.nn.Network.
All the modules provided through modules and named_modules are calculated sequentially in the same order as they appear in the call to Sequential. Typically, each module takes the result of the previous module as its input (or the input to the Sequential if it is the first module), and the result of the last module is the output of the Sequential. But we also allow more flexibility, as shown in example 2.
Example 1:
    net = Sequential(module1, module2)
    y, new_state = net(x, state)
is equivalent to the following:
    z, new_state1 = module1(x, state[0])
    y, new_state2 = module2(z, state[1])
    new_state = (new_state1, new_state2)
Example 2:
    net = Sequential(
        module1, a=module2, b=(('input', 'a'), module3), output=('a', 'b'))
    output, new_state = net(input, state)
is equivalent to the following:
    _, new_state1 = module1(input, state[0])
    a, new_state2 = module2(_, state[1])
    b, new_state3 = module3((input, a), state[2])
    new_state = (new_state1, new_state2, new_state3)
    output = (a, b)
- Parameters
modules (Callable | (nested str, Callable)) – The Callable can be a torch.nn.Module, alf.nn.Network or plain Callable. Optionally, their inputs can be specified by the first element of the tuple. If input is not provided, it is assumed to be the result of the previous module (or input to this Sequential for the first module). If input is provided, it should be a nested str. It will be used to retrieve results from the dictionary of the current named_results. For modules specified by modules, because no named_modules has been invoked, named_results is {'input': input}.
named_modules (Callable | (nested str, Callable)) – The Callable can be a torch.nn.Module, alf.nn.Network or plain Callable. Optionally, their inputs can be specified by the first element of the tuple. If input is not provided, it is assumed to be the result of the previous module (or input to this Sequential for the first module). If input is provided, it should be a nested str. It will be used to retrieve results from the dictionary of the current named_results. named_results is updated once the result of a named module is calculated.
output (nested str) – if not provided, the result from the last module will be used as output. Otherwise, it will be used to retrieve results from named_results after the results of all modules have been calculated.
input_tensor_spec (TensorSpec) – the tensor spec of the input. It must be specified if it cannot be inferred from modules[0].
name (str) –
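The named_results bookkeeping described for Sequential can be sketched in pure Python as follows. The `sequential` and `resolve` helpers and the three toy modules are hypothetical; they reproduce only the routing logic of example 2 (tuple-shaped nested strs, no specs, no real tensors) and are not the ALF implementation.

```python
def resolve(spec, named_results):
    """Look up a nested str (a str or a tuple of nested strs)."""
    if isinstance(spec, str):
        return named_results[spec]
    return tuple(resolve(s, named_results) for s in spec)


def sequential(*modules, output='', **named_modules):
    # Unnamed modules come first, then named ones in call order.
    steps = [(None, m) for m in modules] + list(named_modules.items())

    def forward(x, state):
        named_results = {'input': x}
        result, new_state = x, []
        for (name, entry), module_state in zip(steps, state):
            if isinstance(entry, tuple):       # (nested str, Callable)
                input_spec, module = entry
                module_input = resolve(input_spec, named_results)
            else:                              # plain Callable
                module, module_input = entry, result
            result, module_state = module(module_input, module_state)
            new_state.append(module_state)
            if name is not None:
                named_results[name] = result   # visible to later modules
        final = resolve(output, named_results) if output else result
        return final, tuple(new_state)
    return forward


def add_one(x, state):
    return x + 1, state


def times_ten(x, state):
    return x * 10, state


def add_pair(pair, state):
    return pair[0] + pair[1], state


net = sequential(add_one, a=times_ten, b=(('input', 'a'), add_pair),
                 output=('a', 'b'))
out, new_state = net(1, ((), (), ()))
print(out, new_state)
```

With input 1, the unnamed module yields 2, `a` becomes 20, and `b` combines the original input with `a` to give 21, so the selected output is the pair (a, b).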
alf.networks.critic_networks#
CriticNetworks
- class CriticNetwork(input_tensor_spec, output_tensor_spec=TensorSpec(shape=(), dtype=torch.float32), observation_input_processors=None, observation_preprocessing_combiner=None, observation_conv_layer_params=None, observation_fc_layer_params=None, action_input_processors=None, action_preprocessing_combiner=None, action_fc_layer_params=None, observation_action_combiner=None, joint_fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, use_fc_bn=False, use_naive_parallel_network=False, name='CriticNetwork')[source]#
Bases:
alf.networks.encoding_networks.EncodingNetwork
Creates an instance of CriticNetwork for estimating action-value of continuous or discrete actions. The action-value is defined as the expected return starting from the given input observation and taking the given action. This module takes an observation and an action as input and outputs an action-value tensor with the shape of [batch_size].
The network takes a tuple of (observation, action) as input to compute the action-value given an observation.
- Parameters
input_tensor_spec – A tuple of TensorSpecs (observation_spec, action_spec) representing the inputs.
output_tensor_spec (TensorSpec) – spec for the output
observation_input_processors (nested Network|nn.Module|None) – a nest of input preprocessors, each of which will be applied to the corresponding observation input.
observation_preprocessing_combiner (NestCombiner) – preprocessing called on complex observation inputs.
observation_conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional.
observation_fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes for observations.
action_input_processors (nested Network|nn.Module|None) – a nest of input preprocessors, each of which will be applied to the corresponding action input.
action_preprocessing_combiner (NestCombiner) – preprocessing called to combine complex action inputs.
action_fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes for actions.
observation_action_combiner (NestCombiner) – combiner class for fusing the observation and action. If None, NestConcat will be used.
joint_fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes for the FC layers after merging observations and actions.
activation (nn.functional) – activation used for hidden layers. The last layer will not be activated.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If none is provided a variance_scaling_initializer with uniform distribution will be used.
use_fc_bn (bool) – whether to use Batch Normalization for the internal FC layers (i.e. FC layers besides the last one).
use_naive_parallel_network (bool) – if True, will use NaiveParallelNetwork when make_parallel is called. This might be useful in cases when the NaiveParallelNetwork has an advantage in terms of speed over ParallelNetwork. You have to test to see which way is faster for your particular situation.
name (str) –
- make_parallel(n)[source]#
Create a parallel critic network using n replicas of self. The initialized network parameters will be different. If use_naive_parallel_network is True, use NaiveParallelNetwork to create the parallel network.
- training: bool#
- class CriticRNNNetwork(input_tensor_spec, output_tensor_spec=TensorSpec(shape=(), dtype=torch.float32), observation_input_processors=None, observation_preprocessing_combiner=None, observation_conv_layer_params=None, observation_fc_layer_params=None, action_input_processors=None, action_preprocessing_combiner=None, action_fc_layer_params=None, joint_fc_layer_params=None, lstm_hidden_size=100, critic_fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, name='CriticRNNNetwork')[source]#
Bases:
alf.networks.encoding_networks.LSTMEncodingNetwork
Creates an instance of CriticRNNNetwork for estimating action-value of continuous or discrete actions. The action-value is defined as the expected return starting from the given inputs (observation and state) and taking the given action. It takes observation and state as input and outputs an action-value tensor with the shape of [batch_size].
- Parameters
input_tensor_spec – A tuple of TensorSpecs (observation_spec, action_spec) representing the inputs.
output_tensor_spec (TensorSpec) – spec for the output
observation_input_processors (nested Network|nn.Module|None) – a nest of input preprocessors, each of which will be applied to the corresponding observation input.
observation_preprocessing_combiner (NestCombiner) – preprocessing called on complex observation inputs.
observation_conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional.
observation_fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes for observations.
action_input_processors (nested Network|nn.Module|None) – a nest of input preprocessors, each of which will be applied to the corresponding action input.
action_preprocessing_combiner (NestCombiner) – preprocessing called to combine complex action inputs.
action_fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes for actions.
joint_fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes for the FC layers after merging observations and actions.
lstm_hidden_size (int or tuple[int]) – the hidden size(s) of the LSTM cell(s). Each size corresponds to a cell. If there are multiple sizes, then lstm cells are stacked.
critic_fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layers that are applied after the lstm cell’s output.
activation (nn.functional) – activation used for hidden layers. The last layer will not be activated.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If none is provided a variance_scaling_initializer with uniform distribution will be used.
name (str) –
- make_parallel(n)[source]#
Create a parallel critic RNN network using n replicas of self. The initialized network parameters will be different. If use_naive_parallel_network is True, use NaiveParallelNetwork to create the parallel network.
- training: bool#
alf.networks.dynamics_networks#
DynamicsNetwork
- class DynamicsNetwork(input_tensor_spec, output_tensor_spec, joint_fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, prob=False, continuous_projection_net_ctor=<class 'alf.networks.projection_networks.NormalProjectionNetwork'>, name='DynamicsNetwork')[source]#
Bases:
alf.networks.network.Network
Create an instance of DynamicsNetwork.
Creates an instance of DynamicsNetwork for predicting the next observation given current observation and action.
- Parameters
input_tensor_spec – A tuple of TensorSpecs (observation_spec, action_spec) representing the inputs.
joint_fc_layer_params (tuple[int]) – a tuple of integers representing the sizes of the hidden FC layers applied after merging observations and actions.
activation (nn.functional) – activation used for hidden layers. The last layer will not be activated.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If none is provided a variance_scaling_initializer with uniform distribution will be used.
prob (bool) – If True, use the probabilistic mode of the network; otherwise, use the deterministic mode of the network.
continuous_projection_net_ctor (ProjectionNetwork) – constructor that generates a continuous projection network that outputs a distribution.
name (str) –
- forward(inputs, state=())[source]#
Computes prediction given inputs.
- Parameters
inputs – A tuple of Tensors consistent with input_tensor_spec
state – empty for API consistency
- Returns
out: a tensor of the size [B, n, d] if self._prob is False, and a distribution if self._prob is True.
state: empty
- make_parallel(n)[source]#
Create a ParallelCriticNetwork using n replicas of self. The initialized network parameters will be different.
- training: bool#
- class ParallelDynamicsNetwork(dynamics_network, n, name='ParallelDynamicsNetwork')[source]#
Bases:
alf.networks.network.Network
Create n DynamicsNetworks in parallel.
It creates a parallelized version of DynamicsNetwork.
- Parameters
dynamics_network (DynamicsNetwork) – non-parallelized dynamics network
n (int) – make n replicas from dynamics_network with different initializations.
name (str) –
- forward(inputs, state=())[source]#
Computes prediction given inputs.
- Parameters
inputs – A tuple of Tensors consistent with input_tensor_spec
state – empty for API consistency
- Returns
out: a tensor of the size [B, n, d] if self._prob is False, and a distribution if self._prob is True.
state: empty
- training: bool#
alf.networks.encoding_networks#
- class AutoShapeImageDeconvNetwork(input_size, transconv_layer_params, output_shape, start_decoding_channels, preprocess_fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, output_activation=<built-in method tanh of type object>, name='AutoShapeImageDeconvNetwork')[source]#
Bases:
alf.networks.containers._Sequential
A general template class for creating image deconv (transposed convolutional) networks with auto-shape inference (thus named AutoShapeImageDeconvNetwork).
Auto-shape inference: instead of specifying an initial start shape for image deconv, this class only needs the desired output shape of the image. It automatically calculates the shape at which to start decoding based on the specified transconv_layer_params and uses an FC layer to map the input to that start shape.
- Parameters
input_size (int) – the size of the input latent vector
transconv_layer_params (tuple[tuple]) – a non-empty tuple of tuples (num_filters, kernel_size, strides, padding), where padding is optional.
output_shape (tuple) – the complete output shape, given as output_shape = (c, h, w).
start_decoding_channels (int) – the initial number of channels we’d like to have for the feature map. Note that we always first project an input latent vector into a vector of an appropriate length so that it can be reshaped into (start_decoding_channels, start_decoding_height, start_decoding_width), where start_decoding_height and start_decoding_width are automatically inferred based on the specified output_shape and transconv_layer_params.
preprocess_fc_layer_params (tuple[int]) – a tuple of fc layer units. These fc layers are used for preprocessing the latent vector before transposed convolutions.
activation (nn.functional) – activation for hidden layers
kernel_initializer (Callable) – initializer for all the layers.
output_activation (nn.functional) – activation for the output layer. Usually our image inputs are normalized to [0, 1] or [-1, 1], so this function should be torch.sigmoid or torch.tanh.
name (str) –
- training: bool#
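The auto-shape inference described for AutoShapeImageDeconvNetwork can be illustrated by inverting the standard transposed-convolution size relation H = (H1 - 1) * strides + HF - 2P (output padding 0) layer by layer, starting from the last layer. The following is a minimal sketch under that assumption; `infer_start_size` is a hypothetical helper for one spatial dimension and is not ALF's actual implementation, which may handle non-divisible cases differently.

```python
def infer_start_size(output_size, transconv_layer_params):
    """Infer the spatial size at which decoding must start (one dimension).

    Inverts H = (H1 - 1) * strides + HF - 2P layer by layer, last layer first.
    Each layer is (num_filters, kernel_size, strides[, padding]).
    """
    size = output_size
    for params in reversed(transconv_layer_params):
        kernel_size, strides = params[1], params[2]
        padding = params[3] if len(params) > 3 else 0
        numerator = size + 2 * padding - kernel_size
        # The inverse only exists when the division is exact.
        if numerator < 0 or numerator % strides != 0:
            raise ValueError(
                f"size {size} is not reachable with kernel={kernel_size}, "
                f"strides={strides}, padding={padding}")
        size = numerator // strides + 1
    return size


# Hypothetical layer parameters, chosen only for illustration.
layers = ((32, 4, 2, 1), (3, 4, 2, 1))
print(infer_start_size(32, layers))  # 8
```

With these illustrative layers, a 32x32 output requires an 8x8 start map: 8 -> 16 -> 32.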
- class EncodingNetwork(input_tensor_spec, output_tensor_spec=None, input_preprocessors=None, preprocessing_combiner=None, conv_layer_params=None, fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, use_fc_bn=False, last_layer_size=None, last_activation=None, last_kernel_initializer=None, last_use_fc_bn=False, name='EncodingNetwork')[source]#
Bases:
alf.networks.containers._Sequential
Feed-forward network with CNN and FC layers which allows the last layer to have different settings from the other layers.
- Parameters
input_tensor_spec (nested TensorSpec) – the (nested) tensor spec of the input. If nested, then preprocessing_combiner must not be None.
output_tensor_spec (None|TensorSpec) – spec for the output. If None, the output tensor spec will be assumed to be TensorSpec((output_size, )), where output_size is inferred from network output. Otherwise, the output tensor spec will be output_tensor_spec and the network output will be reshaped according to output_tensor_spec. Note that output_tensor_spec is only used for reshaping the network outputs for interpretation purposes and is not used for specifying any network layers.
input_preprocessors (nested Network|nn.Module|None) – a nest of preprocessors, each of which will be applied to the corresponding input. If not None, then it must have the same structure as input_tensor_spec. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional.
fc_layer_params (tuple[int]) – a tuple of integers representing FC layer sizes.
activation (nn.functional) – activation used for all the layers but the last layer.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If None, a variance_scaling_initializer will be used.
use_fc_bn (bool) – whether to use Batch Normalization for fc layers.
last_layer_size (int) – an optional size of an additional layer appended at the very end. Note that if last_activation is specified, last_layer_size has to be specified explicitly.
last_activation (nn.functional) – activation function of the additional layer specified by last_layer_size. Note that if last_layer_size is not None, last_activation has to be specified explicitly.
last_use_fc_bn (bool) – whether to use Batch Normalization for the last fc layer.
last_kernel_initializer (Callable) – initializer for the additional layer specified by last_layer_size. If None, it will be the same as kernel_initializer. If last_layer_size is None, last_kernel_initializer will not be used.
name (str) –
- make_parallel(n, allow_non_parallel_input=False)[source]#
Make a parallelized version of module.
A parallel network has n copies of network with the same structure but different independently initialized parameters. The parallel network can process a batch of the data with shape [batch_size, n, …] using n networks with same structure.
TODO: remove allow_non_parallel_input so that the parallel network does not accept non-parallel input. This will make the logic more transparent.
- Parameters
n (int) – the number of copies
allow_non_parallel_input (bool) – if True, the returned network will also accept non-parallel input with shape [batch_size, …]. In this case, the network will check whether the input is parallel input. If not, the input will be automatically replicated n times at the beginning.
- Returns
the parallelized network.
- training: bool#
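The shape handling described for make_parallel can be sketched with plain shape tuples. This is only an illustration of the documented behavior; `to_parallel_shape` is a hypothetical helper, and the way ALF actually detects parallel input is not stated here and may differ.

```python
def to_parallel_shape(shape, input_shape, n, allow_non_parallel_input=False):
    """Return the batch shape a parallel network of n replicas would process.

    shape: shape of the incoming batch, including the batch dimension.
    input_shape: per-sample input shape of a single (non-parallel) network.
    """
    shape, input_shape = tuple(shape), tuple(input_shape)
    batch_size = shape[0]
    if shape == (batch_size, n) + input_shape:
        return shape  # already parallel: [batch_size, n, ...]
    if allow_non_parallel_input and shape == (batch_size,) + input_shape:
        # Non-parallel input: replicate it n times along a new dimension.
        return (batch_size, n) + input_shape
    raise ValueError(f"unexpected input shape {shape}")


print(to_parallel_shape((16, 3, 10), (10,), n=3))  # (16, 3, 10)
print(to_parallel_shape((16, 10), (10,), n=3,
                        allow_non_parallel_input=True))  # (16, 3, 10)
```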
- class ImageDecodingNetwork(input_size, transconv_layer_params, start_decoding_size, start_decoding_channels, same_padding=False, preprocess_fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, output_activation=<built-in method tanh of type object>, name='ImageDecodingNetwork')[source]#
Bases:
alf.networks.containers._Sequential
A general template class for creating transposed convolutional decoding networks.
Initialize the layers for decoding a latent vector into an image. Currently there seems to be no need for this class to handle nested inputs; if necessary, extend the argument list to support it in the future.
How to calculate the output size: https://pytorch.org/docs/stable/generated/torch.nn.ConvTranspose2d.html:
H = (H1-1) * strides + HF - 2P + OP
where H = output size, H1 = input size, HF = size of kernel, P = padding, OP = output_padding (currently hardcoded to be 0 for this class).
Regarding padding: in the previous TF version, we have two padding modes: valid and same. For the former, we always have no padding (P=0). The latter is also called half padding: P=(HF-1)//2, so that when strides=1 and HF is an odd number the output has the same size as the input. Currently, PyTorch doesn’t support different left and right paddings and P is always (HF-1)//2, so if HF is an even number, the output size will increase by 1 when strides=1.
- Parameters
input_size (int) – the size of the input latent vector
transconv_layer_params (tuple[tuple]) – a non-empty tuple of tuples (num_filters, kernel_size, strides, padding), where padding is optional.
start_decoding_size (int or tuple) – the initial height and width we’d like to have for the feature map
start_decoding_channels (int) – the initial number of channels we’d like to have for the feature map. Note that we always first project an input latent vector into a vector of an appropriate length so that it can be reshaped into (start_decoding_channels, start_decoding_height, start_decoding_width).
same_padding (bool) – similar to TF’s conv2d same padding mode. If True, the user provided paddings in transconv_layer_params will be replaced by automatically calculated ones; if False, it corresponds to TF’s valid padding mode (the user can still provide custom paddings though).
preprocess_fc_layer_params (tuple[int]) – a tuple of fc layer units. These fc layers are used for preprocessing the latent vector before transposed convolutions.
activation (nn.functional) – activation for hidden layers
kernel_initializer (Callable) – initializer for all the layers.
output_activation (nn.functional) – activation for the output layer. Usually our image inputs are normalized to [0, 1] or [-1, 1], so this function should be torch.sigmoid or torch.tanh.
name (str) –
- training: bool#
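The output-size formula given for ImageDecodingNetwork, and the remark about even kernel sizes under half padding, can be checked numerically. This is a small sketch for one spatial dimension; the helper names `transconv_output_size` and `half_padding` are hypothetical and are not part of ALF.

```python
def transconv_output_size(in_size, kernel_size, strides, padding=0,
                          output_padding=0):
    """H = (H1 - 1) * strides + HF - 2P + OP for one spatial dimension."""
    return (in_size - 1) * strides + kernel_size - 2 * padding + output_padding


def half_padding(kernel_size):
    """Half padding P = (HF - 1) // 2."""
    return (kernel_size - 1) // 2


# Odd kernel, strides=1: the size is preserved.
print(transconv_output_size(10, 3, 1, half_padding(3)))  # 10
# Even kernel, strides=1: the size grows by one.
print(transconv_output_size(10, 4, 1, half_padding(4)))  # 11
# No padding ("valid" mode), strides=2.
print(transconv_output_size(10, 4, 2))  # 22
```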
- class ImageDecodingNetworkV2(input_size, upsample_conv_layer_params, start_decoding_size, start_decoding_channels, preprocess_fc_layer_params=None, upsampling_mode='nearest', same_padding=False, activation=<built-in method relu_ of type object>, kernel_initializer=None, output_activation=<built-in method tanh of type object>, name='ImageDecodingNetworkV2')[source]#
Bases:
alf.networks.containers._Sequential
Image decoding using upsampling+convolution.
Unlike ImageDecodingNetwork, which uses transposed convolution to transform a smaller input to a larger image output, this class uses upsampling followed by convolution layers. The idea is to let the conv layers refine the upsampling (e.g., nearest neighbor, bilinear, etc.) results.
The difference between transposed conv and upsampling+conv can be found in this article: https://distill.pub/2016/deconv-checkerboard/. In short, upsampling+conv might help reduce checkerboard artifacts that are common in the outputs of transposed convolutions.
An example network of upsampling+conv for decoding images:
    net = ImageDecodingNetworkV2(
        input_size=100,
        start_decoding_size=10,
        start_decoding_channels=8,
        same_padding=True,
        upsample_conv_layer_params=(
            2, (16, 3, 1), (32, 3, 1), 2, (64, 3, 1), (3, 3, 1)))
    # The image shape: (8,10,10) -> (8,20,20) -> (16,20,20) -> (32,20,20)
    # -> (32,40,40) -> (64,40,40) -> (3,40,40)
- Parameters
input_size (int) – the size of the input latent vector
upsample_conv_layer_params (Tuple[Union[int, Tuple[int]]]) – a tuple of ints or tuples. If the element is an int, it represents the scaling factor for a torch.nn.Upsample layer; otherwise it should be a tuple of ints representing conv params (num_filters, kernel_size, strides, padding), where padding is optional.
start_decoding_size (Union[int, Tuple[int]]) – the initial height and width we’d like to have for the feature map.
start_decoding_channels (int) – the initial number of channels we’d like to have for the feature map. Note that we always first project an input latent vector into a vector of an appropriate length so that it can be reshaped into (start_decoding_channels, start_decoding_height, start_decoding_width).
preprocess_fc_layer_params (Optional[Tuple[int]]) – if not None, then the input will be fed to a list of fc layers specified by this argument, before doing deconvolution.
upsampling_mode (str) – the argument for choosing an upsampling algorithm for torch.nn.Upsample.
same_padding (bool) – similar to TF’s conv2d same padding mode. If True, the user provided paddings in transconv_layer_params will be replaced by automatically calculated ones; if False, it corresponds to TF’s valid padding mode (the user can still provide custom paddings though). Please refer to the docstring of ImageEncodingNetwork for definitions of the two padding modes.
activation (Callable) – activation for hidden layers
kernel_initializer (Optional[Callable]) – initializer for all the layers.
output_activation (Callable) – activation for the output layer. Usually our image inputs are normalized to [0, 1] or [-1, 1], so this function should be torch.sigmoid or torch.tanh.
name (str) –
- training: bool#
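The shape progression quoted in the ImageDecodingNetworkV2 example can be reproduced with a short trace. The sketch below assumes square feature maps and half padding P = (kernel_size - 1) // 2 for every conv layer (an assumed reading of same_padding=True); `trace_shapes` is a hypothetical helper, not an ALF function.

```python
def trace_shapes(start_channels, start_size, upsample_conv_layer_params):
    """Return the (C, H, W) shape after the start map and after each layer."""
    channels, size = start_channels, start_size
    shapes = [(channels, size, size)]
    for layer in upsample_conv_layer_params:
        if isinstance(layer, int):
            # An int is an upsampling factor: only the spatial size changes.
            size *= layer
        else:
            # A tuple is (num_filters, kernel_size, strides[, padding]).
            num_filters, kernel_size, strides = layer[:3]
            padding = (kernel_size - 1) // 2  # assumed half padding
            size = (size - kernel_size + 2 * padding) // strides + 1
            channels = num_filters
        shapes.append((channels, size, size))
    return shapes


params = (2, (16, 3, 1), (32, 3, 1), 2, (64, 3, 1), (3, 3, 1))
for shape in trace_shapes(8, 10, params):
    print(shape)
# (8, 10, 10), (8, 20, 20), (16, 20, 20), (32, 20, 20),
# (32, 40, 40), (64, 40, 40), (3, 40, 40)
```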
- class ImageEncodingNetwork(input_channels, input_size, conv_layer_params, same_padding=False, activation=<built-in method relu_ of type object>, kernel_initializer=None, flatten_output=False, name='ImageEncodingNetwork')[source]#
Bases:
alf.networks.containers._Sequential
A general template class for creating convolutional encoding networks.
Initialize the layers for encoding an image into a latent vector. Currently there seems to be no need for this class to handle nested inputs; if necessary, extend the argument list to support it in the future.
How to calculate the output size: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html:
H = (H1 - HF + 2P) // strides + 1
where H = output size, H1 = input size, HF = size of kernel, P = padding.
Regarding padding: in the previous TF version, we have two padding modes: valid and same. For the former, we always have no padding (P=0). The latter is also called “half padding”: P=(HF-1)//2, so that when strides=1 and HF is an odd number the output has the same size as the input. Currently, PyTorch doesn’t support different left and right paddings and P is always (HF-1)//2, so if HF is an even number, the output size will decrease by 1 when strides=1.
- Parameters
input_channels (int) – number of channels in the input image
input_size (int or tuple) – the input image size (height, width)
conv_layer_params (tuple[tuple]) – a non-empty tuple of tuples (num_filters, kernel_size, strides, padding), where padding is optional
same_padding (bool) – similar to TF’s conv2d same padding mode. If True, the user provided paddings in conv_layer_params will be replaced by automatically calculated ones; if False, it corresponds to TF’s valid padding mode (the user can still provide custom paddings though)
activation (torch.nn.functional) – activation for all the layers
kernel_initializer (Callable) – initializer for all the layers.
flatten_output (bool) – If False, the output will be an image structure of shape BxCxHxW; otherwise the output will be flattened into a feature of shape BxN.
- training: bool#
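The convolution output-size formula for ImageEncodingNetwork and the effect of flatten_output can be worked through as follows. This is a sketch only; `conv_output_size` and `encoded_shape` are hypothetical helpers, and the layer parameters are illustrative values rather than ALF defaults.

```python
def conv_output_size(in_size, kernel_size, strides, padding=0):
    """H = (H1 - HF + 2P) // strides + 1 for one spatial dimension."""
    return (in_size - kernel_size + 2 * padding) // strides + 1


def encoded_shape(input_channels, input_size, conv_layer_params,
                  flatten_output=False):
    """Per-sample output shape (C, H, W), or (N,) when flattened."""
    if isinstance(input_size, int):
        height, width = input_size, input_size
    else:
        height, width = input_size
    channels = input_channels
    for params in conv_layer_params:
        channels, kernel_size, strides = params[:3]
        padding = params[3] if len(params) > 3 else 0
        height = conv_output_size(height, kernel_size, strides, padding)
        width = conv_output_size(width, kernel_size, strides, padding)
    if flatten_output:
        return (channels * height * width,)
    return (channels, height, width)


layers = ((32, 8, 4), (64, 4, 2), (64, 3, 1))  # illustrative values
print(encoded_shape(3, 84, layers))                       # (64, 7, 7)
print(encoded_shape(3, 84, layers, flatten_output=True))  # (3136,)
# Even kernel with half padding and strides=1 shrinks the size by one.
print(conv_output_size(10, 4, 1, padding=(4 - 1) // 2))   # 9
```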
- class LSTMEncodingNetwork(input_tensor_spec, output_tensor_spec=None, input_preprocessors=None, preprocessing_combiner=None, conv_layer_params=None, pre_fc_layer_params=None, hidden_size=(100, ), lstm_output_layers=-1, post_fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, last_layer_size=None, last_activation=None, last_kernel_initializer=None, name='LSTMEncodingNetwork')[source]#
Bases:
alf.networks.containers._Sequential
LSTM cells followed by an encoding network.
- Parameters
input_tensor_spec (nested TensorSpec) – the (nested) tensor spec of the input. If nested, then preprocessing_combiner must not be None.
output_tensor_spec (None|TensorSpec) – spec for the output. If None, the output tensor spec will be assumed to be TensorSpec((output_size, )), where output_size is inferred from network output. Otherwise, the output tensor spec will be output_tensor_spec and the network output will be reshaped according to output_tensor_spec. Note that output_tensor_spec is only used for reshaping the network outputs for interpretation purposes and is not used for specifying any network layers.
input_preprocessors (nested Network|nn.Module|None) – a nest of input preprocessors, each of which will be applied to the corresponding input. If not None, then it must have the same structure as input_tensor_spec. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional.
pre_fc_layer_params (tuple[int]) – a tuple of integers representing FC layers that are applied before the LSTM cells.
hidden_size (int or tuple[int]) – the hidden size(s) of the lstm cell(s). Each size corresponds to a cell. If there are multiple sizes, then lstm cells are stacked.
lstm_output_layers (None|int|list[int]) – -1 means the output from the last lstm layer. None means all lstm layers.
post_fc_layer_params (tuple[int]) – an optional tuple of integers representing hidden FC layers that are applied after the LSTM cells.
activation (nn.functional) – activation for all the layers but the last layer.
kernel_initializer (Callable) – initializer for all the layers but the last layer.
last_layer_size (int) – an optional size of an additional layer appended at the very end. Note that if last_activation is specified, last_layer_size has to be specified explicitly.
last_activation (nn.functional) – activation function of the additional layer specified by last_layer_size. Note that if last_layer_size is not None, last_activation has to be specified explicitly.
last_kernel_initializer (Callable) – initializer for the additional layer specified by last_layer_size. If None, it will be the same as kernel_initializer. If last_layer_size is None, last_kernel_initializer will not be used.
- make_parallel(n, allow_non_parallel_input=False)[source]#
Make a parallelized version of module.
A parallel network has n copies of network with the same structure but different independently initialized parameters. The parallel network can process a batch of the data with shape [batch_size, n, …] using n networks with same structure.
- Parameters
n (int) – the number of copies
allow_non_parallel_input (bool) – if True, the returned network will also accept non-parallel input with shape [batch_size, …]. In this case, the network will check whether the input is parallel input. If not, the input will be automatically replicated n times at the beginning.
- Returns
the parallelized network.
- training: bool#
- ParallelEncodingNetwork(input_tensor_spec, n, output_tensor_spec=None, input_preprocessors=None, preprocessing_combiner=None, conv_layer_params=None, fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, use_fc_bn=False, last_layer_size=None, last_activation=None, last_kernel_initializer=None, last_use_fc_bn=False, name='ParallelEncodingNetwork')[source]#
Parallel encoding network which effectively runs n individual encoding networks simultaneously.
- Parameters
input_tensor_spec (nested TensorSpec) – the (nested) tensor spec of the input. If nested, then preprocessing_combiner must not be None.
n (int) – number of parallel networks
output_tensor_spec (None|TensorSpec) – spec for the output, excluding the dimension of parallel networks n. If None, the output tensor spec will be assumed to be TensorSpec((n, output_size, )), where output_size is inferred from network output. Otherwise, the output tensor spec will be TensorSpec((n, *output_tensor_spec.shape)) and the network output will be reshaped accordingly. Note that output_tensor_spec is only used for reshaping the network outputs for interpretation purposes and is not used for specifying any network layers.
input_preprocessors (None) – must be None.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional.
fc_layer_params (tuple[int]) – a tuple of integers representing FC layer sizes.
activation (nn.functional) – activation used for all the layers but the last layer.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If None, a variance_scaling_initializer will be used.
use_fc_bn (bool) – whether to use Batch Normalization for fc layers.
last_layer_size (int) – an optional size of an additional layer appended at the very end. Note that if last_activation is specified, last_layer_size has to be specified explicitly.
last_activation (nn.functional) – activation function of the additional layer specified by last_layer_size. Note that if last_layer_size is not None, last_activation has to be specified explicitly.
last_kernel_initializer (Callable) – initializer for the additional layer specified by last_layer_size. If None, it will be the same as kernel_initializer. If last_layer_size is None, last_kernel_initializer will not be used.
last_use_fc_bn (bool) – whether to use Batch Normalization for the last fc layer.
name (str) –
- Returns
the parallelized network
- SpatialBroadcastDecodingNetwork(input_size, output_height, conv_layer_params, output_width=None, fc_layer_params=None, activation=<built-in method relu_ of type object>, output_activation=<function identity>, name='SpatialBroadcastDecodingNetwork')[source]#
Implements the spatial broadcast decoder in
In short, given a latent embedding and target output height/width, this decoder first spatially broadcasts the embedding over height*width, appends a uniform xy meshgrid in [-1,1], and applies conv layers.
- Parameters
input_size (int) – the latent embedding size
output_height (int) – the target output image height
conv_layer_params (Tuple[Tuple[int]]) – a tuple of conv layer params after broadcasting
output_width (Optional[int]) – if None, it’s equal to output_height
fc_layer_params (Optional[Tuple[int]]) – a tuple of fc layers applied to the input embedding before broadcasting
activation (Callable) – activation of the intermediate conv layers
output_activation (Callable) – the final activation
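The broadcast-and-meshgrid step described for SpatialBroadcastDecodingNetwork can be sketched with nested lists for a single embedding. This is an illustration only: `spatial_broadcast` is a hypothetical helper, the channel order (embedding channels, then x, then y) is an assumption, and the conv layers that follow are omitted.

```python
def linspace(start, stop, num):
    """Evenly spaced values from start to stop, inclusive."""
    if num == 1:
        return [(start + stop) / 2]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def spatial_broadcast(embedding, height, width=None):
    """Tile an embedding over height*width and append x/y coordinate maps.

    Returns a nested list of shape [len(embedding) + 2][height][width].
    """
    width = height if width is None else width
    xs = linspace(-1.0, 1.0, width)
    ys = linspace(-1.0, 1.0, height)
    # One constant feature map per embedding entry.
    tiled = [[[value] * width for _ in range(height)] for value in embedding]
    x_map = [list(xs) for _ in range(height)]  # varies along the width
    y_map = [[y] * width for y in ys]          # varies along the height
    return tiled + [x_map, y_map]


out = spatial_broadcast([0.5, -2.0], height=3, width=4)
print(len(out), len(out[0]), len(out[0][0]))  # 4 3 4
```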
alf.networks.mdq_critic_networks#
MdqCriticNetworks
- class MdqCriticNetwork(input_tensor_spec, action_qt=None, num_critic_replicas=2, obs_encoding_layer_params=None, pre_encoding_layer_params=None, mid_encoding_layer_params=None, post_encoding_layer_params=None, free_form_fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, debug_summaries=False, name='MdqCriticNetwork')[source]#
Bases:
alf.networks.network.Network
Create an instance of MdqCriticNetwork for estimating action-value of continuous actions and action sampling used in the MDQ algorithm.
Creates an instance of MdqCriticNetwork for estimating action-value of continuous actions and action sampling.
Currently there are two branches of networks:
- free-form branch: a plain MLP for Q-learning
- adv-form branch: an advantage form of network for action generation. It is trained by a target from the free-form net.
The adv-form branch has the following structures for flexibility:
obs -> [obs_encoding_net] -> encoded_obs
encoded_obs, action -> [pre_encoding_nets] -> [mid_shared_encoding_nets] -> [post_encoding_nets] -> outputs
where the pre_encoding_nets and post_encoding_nets do not share parameters across action dimensions while mid_shared_encoding_nets shares parameters across action dimensions. If the encoding_layer_params for a sub-net is None, that sub-net is effectively neglected.
Furthermore, to enable parallel computation across action dimensions in the case of value computation, we have both parallel and individual versions for the nets without parameter sharing. For example, for post_encoding_nets, we also have post_encoding_parallel_net, which is essentially the equivalent form of post_encoding_nets but supports parallel forwarding. The parameters of the two versions are synced. The partial actions (a[0:i]) are zero-padded for both parallel and individual networks to enable parallel computation.
For conciseness purpose, the following notations will be used when convenient:
B: batch size
d: dimensionality of feature
n: number of network replica
action_dim: the dimensionality of actions
action_bin: number of discrete bins for each action dim
- Parameters
input_tensor_spec – A tuple of TensorSpecs (observation_spec, action_spec) representing the inputs.
action_qt (ActionQuantizer) – action quantization module
num_critic_replicas (int) – number of critic networks
obs_encoding_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes for encoding observations.
pre_encoding_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes for encoding concatenated [encoded_observation, actions]. Parameters are not shared across action dimensions
mid_encoding_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes for further encoding the outputs from pre_encoding_net. The parameters are shared across action dimensions.
post_encoding_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes for further encoding the outputs from mid_encoding_net. The parameters are not shared across action dimensions.
free_form_fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes for Q-learning. We refer to it as the free form to differentiate it from the mdq-form of network, which is structured.
activation (nn.functional) – activation used for hidden layers. The last layer will not be activated.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If none is provided a variance_scaling_initializer with uniform distribution will be used.
name (str) –
- forward(inputs, alpha, state=(), free_form=False)[source]#
Computes action-value given an observation.
- Parameters
inputs – A tuple of Tensors consistent with input_tensor_spec
alpha – the temperature used for the advantage computation
state – empty for API consistency
free_form (bool) – use the free-form branch for computation if True; default value is False
- Returns
Q_values (torch.Tensor): if free_form is True, its shape is [B, n]; if free_form is False, its shape is [B, n, action_dim].
state: empty
- get_action(inputs, alpha, greedy)[source]#
Sample action from the distribution induced by the mdq-net.
- Parameters
inputs – A tuple of Tensors consistent with input_tensor_spec
alpha – the temperature used for the advantage computation
greedy (bool) – If True, do greedy sampling by taking the mode of the distribution. If False, do direct sampling from the distribution.
- Returns
actions (torch.Tensor): a tensor of the shape [B, n, action_dim].
log_pi_per_dim (torch.Tensor): a tensor of the shape [B, n, action_dim] representing the log_pi for each dimension of the sampled multi-dimensional action.
- training: bool#
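The zero-padding of partial actions a[0:i] mentioned in the MdqCriticNetwork description can be illustrated for a single action vector. This is a sketch of the padding idea only; `zero_padded_partial_actions` is a hypothetical helper, and batching and the encoding networks themselves are left out.

```python
def zero_padded_partial_actions(action):
    """For each action dimension i, return a[0:i] zero-padded to full length.

    Row i contains the first i action values followed by zeros, so every
    row has the same length and all rows can be processed in parallel.
    """
    action_dim = len(action)
    return [
        list(action[:i]) + [0.0] * (action_dim - i)
        for i in range(action_dim)
    ]


for row in zero_padded_partial_actions([0.3, -0.7, 0.9]):
    print(row)
# [0.0, 0.0, 0.0]
# [0.3, 0.0, 0.0]
# [0.3, -0.7, 0.0]
```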
alf.networks.memory#
Various memory classes.
Currently, all the memory classes implemented here only support memory within one episode, which means that the memory is reset at the beginning of an episode.
- class FIFOMemory(dim, size, name='FIFOMemory')[source]#
Bases:
alf.networks.memory.Memory
A simple FIFO memory.
When new memory slots are written, the oldest memory slots are removed.
- Parameters
dim (int) – dimension of memory content
size (int) – number of memory slots
- build(batch_size)[source]#
Build the memory for batch_size.
User does not need to call this explicitly. read and write will automatically call this if the memory has not been built yet.
Note: Subsequent write and read must match this batch_size.
- Parameters
batch_size (int) – batch size of the model.
- from_states(states)[source]#
Restore the memory from states.
- Parameters
states (tuple of Tensor) – It should be obtained from states.
- mask()[source]#
Get the mask for the stored memory.
- Returns
shape=(batch_size, size), dtype=torch.bool
- Return type
Tensor
- read(keys)[source]#
Read out memory vectors for the given keys.
- Parameters
keys (Tensor) – shape is (b, dim) or (b, k, dim) where b is batch size, k is the number of read keys, and dim is memory content dimension
- Returns
result (Tensor): shape is same as keys. result[…, i] is the read result for the corresponding key.
- property states#
Get the states of the memory.
- Returns
tuple of memory content and usage tensor.
- Return type
memory states
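The FIFO behavior described for FIFOMemory (the oldest slots are dropped when new ones are written) can be sketched for a single batch element with `collections.deque`. This is not ALF's tensor-based implementation; `FIFOSketch` and its `contents` method are hypothetical, and treating True in the mask as "slot is filled" is an assumption.

```python
from collections import deque


class FIFOSketch:
    """A list-based FIFO memory for one batch element."""

    def __init__(self, dim, size):
        self.dim = dim
        self.size = size
        # deque(maxlen=size) discards the oldest entry once it is full.
        self._slots = deque(maxlen=size)

    def write(self, content):
        if len(content) != self.dim:
            raise ValueError(f"expected a vector of length {self.dim}")
        self._slots.append(list(content))

    def contents(self):
        """Stored vectors, oldest first."""
        return list(self._slots)

    def mask(self):
        """ASSUMPTION: True marks a slot that currently holds content."""
        return [i < len(self._slots) for i in range(self.size)]


memory = FIFOSketch(dim=2, size=3)
for vector in ([1, 1], [2, 2], [3, 3], [4, 4]):
    memory.write(vector)
print(memory.contents())  # [[2, 2], [3, 3], [4, 4]]
```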
- class Memory(dim, size, state_spec, name='Memory')[source]#
Bases:
object
Abstract base class for Memory.
- Parameters
dim (int) – dimension of memory content
size (int) – number of memory slots
state_spec (nested TensorSpec) – the spec for the states
name (str) – name of this memory
- property dim#
Get the dimension of each content vector.
- abstract read(keys)[source]#
Read out memory vectors for the given keys.
- Parameters
keys (Tensor) – shape is (b, dim) or (b, k, dim) where b is batch size, k is the number of read keys, and dim is memory content dimension
- Returns
result (Tensor): shape is same as keys. result[…, i] is the read result for the corresponding key.
- property size#
Get the size of the memory (i.e. the number of memory slots).
- property state_spec#
Get the state tensor specs.
- class MemoryWithUsage(dim, size, snapshot_only=False, normalize=True, scale=None, usage_decay=None, name='MemoryWithUsage')[source]#
Bases:
alf.networks.memory.Memory
Memory with usage indicator.
MemoryWithUsage stores memory in a matrix. During memory write, the memory slot with the smallest usage is replaced by the new memory content. The memory content can be retrieved through an attention mechanism using read.
This implementation follows the one described in arXiv:1803.10760.
See Methods 2.3 of Unsupervised Predictive Memory in a Goal-Directed Agent
- Parameters
dim (int) – dimension of memory content
size (int) – number of memory slots
snapshot_only (bool) – If True, only the last snapshot of the memory is kept instead of a snapshot at every step. If True, gradients cannot be propagated to the writer.
normalize (bool) – If True, use cosine similarity, otherwise use dot product.
scale (None|float) – Scale the similarity by this. If scale is None, a default value is used based on normalize. If normalize is True, scale defaults to 5.0. If normalize is False, scale defaults to 1/sqrt(dim).
usage_decay (None|float) – The usage will be scaled by this factor at every write call. If None, it defaults to 1 - 1 / size.
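The write rule described for MemoryWithUsage (decay all usages, then overwrite the least-used slot) can be sketched in plain Python. This is only an illustration, not ALF's implementation: `write_slot` is a hypothetical helper, and the usage value given to the freshly written slot (1.0) is an assumption, since the text above does not specify it.

```python
def write_slot(memory, usage, content, usage_decay=None):
    """Replace the least-used slot of `memory` with `content`.

    memory: list of `size` content vectors; usage: list of `size` floats.
    """
    size = len(memory)
    if usage_decay is None:
        usage_decay = 1.0 - 1.0 / size  # documented default
    # Every write scales all usages by the decay factor.
    usage = [u * usage_decay for u in usage]
    # The slot with the smallest usage is overwritten.
    slot = min(range(size), key=lambda i: usage[i])
    memory = [list(content) if i == slot else list(m)
              for i, m in enumerate(memory)]
    # Assumption: a freshly written slot starts with usage 1.0.
    usage[slot] = 1.0
    return memory, usage


memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
usage = [0.8, 0.1, 0.5, 0.9]
memory, usage = write_slot(memory, usage, [9.0, 9.0])
```

Here slot 1 has the smallest usage, so it is the one replaced.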
- build(batch_size)[source]#
Build the memory for batch_size.
User does not need to call this explicitly. read and write will automatically call this if the memory has not been built yet.
Note: Subsequent write and read calls must match this batch_size.
- Parameters
batch_size (int) – batch size of the model.
- create_keynet(query_spec, num_keys)[source]#
Create a net which can be used to generate keys.
The created keynet can be used with genkey_and_read.
- Parameters
query_spec (alf.TensorSpec) – the spec for the query
num_keys (int) – the number of keys to be generated.
- Returns
a function which calculates num_keys keys given query.
- Return type
Callable
- from_states(states)[source]#
Restore the memory from states.
- Parameters
states (tuple of Tensor) – It should be obtained from states.
- genkey_and_read(keynet, query, flatten_result=True)[source]#
Generate key and read.
- Parameters
keynet (Callable) – keynet(query) is a tensor of shape (batch_size, num_keys * (dim + 1)). keynet can be created using create_keynet.
query (Tensor) – the query from which the keys are generated
flatten_result (bool) – If True, the result shape will be (batch_size, num_keys * dim), otherwise it is (batch_size, num_keys, dim)
- Returns
If flatten_result is True, its shape is (batch_size, num_keys * dim), otherwise it is (batch_size, num_keys, dim)
- Return type
Tensor
- read(keys, scale=None)[source]#
Read from memory.
Read the memory for the given keys. For each key in keys we will get one result as \(r = \sum_i M_i a_i\) where \(M_i\) is the memory content at location i and \(a_i\) is the attention weight for key at location i. \(a\) is calculated as softmax of a scaled similarity between key and each memory content: \(a_i = \frac{\exp(scale \cdot sim_i)}{\sum_j \exp(scale \cdot sim_j)}\)
- Parameters
keys (Tensor) – shape[-1] is dim. For single key read, the shape is (batch_size, dim). For multiple key read, the shape is (batch_size, k, dim), where k is the number of keys.
scale (None|float|Tensor) – shape is () or keys.shape[:-1]. The cosine similarities are multiplied with scale before softmax is applied. If None, use the scale provided at constructor.
- Returns
- shape is same as keys. result[…, i] is the read
result for the corresponding key.
- Return type
result (Tensor)
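The attention read described above (similarity, scaling, softmax, weighted sum) can be sketched for a single key in plain Python. This is a hedged illustration under the documented defaults, not ALF's tensor code; `read_memory` is a hypothetical name and batching is omitted.

```python
import math


def read_memory(memory, key, scale=None, normalize=True):
    """Return (result, attention) for one key over a list of memory vectors."""
    dim = len(key)
    if scale is None:
        # Documented defaults: 5.0 for cosine similarity, 1/sqrt(dim) otherwise.
        scale = 5.0 if normalize else 1.0 / math.sqrt(dim)

    def similarity(m):
        dot = sum(a * b for a, b in zip(m, key))
        if not normalize:
            return dot
        norm = math.sqrt(sum(a * a for a in m)) * math.sqrt(sum(b * b for b in key))
        return dot / norm if norm > 0 else 0.0

    logits = [scale * similarity(m) for m in memory]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    attention = [e / total for e in exps]
    # r = sum_i a_i * M_i
    result = [sum(a * m[d] for a, m in zip(attention, memory)) for d in range(dim)]
    return result, attention


result, attention = read_memory([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
```

With these two orthogonal memory rows, nearly all attention goes to the row that matches the key.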
- property states#
Get the states of the memory.
- Returns
tuple of memory content and usage tensor.
- Return type
memory states
- property usage#
Get the usage of each memory slot.
- Returns
usage (Tensor) of shape (batch_size, size)
alf.networks.network#
Base extension to torch.nn.Module. Adapted from tf_agents/tf_agents/networks/network.py
- class BatchSquashNetwork(network, batch_dims=2, name='BatchSquashNetwork')[source]#
Bases:
alf.networks.network.Network
Wrap a network so that it works on multiple batch dims. Note that the output spec of this network is the same as that of the wrapped network (it won’t include batch dims).
- Parameters
network (Network) – the network to be wrapped
batch_dims (int) – how many batch dims to squash before forward
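The idea of squashing several leading batch dims into one before calling a network, then restoring them, can be sketched with nested lists. This is only an illustration of the reshaping logic; `batch_squash_call` is a hypothetical helper and the real class operates on tensors.

```python
def batch_squash_call(network, x, batch_dims=2):
    """Apply `network` (which expects one batch dim) to nested-list `x`
    that has `batch_dims` leading batch dims."""
    sizes = []
    flat = [x]
    for _ in range(batch_dims):
        sizes.append(len(flat[0]))
        # Merge the current outer dim into a single flat batch.
        flat = [item for group in flat for item in group]
    out = network(flat)
    # Restore the original batch dims, innermost first.
    for size in reversed(sizes[1:]):
        out = [out[i:i + size] for i in range(0, len(out), size)]
    return out


def sum_features(batch):
    # Stand-in network: maps each feature vector to its sum.
    return [sum(v) for v in batch]


x = [[[1, 2], [3, 4], [5, 6]],
     [[7, 8], [9, 10], [11, 12]]]  # shape [2, 3, 2]
y = batch_squash_call(sum_features, x)  # shape [2, 3]
```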
- forward(x, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class NaiveParallelNetwork(network, n, name=None)[source]#
Bases:
alf.networks.network.Network
Naive implementation of parallel network.
A parallel network has n copies of network with the same structure but different independently initialized parameters.
NaiveParallelNetwork creates n independent networks with the same structure as network and evaluates them separately in a loop during forward().
- Parameters
network (Network) – the parallel network will have n copies of network.
n (int) – n copies of network
name (str) – a string that will be used as the name of the created NaiveParallelNetwork instance. If None, naive_parallel_ followed by the network.name will be used by default.
- forward(inputs, state=())[source]#
Compute the output and the next state.
- Parameters
inputs (nested torch.Tensor) – its shape can be [B, n, ...], or [B, ...]
state (nested torch.Tensor) – its shape must be [B, n, ...]
- Returns
output (nested torch.Tensor): its shape is [B, n, ...]
next_state (nested torch.Tensor): its shape is [B, n, ...]
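The loop-based evaluation described for NaiveParallelNetwork can be sketched as follows. This is a plain-Python illustration with hypothetical names (`make_linear`, `NaiveParallel`, and the `per_copy_inputs` flag are all assumptions); the real class works on tensors and handles state.

```python
import random


def make_linear(in_dim, seed):
    """Hypothetical stand-in network: a linear map with its own random weights."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(in_dim)]

    def net(x):
        return sum(w * xi for w, xi in zip(weights, x))

    return net


class NaiveParallel:
    def __init__(self, make_net, n):
        # n independently initialized copies with the same structure.
        self.nets = [make_net(i) for i in range(n)]

    def forward(self, inputs, per_copy_inputs=False):
        outputs = []
        for sample in inputs:
            row = []
            for i, net in enumerate(self.nets):
                # Either every copy sees the same input, or copy i gets slice i.
                x = sample[i] if per_copy_inputs else sample
                row.append(net(x))
            outputs.append(row)
        return outputs


parallel = NaiveParallel(lambda seed: make_linear(3, seed), n=4)
shared = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]   # shape [B=2, 3]
out_shared = parallel.forward(shared)          # shape [B=2, n=4]
stacked = [[x] * 4 for x in shared]            # shape [B=2, n=4, 3]
out_stacked = parallel.forward(stacked, per_copy_inputs=True)
```

Feeding the same input to every copy gives the same result as stacking it n times along a new second dimension.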
- training: bool#
- class Network(input_tensor_spec, state_spec=(), name='Network')[source]#
Bases:
torch.nn.modules.module.Module
A base class for various networks.
Base extension to nn.Module to simplify copy operations.
- Parameters
input_tensor_spec (nested TensorSpec) – the (nested) tensor spec of the input.
state_spec (nested TensorSpec) – the (nested) tensor spec of the state of the network.
name (str) –
- copy(**kwargs)[source]#
Create a copy of this network or return the current instance.
If self._singleton_instance is True, calling copy() will return self; otherwise it will re-create and return a new Network instance using the original arguments used by the constructor.
NOTE: When re-creating Network, Network layer weights are never copied. This method recreates the Network instance with the same arguments it was initialized with (excepting any new kwargs).
- Parameters
**kwargs – Args to override when recreating this network. Commonly overridden args include ‘name’.
- Returns
- Return type
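The copy and singleton behavior described for Network can be sketched with a small plain-Python class. This is a hypothetical illustration: `RecreatableNet` is not an ALF class, and it records constructor arguments by hand, whereas the text above does not say how Network captures them.

```python
class RecreatableNet:
    def __init__(self, hidden_size, name="RecreatableNet"):
        self.hidden_size = hidden_size
        self.name = name
        # Remember the constructor arguments so the object can be re-created.
        self._saved_args = dict(hidden_size=hidden_size, name=name)
        self._singleton_instance = False
        self.weights = [0.0] * hidden_size  # freshly initialized parameters

    def singleton(self, singleton_instance=True):
        self._singleton_instance = singleton_instance
        return self  # allows cascaded calls

    def copy(self, **kwargs):
        if self._singleton_instance:
            return self
        args = dict(self._saved_args)
        args.update(kwargs)  # new kwargs override the saved ones
        return type(self)(**args)


net = RecreatableNet(4)
net.weights[0] = 1.0
clone = net.copy(name="clone")
same = net.singleton().copy()
```

The clone is built from the saved arguments, so it has the same structure but does not inherit the modified weight.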
- property input_tensor_spec#
Return the input tensor spec BEFORE preprocessings have been applied.
- property is_distribution_output#
Whether the output is Distribution.
- property is_rnn#
Whether this network is a recurrent net.
- make_parallel(n)[source]#
Make a parallelized version of this network.
A parallel network has n copies of network with the same structure but different independently initialized parameters.
By default, it creates NaiveParallelNetwork, which simply makes n copies of this network and uses a loop to call them in forward(). If possible, the subclass should override this to generate an optimized parallel implementation.
- Parameters
n (int) – the number of copies
- Returns
A parallel network
- Return type
- property name#
Name of this Network.
- property output_spec#
Return the spec of the network’s encoding output. By default, we use _test_forward to automatically compute the output and get its spec. For efficiency, subclasses can override this function if the output spec can be obtained easily in other ways.
- property saved_args#
Return the dictionary of the arguments used to construct the network.
- singleton(singleton_instance=True)[source]#
Change the singleton property to the value given by the input argument singleton_instance.
If self._singleton_instance is True, calling copy() will return self; otherwise a re-created Network instance will be returned.
- Parameters
singleton_instance (bool) – a flag indicating whether to turn the self._singleton_instance property on or off.
- Returns
self, which facilitates cascaded calling.
- property state_spec#
Return the state spec to be used by an Algorithm.
Subclass should override this to return the correct state_spec.
- training: bool#
- class NetworkWrapper(module, input_tensor_spec, state_spec=(), name='NetworkWrapper')[source]#
Bases:
alf.networks.network.Network
Wrap module or function as a Network.
- Parameters
module (Callable) – can be called as module(input) to calculate the output. If state_spec != (), then it’s called as module(input, state) and its return should be a tuple of (output, new_state).
input_tensor_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – the tensor spec for the input of module
state_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – the tensor spec for the state of module
name (str) – name of the wrapped network
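The two calling conventions described for the wrapped module can be sketched in plain Python. This is an illustration with hypothetical names: `CallableWrapper` and its `stateful` flag stand in for the role of `state_spec != ()`, and the uniform `(output, state)` return for stateless callables is an assumption.

```python
class CallableWrapper:
    def __init__(self, module, stateful=False):
        self._module = module
        self._stateful = stateful  # plays the role of `state_spec != ()`

    def __call__(self, x, state=()):
        if self._stateful:
            # Stateful modules take (input, state) and return (output, new_state).
            output, new_state = self._module(x, state)
            return output, new_state
        # Stateless modules only see the input; the state passes through.
        return self._module(x), state


double = CallableWrapper(lambda x: 2 * x)
running_sum = CallableWrapper(lambda x, s: (x + s, x + s), stateful=True)
```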
- forward(x, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- make_parallel(n)[source]#
Make a parallelized version of this network.
A parallel network has n copies of network with the same structure but different independently initialized parameters.
By default, it creates NaiveParallelNetwork, which simply makes n copies of this network and uses a loop to call them in forward(). If possible, the subclass should override this to generate an optimized parallel implementation.
- Parameters
n (int) – the number of copies
- Returns
A parallel network
- Return type
- training: bool#
- get_input_tensor_spec(net)[source]#
Get the input_tensor_spec of net if possible
- Parameters
net (nn.Module) –
- Returns
None if input_tensor_spec cannot be inferred from net.
- Return type
nested TensorSpec | None
- wrap_as_network(net, input_tensor_spec)[source]#
Wrap net as a Network if it is not a Network.
- Parameters
net (Network | Callable) –
input_tensor_spec (nested TensorSpec) – if net is not a Network, input_tensor_spec must be provided unless net is a FC. In that case, input_tensor_spec will be inferred from net.input_size if it is not provided.
- Returns
- Return type
- Raises
ValueError – if input_tensor_spec is None and cannot be inferred from net
alf.networks.networks#
Various concrete Networks.
- class AMPWrapper(enabled, net)[source]#
Bases:
alf.networks.network.Network
Wrap a network to run in a given AMP context.
- Parameters
enabled (bool) – whether to enable AMP autocast
net (Network) – the wrapped network
- forward(input, state)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class Delay(input_tensor_spec, delay=1, name='Delay')[source]#
Bases:
alf.networks.network.Network
The output is the input from delay steps ago.
- Parameters
input_tensor_spec (nested TensorSpec) – representing the input
delay (int) – if 0, there is no delay and the output is the same as the input.
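The delay behavior can be sketched as a small state machine in which the state holds the last delay inputs. This is a plain-Python illustration, not ALF's code: `delay_step` is a hypothetical helper, and starting from an all-zero state is an assumption.

```python
def delay_step(x, state):
    """One step of a delay line. `state` is a tuple of the last `delay`
    inputs, oldest first; an empty tuple means no delay."""
    if len(state) == 0:
        return x, state
    output = state[0]              # the input seen `delay` steps ago
    new_state = state[1:] + (x,)   # drop the oldest, append the newest
    return output, new_state


state = (0.0, 0.0)  # delay=2, assumed zero initial state
outputs = []
for x in [1.0, 2.0, 3.0, 4.0]:
    y, state = delay_step(x, state)
    outputs.append(y)
```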
- forward(input, state)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class GRUCell(input_size, hidden_size, name='GRUCell')[source]#
Bases:
alf.networks.network.Network
A gated recurrent unit (GRU) cell.
\[\begin{split}\begin{array}{ll} r = \sigma(W_{ir} x + b_{ir} + W_{hr} h + b_{hr}) \\ z = \sigma(W_{iz} x + b_{iz} + W_{hz} h + b_{hz}) \\ n = \tanh(W_{in} x + b_{in} + r * (W_{hn} h + b_{hn})) \\ h' = (1 - z) * n + z * h \end{array}\end{split}\]
where \(\sigma\) is the sigmoid function, and \(*\) is the Hadamard product.
- Parameters
input_size (int) – The number of expected features in the input x
hidden_size (int) – The number of features in the hidden state h
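The GRU update equations above can be traced with a dependency-free sketch. This is an illustration of the formulas only, not ALF's GRUCell; the parameter dictionary layout (`W_ir`, `b_ir`, ...) and the helper names are assumptions chosen to mirror the symbols.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, b):
    """Return W @ x + b for a list-of-rows matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def gru_cell(x, h, p):
    ix = {g: affine(p["W_i" + g], x, p["b_i" + g]) for g in "rzn"}
    hh = {g: affine(p["W_h" + g], h, p["b_h" + g]) for g in "rzn"}
    r = [sigmoid(a + b) for a, b in zip(ix["r"], hh["r"])]
    z = [sigmoid(a + b) for a, b in zip(ix["z"], hh["z"])]
    # The reset gate r scales only the hidden-state contribution.
    n = [math.tanh(a + ri * b) for a, ri, b in zip(ix["n"], r, hh["n"])]
    return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def zero_params(input_size, hidden_size, gates):
    p = {}
    for g in gates:
        p["W_i" + g] = [[0.0] * input_size for _ in range(hidden_size)]
        p["W_h" + g] = [[0.0] * hidden_size for _ in range(hidden_size)]
        p["b_i" + g] = [0.0] * hidden_size
        p["b_h" + g] = [0.0] * hidden_size
    return p


h_next = gru_cell([1.0, 2.0, 3.0], [0.4, -0.8], zero_params(3, 2, "rzn"))
```

With all-zero parameters the gates sit at 0.5 and the candidate n is 0, so the new hidden state is half the old one.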
- forward(input, state)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class LSTMCell(input_size, hidden_size, name='LSTMCell')[source]#
Bases:
alf.networks.network.Network
A long short-term memory (LSTM) cell.
\[\begin{split}\begin{array}{ll} i = \sigma(W_{ii} x + b_{ii} + W_{hi} h + b_{hi}) \\ f = \sigma(W_{if} x + b_{if} + W_{hf} h + b_{hf}) \\ g = \tanh(W_{ig} x + b_{ig} + W_{hg} h + b_{hg}) \\ o = \sigma(W_{io} x + b_{io} + W_{ho} h + b_{ho}) \\ c' = f * c + i * g \\ h' = o * \tanh(c') \\ \end{array}\end{split}\]
where \(\sigma\) is the sigmoid function, and \(*\) is the Hadamard product.
- Parameters
input_size (int) – The number of expected features in the input x
hidden_size (int) – The number of features in the hidden state h
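The LSTM update equations above can be followed step by step in plain Python. This sketch mirrors the formulas only and is not ALF's LSTMCell; the parameter dictionary layout and helper names are assumptions.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, b):
    """Return W @ x + b for a list-of-rows matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def lstm_cell(x, h, c, p):
    def gate(name, act):
        pre_x = affine(p["W_i" + name], x, p["b_i" + name])
        pre_h = affine(p["W_h" + name], h, p["b_h" + name])
        return [act(a + b) for a, b in zip(pre_x, pre_h)]

    i = gate("i", sigmoid)    # input gate
    f = gate("f", sigmoid)    # forget gate
    g = gate("g", math.tanh)  # candidate cell content
    o = gate("o", sigmoid)    # output gate
    c_next = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
    h_next = [oi * math.tanh(cn) for oi, cn in zip(o, c_next)]
    return h_next, c_next


def zero_params(input_size, hidden_size, gates):
    p = {}
    for g in gates:
        p["W_i" + g] = [[0.0] * input_size for _ in range(hidden_size)]
        p["W_h" + g] = [[0.0] * hidden_size for _ in range(hidden_size)]
        p["b_i" + g] = [0.0] * hidden_size
        p["b_h" + g] = [0.0] * hidden_size
    return p


h_next, c_next = lstm_cell([1.0], [0.0], [2.0], zero_params(1, 1, "ifgo"))
```

With all-zero parameters every sigmoid gate is 0.5 and g is 0, so the cell state is halved and the hidden state is 0.5 * tanh of the new cell state.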
- forward(input, state)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class NoisyFC(input_size, output_size, std_init=0.5, new_noise_prob=0.01, activation=<function identity>, use_bn=False, use_ln=False, bn_ctor=<class 'torch.nn.modules.batchnorm.BatchNorm1d'>, kernel_initializer=None, kernel_init_gain=1.0, bias_init_value=0.0, bias_initializer=None, weight_opt_args=None, bias_opt_args=None)[source]#
Bases:
alf.networks.network.Network
The Noisy Linear Layer described in Fortunato et al., Noisy Networks for Exploration.
In short, the original weight \(w\) and bias \(b\) of FC layer are replaced with \(w + w_\sigma \odot \epsilon^w\) and \(b + b_\sigma \odot \epsilon^b\) where \(\epsilon^w\) and \(\epsilon^b\) are noise and \(w, w_\sigma, b, b_\sigma\) are trainable parameters.
Some details:
The noise for each sample in a batch is different.
The noise is maintained as state. It has a probability of new_noise_prob to change to new noise.
Since the initial state is always 0, a new noise will always be generated for zero state.
If it is running in eval mode (i.e., common.is_eval() is True), noise will be disabled (i.e. same as alf.layers.FC).
The noise is factorized Gaussian noise as described in the paper.
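The factorized Gaussian noise from the Noisy Networks paper can be sketched without any tensor library. In that scheme the weight noise is the outer product of two transformed noise vectors, using f(x) = sgn(x) * sqrt(|x|). The function names below are hypothetical, and per-sample noise, state handling and eval mode are left out of this illustration.

```python
import math
import random


def f(v):
    """Noise-shaping function from the paper: sgn(v) * sqrt(|v|)."""
    return math.copysign(math.sqrt(abs(v)), v)


def factorized_noise(input_size, output_size, rng):
    eps_in = [f(rng.gauss(0.0, 1.0)) for _ in range(input_size)]
    eps_out = [f(rng.gauss(0.0, 1.0)) for _ in range(output_size)]
    # Weight noise is the outer product; bias noise reuses the output noise.
    eps_w = [[eo * ei for ei in eps_in] for eo in eps_out]
    eps_b = list(eps_out)
    return eps_w, eps_b


def noisy_linear(x, w, w_sigma, b, b_sigma, eps_w, eps_b):
    out = []
    for i in range(len(b)):
        acc = sum((w[i][j] + w_sigma[i][j] * eps_w[i][j]) * x[j]
                  for j in range(len(x)))
        out.append(acc + b[i] + b_sigma[i] * eps_b[i])
    return out


rng = random.Random(0)
eps_w, eps_b = factorized_noise(3, 2, rng)
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b = [0.5, 0.5]
zeros_w = [[0.0] * 3 for _ in range(2)]
# With zero sigma the layer reduces to an ordinary linear layer.
plain = noisy_linear([1.0, 2.0, 3.0], w, zeros_w, b, [0.0, 0.0], eps_w, eps_b)
```

Only input_size + output_size Gaussian samples are drawn, instead of one per weight.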
- Parameters
input_size (int) – input size.
output_size (int) – output size.
activation (Callable) – activation function.
std_init (float) – the scaling factor for the initial value of weight_sigma and bias_sigma.
new_noise_prob (float) – the probability of resampling the noise.
use_bn (bool) – whether use batch normalization.
use_ln (bool) – whether use layer normalization
bn_ctor (Callable) – will be called as bn_ctor(num_features) to create the BN layer.
kernel_initializer (Optional[Callable]) – initializer for the FC layer kernel. If none is provided a variance_scaling_initializer with gain as kernel_init_gain will be used.
kernel_init_gain (float) – a scaling factor (gain) applied to the std of kernel init distribution. It will be ignored if kernel_initializer is not None.
bias_init_value (float) – a constant for the initial bias value. This is ignored if bias_initializer is provided.
bias_initializer (Optional[Callable]) – initializer for the bias parameter.
weight_opt_args (Optional[Dict]) – If provided, it will be used as optimizer arguments for weight. And it will be combined with zero_mean=False and fixed_norm=False as optimizer arguments for weight_sigma.
bias_opt_args (Optional[Dict]) – If provided, it will be used as optimizer arguments for bias. And it will be combined with zero_mean=False as optimizer arguments for bias_sigma.
- property bias#
- forward(input, state)[source]#
Forward computation.
- Parameters
input – its shape should be [batch_size, input_size]
state (Tuple[Tensor]) – tuple of noise
- Returns
with shape [batch_size, output_size]
- Return type
Tensor
- property input_size#
- property output_size#
- training: bool#
- property weight#
- class Residue(block, input_tensor_spec=None, activation=<built-in method relu_ of type object>, name='Residue')[source]#
Bases:
alf.networks.network.Network
Residue block.
It performs y = activation(x + block(x)).
- Parameters
block (Callable) –
input_tensor_spec (nested TensorSpec) – input tensor spec for block if it cannot be inferred from block
activation (Callable) – activation function
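The residual computation y = activation(x + block(x)) can be sketched on plain lists. This is an illustration only; `residue` and `relu` here are hypothetical stand-ins, not the ALF class.

```python
def relu(v):
    return [max(0.0, a) for a in v]


def residue(block, x, activation=relu):
    # Add the block output back onto its input, then apply the activation.
    return activation([a + b for a, b in zip(x, block(x))])


y = residue(lambda v: [-2.0 * a for a in v], [1.0, -3.0])
```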
- forward(x, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class TemporalPool(input_size, stack_size, pooling_size=1, dtype=torch.float32, mode='skip', name='TemporalPool')[source]#
Bases:
alf.networks.network.Network
Pool features temporally.
Suppose input_size=(), stack_size=2, pooling_size=2. The following table shows the output of each mode for the input sequence 1, 2, 3, 4, 5 (ignoring the batch dimension):
input: 1, 2, 3, 4, 5
skip: [0, 1], [0, 1], [1, 3], [1, 3], [3, 5]
avg: [0, 0], [0, 1.5], [0, 1.5], [1.5, 3.5], [1.5, 3.5]
max: [0, 0], [0, 2], [0, 2], [2, 4], [2, 4]
Note that for ‘avg’ and ‘max’, the result is zero for the first pooling_size - 1 steps because it needs pooling_size inputs to calculate the result. After that, the output changes every pooling_size steps as the new pooling result becomes available. On the other hand, for ‘skip’, the first input is immediately reflected in the output because it is a valid way of skipping.
Example:

    import alf
    import torch

    # A temporal CNN with a progressively larger temporal receptive field.
    cnn = alf.networks.Sequential([
        alf.networks.TemporalPool(256, 3, 1),
        torch.nn.Flatten(),
        alf.layers.FC(768, 256, activation=torch.relu_),
        alf.networks.TemporalPool(256, 3, 2),
        torch.nn.Flatten(),
        alf.layers.FC(768, 256, activation=torch.relu_),
        alf.networks.TemporalPool(256, 3, 4),
        torch.nn.Flatten(),
        alf.layers.FC(768, 256, activation=torch.relu_)])

Note that the output of the above network changes every 4 steps, which may make the response too slow for many tasks. So a practical way of using TemporalPool is to combine it with Residue so that the output will not lag:

    block = alf.networks.Residue(
        alf.networks.Sequential([
            alf.networks.TemporalPool(256, 3, 2),
            torch.nn.Flatten(),
            alf.layers.FC(768, 256, activation=torch.relu_)]))
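The skip/avg/max outputs listed for the input sequence 1, 2, 3, 4, 5 can be reproduced with a short plain-Python simulation. This is a sketch inferred from that table, not ALF's implementation; `temporal_pool` is a hypothetical helper that handles scalar features only.

```python
def temporal_pool(xs, stack_size, pooling_size, mode):
    """Return the stacked output after each step for scalar inputs `xs`."""
    stack = [0.0] * stack_size
    window = []
    outputs = []
    for t, x in enumerate(xs):
        if mode == "skip":
            # Keep only the inputs at steps t * pooling_size.
            if t % pooling_size == 0:
                stack = stack[1:] + [x]
        else:
            # Collect a full window before emitting a pooled value.
            window.append(x)
            if len(window) == pooling_size:
                if mode == "avg":
                    pooled = sum(window) / pooling_size
                else:
                    pooled = max(window)
                stack = stack[1:] + [pooled]
                window = []
        outputs.append(list(stack))
    return outputs


seq = [1, 2, 3, 4, 5]
skip = temporal_pool(seq, 2, 2, "skip")
avg = temporal_pool(seq, 2, 2, "avg")
mx = temporal_pool(seq, 2, 2, "max")
```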
- Parameters
input_size (int|tuple[int]) – shape of the input
stack_size (int) – stack the features from so many steps
pooling_size (int) – if > 1, perform a pooling first. pooling_size steps of features will be pooled as a single feature vector according to mode
mode (str) – one of (‘skip’, ‘avg’, ‘max’), only effective if pooling_size > 1.
- ‘skip’: only keeping features at step t * pooling_size
- ‘avg’: features are averaged for each window of pooling_size steps. The pooling results for the first pooling_size - 1 steps are 0.
- ‘max’: features are maxed for each window of pooling_size steps. The pooling results for the first pooling_size - 1 steps are 0.
- Returns
tuple of:
- tensor of shape (stack_size, input_size)
- internal states
- forward(x, state)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
alf.networks.normalizing_flow_networks#
Different normalizing flow networks.
A normalizing flow network \(f: \mathbb{R}^N \rightarrow \mathbb{R}^N\)
is invertible, namely given any output \(y=f(x)\), we can easily compute the corresponding input \(x=f^{-1}(y)\), and
whose Jacobian determinant is easy to compute, for example, the product of diagonal elements.
- class NormalizingFlowNetwork(input_tensor_spec, conditional_input_tensor_spec=None, use_transform_cache=True, name='NormalizingFlowNetwork')[source]#
Bases:
alf.networks.network.Network
The base class for normalizing flow networks.
Compared to traditional Network classes, its subclass needs to implement the interface make_invertible_transform().
- Parameters
input_tensor_spec (TensorSpec) – input tensor spec
conditional_input_tensor_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – a nested tensor spec
use_transform_cache (bool) – whether to cache transforms. When there is a conditional input, different transforms might be created depending on the conditional inputs. When there is no conditional input, the same transform will always be used. Note that this only caches the transform itself; to correctly cache the inverse result, you also have to set cache_size=1 when creating the transform.
name (str) – name of the network
- forward(xz, state=())[source]#
When we have no conditional input for forward: y=self.forward(x). Otherwise y=self.forward((x,z)) where z is the conditional input.
- Parameters
xz (Union[Tensor,Tuple[Tensor,Union[Tensor,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]]]) – the input can be either an unnested tensor x or a tuple of an unnested tensor and a nested tensor (x, z). z is an optional conditional input that conditions the normalizing flow mapping from x to y.
state (Union[Tensor,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – should be an empty tuple
- inverse(yz, state=())[source]#
When we have no conditional input for inverse: x=self.inverse(y). Otherwise x=self.inverse((y,z)) where z is the conditional input.
- Parameters
yz (Union[Tensor,Tuple[Tensor,Union[Tensor,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]]]) – the input can be either an unnested tensor y or a tuple of an unnested tensor and a nested tensor (y, z). z is an optional conditional input that conditions the normalizing flow inverse mapping from y to x.
state (Union[Tensor,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – should be an empty tuple
- make_invertible_transform(conditional_inputs=None)[source]#
Express the network forward computation as an invertible PyTorch Transform. This overall transformation can be a composed one chaining many transformation layers.
- Parameters
conditional_inputs (Union[Tensor,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – optional nested conditional inputs that condition the mapping \(x \rightarrow y\).
- Return type
Transform
- Returns
an invertible transform
- training: bool#
- property use_conditional_inputs: bool#
- Return type
bool
- Returns
Whether this normalizing flow uses inputs to condition the transforms.
- class RealNVPNetwork(input_tensor_spec, conditional_input_tensor_spec=None, input_preprocessors=None, preprocessing_combiner=None, conv_layer_params=None, fc_layer_params=None, activation=<built-in method tanh of type object>, transform_scale_nonlinear=functools.partial(<function clipped_exp>, clip_value_min=-10, clip_value_max=2), sub_dim=None, mask_mode='contiguous', num_layers=2, use_transform_cache=True, name='RealNVPNetwork')[source]#
Bases:
alf.networks.normalizing_flow_networks.NormalizingFlowNetwork
Real-valued non-volume preserving transformations.
“DENSITY ESTIMATION USING REAL NVP”, Dinh et al., ICLR 2017.
In short, each transformation layer does
\[\begin{split}\begin{array}{rcl} y_{1:d} &=& x_{1:d}\\ y_{d+1:D} &=& x_{d+1:D}\bigodot \exp(s(x_{1:d};z)) + t(x_{1:d};z)\\ \end{array}\end{split}\]where \(d\) is a hyperparameter that determines the two-way split of the input vector \(x\), \(D\) the total length of \(x\), \(s\) a (learned) scale function, and \(t\) a (learned) translation function. The scale and translation functions can depend on other input \(z\). It can be verified that the Jacobian is a lower-triangular matrix and its diagonal elements are \(\mathbb{I}_d\) and \(\text{diag}(\exp(s(x_{1:d};z)))\), regardless of how complex \(s\) and \(t\) are.
The original paper suggests to alternate the computations of \(y_{1:d}\) and \(y_{d+1:D}\) to avoid some part of \(x\) always getting copied.
Our implementation also allows specifying other binary masks. We additionally support a random binary mask and an evenly distributed mask. The reason is that we can always re-arrange the 0s and 1s and swap the rows of the Jacobian to make it triangular. Because we always take the absolute value of the Jacobian determinant, row swapping will not change the result of log_abs_det_jacobian().
Note that whichever binary mask is used, an alternating computation is always used. For example, let \(b\) be the mask, then
\[\begin{split}\begin{array}{rcl} y &=& b\bigodot x + (1-b)\bigodot(x\bigodot \exp(s(x\bigodot b;z)) + t(x\bigodot b;z))\\ \end{array}\end{split}\]
At even layers, we flip the values of \(b\).
For inverse computation,
\[\begin{split}\begin{array}{rcl} x &=& b\bigodot y + (1-b)\bigodot((y - t(y\bigodot b;z)) \div \exp(s(y\bigodot b;z)))\\ \end{array}\end{split}\]
Note
The scale and translation networks’ initial outputs should be in a good range, so their hidden activations default to torch.tanh.
- Parameters
input_tensor_spec (TensorSpec) – input tensor spec
conditional_input_tensor_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – a nested tensor spec
input_preprocessors (Any) – a nest of input preprocessors, each of which will be applied to the corresponding input. If not None, then it must have the same structure as input_tensor_spec (after reshaping). If any element is None, then it will be treated as math_ops.identity. Only used when conditional inputs are present, where its structure should be (x_processor, z_processor).
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. Only used when conditional inputs are present.
conv_layer_params (Tuple[Tuple[int]]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional. Used by the scale and translation networks.
fc_layer_params (Tuple[int]) – a tuple of integers representing FC layer sizes of the scale and translation networks.
activation (Callable) – hidden activation of the scale and translation networks
transform_scale_nonlinear (Callable) – nonlinear function applied to the scale network output. Its codomain should be \([0,+\infty)\). Make sure that the value of this function won’t explode after several RealNVP transform layers.
sub_dim (int) – the dimensionality to keep unchanged at odd layers. If None, then half of the input is unchanged at a time. When it’s 0, all input dims will be changed by an affine transform independent of the input. This case can still be interesting because the affine transform could depend on other variables (i.e., conditional AffineTransform).
mask_mode (str) – three options are supported: “contiguous” (default), “distributed”, and “random”. “contiguous” means at odd layers, the first sub_dim elements are kept unchanged; “distributed” means that the sub_dim elements are evenly distributed over the vector (good for vectors with local similarity); “random” means that the mask is randomized.
num_layers (int) – number of transformation layers. Note that for mask mode of “random”, every two layers will have a different randomized mask.
use_transform_cache (bool) – whether to use a cached transform. Note that this only stores the transform itself; you also have to use cache_size=1 for the created transform to correctly cache the inverse result.
name (str) – name of the network
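The masked coupling equations given for RealNVPNetwork, in their forward and inverse forms, can be checked numerically with a small sketch. It uses plain exp as the scale nonlinearity, omits the conditional input z, and takes toy stand-ins for the scale and translation functions s and t; all names here are hypothetical and this is not ALF's implementation.

```python
import math


def coupling_forward(x, b, s, t):
    masked = [bi * xi for bi, xi in zip(b, x)]
    scale, shift = s(masked), t(masked)
    y = [bi * xi + (1 - bi) * (xi * math.exp(si) + ti)
         for bi, xi, si, ti in zip(b, x, scale, shift)]
    # Only the transformed dims contribute to log|det J|.
    log_det = sum((1 - bi) * si for bi, si in zip(b, scale))
    return y, log_det


def coupling_inverse(y, b, s, t):
    # The masked part of y equals the masked part of x, so s and t
    # can be recomputed exactly.
    masked = [bi * yi for bi, yi in zip(b, y)]
    scale, shift = s(masked), t(masked)
    return [bi * yi + (1 - bi) * ((yi - ti) / math.exp(si))
            for bi, yi, si, ti in zip(b, y, scale, shift)]


def s(m):  # toy scale function (assumption)
    return [0.5 * sum(m)] * len(m)


def t(m):  # toy translation function (assumption)
    return [sum(m) - 1.0] * len(m)


b = [1, 1, 0, 0]
x = [0.3, -1.2, 2.0, 0.7]
y, log_det = coupling_forward(x, b, s, t)
x_rec = coupling_inverse(y, b, s, t)
```

The masked half passes through unchanged, which is what makes the inverse cheap: no inversion of s or t is needed.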
- training: bool#
alf.networks.ou_process#
Ornstein-Uhlenbeck process.
- class OUProcess(state_spec, damping=0.15, stddev=0.2)[source]#
Bases:
alf.networks.network.Network
A zero-mean Ornstein-Uhlenbeck process.
A Class for generating noise from a zero-mean Ornstein-Uhlenbeck process.
The Ornstein-Uhlenbeck process is a process that generates temporally correlated noise via a random walk with damping. This process describes the velocity of a particle undergoing Brownian motion in the presence of friction. This can be useful for exploration in continuous action environments with momentum.
The temporal update equation is: x_next = (1 - damping) * x + N(0, stddev)
- Parameters
state_spec (nested TensorSpec) – spec of the state
damping (float) – The rate at which the noise trajectory is damped towards the mean. We must have 0 <= damping <= 1, where a value of 0 gives an undamped random walk and a value of 1 gives uncorrelated Gaussian noise. Hence in most applications a small non-zero value is appropriate.
stddev (float) – Standard deviation of the Gaussian component.
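The update x_next = (1 - damping) * x + N(0, stddev) can be sketched with the standard library. This is an illustration of the equation only; `ou_step` is a hypothetical helper, not the ALF class.

```python
import random


def ou_step(x, damping=0.15, stddev=0.2, rng=random):
    """One Ornstein-Uhlenbeck step applied elementwise to a list of floats."""
    return [(1.0 - damping) * xi + rng.gauss(0.0, stddev) for xi in x]


# With stddev=0 the state simply decays toward the zero mean.
decayed = ou_step([1.0, -2.0], stddev=0.0)
# With damping=1 the previous state is forgotten entirely.
forgotten = ou_step([5.0], damping=1.0, stddev=0.0)

# A short noisy trajectory with a seeded generator.
rng = random.Random(0)
x = [0.0]
trajectory = []
for _ in range(5):
    x = ou_step(x, rng=rng)
    trajectory.append(x[0])
```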
- forward(state)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- property state_spec#
Return the state spec to be used by an Algorithm.
Subclass should override this to return the correct state_spec.
- training: bool#
alf.networks.param_networks#
Networks with input parameters.
- class ParamConvNet(input_channels, input_size, conv_layer_params, same_padding=False, activation=<built-in method relu_ of type object>, use_bias=False, use_ln=False, n_groups=None, kernel_initializer=None, flatten_output=False, name='ParamConvNet')[source]#
Bases:
alf.networks.network.Network
A fully 2D conv network that does not maintain its own network parameters, but accepts them from users. If the given parameter tensor has an extra batch dimension (first dimension), it performs parallel operations.
- Parameters
input_channels (int) – number of channels in the input image
input_size (int or tuple) – the input image size (height, width)
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding, pooling_kernel), where padding and pooling_kernel are optional.
same_padding (bool) – similar to TF’s conv2d same padding mode. If True, the user provided paddings in conv_layer_params will be replaced by automatically calculated ones; if False, it corresponds to TF’s valid padding mode (the user can still provide custom paddings though)
activation (torch.nn.functional) – activation for all the layers
use_bias (bool) – whether use bias.
use_ln (bool) – whether use layer normalization
n_groups (int) – number of parallel groups, must be specified if use_ln
kernel_initializer (Callable) – initializer for all the layers.
flatten_output (bool) – If False, the output will be an image structure of shape (B, n, C, H, W); otherwise the output will be flattened into a feature of shape (B, n, C*H*W).
name (str) –
- forward(inputs, state=())[source]#
- Parameters
inputs (Tensor) –
state – not used, just keeps the interface same with other networks.
- property param_length#
Get total number of parameters for all layers.
- set_parameters(theta, reinitialize=False)[source]#
Distribute parameters to corresponding layers.
- Parameters
theta (torch.Tensor) – with shape [D] (groups=1) or [B, D] (groups=B), where B is the batch size and D is the length of parameters, which should be self.param_length. When the shape of theta is [D], it will be unsqueezed to [1, D].
reinitialize (bool) – whether to reinitialize parameters of each layer.
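The shape convention for theta can be illustrated with plain Python lists standing in for tensors. This is only a sketch of the idea of distributing a flat parameter vector to layers: split_parameters, distribute and layer_lengths are hypothetical names, and the real method operates on torch.Tensor objects.

```python
def split_parameters(theta_row, layer_lengths):
    """Split one flat parameter vector of length D into per-layer chunks."""
    if sum(layer_lengths) != len(theta_row):
        raise ValueError("theta length must equal the total parameter length")
    chunks, start = [], 0
    for length in layer_lengths:
        chunks.append(theta_row[start:start + length])
        start += length
    return chunks


def distribute(theta, layer_lengths):
    """Accept theta of shape [D] or [B, D]; return per-group, per-layer chunks."""
    if theta and not isinstance(theta[0], (list, tuple)):
        theta = [theta]  # [D] -> [1, D], mirroring the unsqueeze described above
    return [split_parameters(row, layer_lengths) for row in theta]


# Two layers holding 4 and 2 parameters, so D = 6.
print(distribute([0, 1, 2, 3, 4, 5], [4, 2]))
```

A [B, D] input yields B independent groups, which is what allows one network structure to run B parameter sets in parallel.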
- training: bool#
- class ParamNetwork(input_tensor_spec, conv_layer_params=None, fc_layer_params=None, use_conv_bias=False, use_conv_ln=False, use_fc_bias=True, use_fc_ln=False, n_groups=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, last_layer_size=None, last_activation=None, last_use_bias=True, last_use_ln=False, name='ParamNetwork')[source]#
Bases: alf.networks.network.Network
A network with FC and conv2D layers that does not maintain its own network parameters, but accepts them from users. If the given parameter tensor has an extra batch dimension (first dimension), it performs parallel operations.
- Parameters
input_tensor_spec (nested TensorSpec) – the (nested) tensor spec of the input. If nested, then preprocessing_combiner must not be None.
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes the format (filters, kernel_size, strides, padding, pooling_kernel), where padding and pooling_kernel are optional.
fc_layer_params (tuple[int]) – a tuple of integers representing FC layer sizes.
use_conv_bias (bool) – whether to use bias for conv layers.
use_conv_ln (bool) – whether to use layer normalization for conv layers.
use_fc_bias (bool) – whether to use bias for fc layers.
use_fc_ln (bool) – whether to use layer normalization for fc layers.
n_groups (int) – number of parallel groups; must be specified if layer normalization is used (use_conv_ln or use_fc_ln).
activation (torch.nn.functional) – activation for all the layers.
kernel_initializer (Callable) – initializer for all the layers.
last_layer_size (int) – an optional size of an additional layer appended at the very end. Note that if last_activation is specified, last_layer_size has to be specified explicitly.
last_activation (nn.functional) – activation function of the additional layer specified by last_layer_size. Note that if last_layer_size is not None, last_activation has to be specified explicitly.
last_use_bias (bool) – whether to use bias for the additional layer.
last_use_ln (bool) – whether to use layer normalization for the additional layer.
name (str) –
- forward(inputs, state=())[source]#
- Parameters
inputs (Tensor) –
state – not used, just keeps the interface same with other networks.
- property param_length#
Get total number of parameters for all layers.
- set_parameters(theta, reinitialize=False)[source]#
Distribute parameters to corresponding layers.
- Parameters
theta (torch.Tensor) – with shape [D] (groups=1) or [B, D] (groups=B), where B is the batch size and D is the length of parameters, which should be self.param_length. When the shape of theta is [D], it will be unsqueezed to [1, D].
reinitialize (bool) – whether to reinitialize parameters of each layer.
- training: bool#
alf.networks.preprocessor_networks#
PreprocessorNetworks.
- class PreprocessorNetwork(input_tensor_spec, input_preprocessors=None, preprocessing_combiner=None, name='PreprocessorNetwork')[source]#
Bases: alf.networks.network.Network
A base class for networks that need input preprocessing.
- Parameters
input_tensor_spec (nested TensorSpec) – the (nested) tensor spec of the input.
input_preprocessors (nested Network|nn.Module|None) – a nest of preprocessors, each of which will be applied to the corresponding input. If None, it is treated as math_ops.identity. If not None, input_tensor_spec must have the same structure as input_preprocessors up to the structure defined by input_preprocessors (see alf.nest.map_structure_upto), and each element of input_preprocessors will be applied to the corresponding subnest in input_tensor_spec. If any element is None, it will be treated as math_ops.identity. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector. Note that only stateless networks are supported as input preprocessors by PreprocessorNetwork.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. It must be provided if the result from input_preprocessors is nested. This combiner must accept the result from input_preprocessors as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
name (str) – name of the network
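The interaction between input_preprocessors and preprocessing_combiner can be sketched with a flat dict standing in for a nest. Everything here is a hypothetical illustration rather than ALF code: identity plays the role of math_ops.identity, one_hot is a toy preprocessor, and preprocess_and_concat imitates a concatenating combiner.

```python
def identity(x):
    """Stand-in for the identity preprocessor used when an entry is None."""
    return x


def one_hot(index, num_classes):
    """Toy preprocessor: embed a discrete input as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def preprocess_and_concat(inputs, input_preprocessors):
    """Apply each preprocessor to its input, then concatenate the results."""
    combined = []
    for key in sorted(inputs):
        # A missing or None preprocessor is treated as identity.
        preprocessor = input_preprocessors.get(key) or identity
        combined.extend(preprocessor(inputs[key]))
    return combined


inputs = {"action": 2, "observation": [0.5, -1.0]}
preprocessors = {"action": lambda a: one_hot(a, 3), "observation": None}
print(preprocess_and_concat(inputs, preprocessors))
```

The discrete action is embedded before being concatenated with the continuous observation, which is the use case the parameter description mentions.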
- forward(inputs, state=(), min_outer_rank=1, max_outer_rank=1)[source]#
Preprocessing nested inputs.
- Parameters
inputs (nested Tensor) – inputs to the network
state (nested Tensor) – RNN state of the network
min_outer_rank (int) – the minimal outer rank allowed
max_outer_rank (int) – the maximal outer rank allowed
- Returns
tensor after preprocessing.
- Return type
Tensor
- training: bool#
alf.networks.preprocessors#
This file contains input preprocessors as stateless Networks, used for the purpose of preprocessing input and making gin files more convenient to configure.
Example: In your gin file, the following can be configured:
input1 (img)    -> preprocessor1 -> embed1 ----> EncodingNetwork
input2 (action) -> preprocessor2 -> embed2  /    (with NestCombiner)
- class EmbeddingPreprocessor(input_tensor_spec, embedding_dim, conv_layer_params=None, fc_layer_params=None, activation=<built-in method relu_ of type object>, last_activation=<function identity>, name='EmbeddingPreproc')[source]#
Bases: alf.networks.network.Network
A preprocessor that converts the input to an embedding vector. This can be used when the input is a discrete scalar, or a continuous vector to be projected to a different dimension (to have the same length as other vectors). In the former case, torch.nn.Embedding is used without any activation. In the latter case, an EncodingNetwork is used with the specified network hyperparameters.
- Parameters
input_tensor_spec (TensorSpec) – the input spec
embedding_dim (int) – output embedding size
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes the format (filters, kernel_size, strides, padding), where padding is optional.
fc_layer_params (tuple[int]) – a tuple of integers representing FC layer sizes.
activation (torch.nn.functional) – activation of hidden layers if the input is a continuous vector.
last_activation (nn.functional) – activation function of the last layer specified by embedding_dim. math_ops.identity is used by default. Only used when the input is continuous.
name (str) –
- forward(inputs, state=())[source]#
Preprocess either a tensor input or a TensorSpec.
- Parameters
inputs (TensorSpec or Tensor) –
- Returns
if inputs is a Tensor, the preprocessed result is returned; otherwise it’s the tensor spec of the result.
- Return type
Tensor or TensorSpec
- training: bool#
alf.networks.projection_networks#
- class BetaProjectionNetwork(input_size, action_spec, parallelism=None, activation=<built-in function softplus>, min_concentration=0.0, projection_output_init_gain=0.0, bias_init_value=0.541324854612918, grad_clip=0.01, name='BetaProjectionNetwork')[source]#
Bases: alf.networks.network.Network
Beta projection network.
Its output is a distribution with an independent beta distribution for each action dimension. Since the support of the beta distribution is [0, 1], we also apply an affine transformation so that the support fills the range specified by action_spec.
- Parameters
input_size (int) – input vector dimension
action_spec (TensorSpec) – a tensor spec containing the information of the output distribution.
parallelism (Optional[int]) – when specified, this network will be parallelized. As a result, a batch dimension of parallelism will be appended to the batch shape of the output distribution, while the event shape remains the same. This is useful when you are creating a mixture of policies.
activation (Callable) – activation function to use in dense layers.
bias_init_value (float) – the default value is chosen so that, for softplus activation, the initial concentration will be close to 1, which corresponds to a uniform distribution.
grad_clip (float) – if provided, the L2-norm of the gradient of concentration will be clipped to be no more than grad_clip.
min_concentration (float) – there may be numerical stability issues if the calculated concentration is very close to 0. A positive value may help to alleviate them.
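The default bias_init_value can be checked numerically: it equals ln(e - 1), so softplus maps it to 1, and Beta(1, 1) is the uniform distribution on [0, 1]. The sketch below also shows the kind of affine map that stretches [0, 1] onto an action range; to_action_range is a hypothetical helper, not the ALF implementation.

```python
import math


def softplus(x):
    """softplus(x) = ln(1 + exp(x))."""
    return math.log1p(math.exp(x))


def to_action_range(u, low, high):
    """Affine map from the beta support [0, 1] onto [low, high]."""
    return low + (high - low) * u


bias_init_value = 0.541324854612918
concentration = softplus(bias_init_value)

print(concentration)                     # very close to 1
print(math.log(math.e - 1))              # the same constant, ln(e - 1)
print(to_action_range(0.5, -2.0, 2.0))   # midpoint of the action range
```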
- forward(inputs, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- make_parallel(n)[source]#
Make a parallelized version of this network.
A parallel network has n copies of the network with the same structure but different, independently initialized parameters.
By default, it creates a NaiveParallelNetwork, which simply makes n copies of this network and uses a loop to call them in forward(). If possible, the subclass should override this to generate an optimized parallel implementation.
- Parameters
n (int) – the number of copies
- Returns
A parallel network
- Return type
- training: bool#
- class CategoricalProjectionNetwork(input_size, action_spec, fc_ctor=<class 'alf.layers.FC'>, logits_init_output_factor=0.1, weight_opt_args=None, bias_opt_args=None, name='CategoricalProjectionNetwork')[source]#
Bases: alf.networks.network.Network
Creates a categorical projection network that outputs a discrete distribution over a number of classes.
Currently there seems to be no need for this class to handle nested inputs; if necessary, extend the argument list to support it in the future.
- Parameters
input_size (int) – the input vector size
action_spec (BoundedTensorSpec) – a tensor spec containing the information of the output distribution.
fc_ctor (Callable) – the constructor of the FC layer. It defaults to alf.layers.FC. However, you can use different FC layers such as alf.nn.NoisyFC.
weight_opt_args – optimizer arguments for weight.
bias_opt_args – optimizer arguments for bias.
name (str) –
- forward(inputs, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- make_parallel(n)[source]#
Creates a ParallelCategoricalProjectionNetwork using n replicas of self. The initialized layer parameters will be different.
- training: bool#
- class CauchyProjectionNetwork(input_size, action_spec, squash_median=True, scale_bias_initializer_value=0.0, state_dependent_scale=False, scale_transform=<built-in function softplus>, scale_distribution=False, dist_squashing_transform=StableTanh(), name='CauchyProjectionNetwork')[source]#
Bases: alf.networks.projection_networks.NormalProjectionNetwork
Similar to NormalProjectionNetwork except that the output distribution is a DiagMultivariateCauchy. Also, since Cauchy doesn’t have a mean or std, we provide parameters for its median and scale instead. But the median and scale will just reuse the code for handling mean and std in NormalProjectionNetwork.
- Parameters
input_size (int) – input vector dimension
action_spec (TensorSpec) – a tensor spec containing the information of the output distribution.
squash_median (bool) – If True, squash the output median to fit the action spec. If scale_distribution is also True, this value will be ignored.
scale_bias_initializer_value (float) – Initial value for the bias of the scale projection layer.
state_dependent_scale (bool) – If True, scale will be generated depending on the current state; otherwise a global scale will be generated regardless of the current state.
scale_transform (Callable) – Transform to apply to the scale, on top of activation.
scale_distribution (bool) – Whether or not to scale the output distribution to ensure that the output action fits within the action_spec. Note that this is different from mean_transform which merely squashes the mean to fit within the spec.
dist_squashing_transform (td.Transform) – A distribution Transform which transforms values to fall in (-1, 1). Defaults to dist_utils.StableTanh().
name (str) – name of this network.
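As a reminder of why a Cauchy distribution is described by a median and a scale, the sketch below evaluates its quantile function: the median is the 0.5 quantile and the quartiles sit exactly one scale away from it. This is a textbook formula written in plain Python, not the DiagMultivariateCauchy implementation; cauchy_quantile is a hypothetical helper name.

```python
import math


def cauchy_quantile(u, median, scale):
    """Inverse CDF of a Cauchy distribution for u in (0, 1)."""
    return median + scale * math.tan(math.pi * (u - 0.5))


median, scale = 2.0, 3.0
print(cauchy_quantile(0.50, median, scale))  # the median itself
print(cauchy_quantile(0.75, median, scale))  # upper quartile, about median + scale
print(cauchy_quantile(0.25, median, scale))  # lower quartile, about median - scale
```

Feeding uniform random numbers through the same function would draw samples from the distribution.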
- forward(inputs, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class MixtureProjectionNetwork(input_size, action_spec, num_components, component_ctor, mixture_ctor=<class 'alf.networks.projection_networks.CategoricalProjectionNetwork'>, name='mix_proj_net')[source]#
Bases: alf.networks.network.Network
A projection network that outputs MixtureSameFamily distributions.
The output distribution consists of 2 parts:
A categorical distribution over the components.
A components distribution of num_components replicas.
Constructs an instance of MixtureProjectionNetwork.
- Parameters
input_size (int) – the input vector size
action_spec (TensorSpec) – a tensor spec containing the information of the output distribution.
num_components (int) – the number of component distributions.
component_ctor (Callable[[int,TensorSpec],Network]) – constructor of a projection network that outputs the distribution for all the components. The make_parallel method of the projection network will be called to make the actual projection network that has num_components replicas.
mixture_ctor (Callable[[int,BoundedTensorSpec],Network]) – constructor of a projection network that outputs the mixture (categorical) distributions. The number of categories equals num_components.
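The two parts of the output distribution combine as a weighted sum of component densities. The sketch below assumes one-dimensional normal components purely for illustration; normal_pdf and mixture_pdf are hypothetical helpers and say nothing about how the network itself builds the MixtureSameFamily object.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a 1-D normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, stds):
    """Mixture density: categorical weights times component densities."""
    return sum(w * normal_pdf(x, m, s)
               for w, m, s in zip(weights, means, stds))


# Two equally weighted components (num_components = 2).
weights, means, stds = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
print(mixture_pdf(0.0, weights, means, stds))
```

Here weights plays the role of the categorical (mixture) part and the (mean, std) pairs play the role of the replicated component part.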
- forward(inputs, state={'components': (), 'mixture': ()})[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- property num_components: int#
- Return type
int
- training: bool#
- class NormalProjectionNetwork(input_size, action_spec, parallelism=None, activation=<function identity>, projection_output_init_gain=0.3, std_bias_initializer_value=0.0, squash_mean=True, state_dependent_std=False, std_transform=<built-in function softplus>, scale_distribution=False, dist_squashing_transform=StableTanh(), name='NormalProjectionNetwork')[source]#
Bases: alf.networks.network.Network
Creates an instance of NormalProjectionNetwork.
Currently there seems to be no need for this class to handle nested inputs; if necessary, extend the argument list to support it in the future.
- Parameters
input_size (int) – input vector dimension
action_spec (TensorSpec) – a tensor spec containing the information of the output distribution.
parallelism (Optional[int]) – when specified, this network will be parallelized. As a result, a batch dimension of parallelism will be appended to the batch shape of the output distribution, while the event shape remains the same. This is useful when you are creating a mixture of policies.
activation (Callable) – activation function to use in dense layers.
projection_output_init_gain (float) – Output gain for initializing action means and std weights.
std_bias_initializer_value (float) – Initial value for the bias of the std_projection_layer.
squash_mean (bool) – If True, squash the output mean to fit the action spec. If scale_distribution is also True, this value will be ignored.
state_dependent_std (bool) – If True, std will be generated depending on the current state; otherwise a global std will be generated regardless of the current state.
std_transform (Callable) – Transform to apply to the std, on top of activation.
scale_distribution (bool) – Whether or not to scale the output distribution to ensure that the output action fits within the action_spec. Note that this is different from mean_transform which merely squashes the mean to fit within the spec.
dist_squashing_transform (td.Transform) – A distribution Transform which transforms values into \((-1, 1)\). Defaults to dist_utils.StableTanh().
name (str) – name of this network.
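One common way to squash a mean into the action bounds is to pass it through tanh and rescale around the center of the range. The sketch below shows that idea only; it is an assumption about what squashing means here, and squash_mean_to_bounds is a hypothetical helper rather than the method used by the class.

```python
import math


def squash_mean_to_bounds(raw_mean, low, high):
    """Map an unbounded mean into [low, high] using tanh."""
    center = (high + low) / 2.0
    magnitude = (high - low) / 2.0
    return center + magnitude * math.tanh(raw_mean)


for raw in (-5.0, 0.0, 5.0):
    print(raw, squash_mean_to_bounds(raw, -2.0, 4.0))
```

Only the mean is bounded this way; samples drawn around it can still fall outside the range, which is the distinction the scale_distribution description draws.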
- forward(inputs, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- make_parallel(n)[source]#
Make a parallelized version of this network.
A parallel network has n copies of the network with the same structure but different, independently initialized parameters.
By default, it creates a NaiveParallelNetwork, which simply makes n copies of this network and uses a loop to call them in forward(). If possible, the subclass should override this to generate an optimized parallel implementation.
- Parameters
n (int) – the number of copies
- Returns
A parallel network
- Return type
- training: bool#
- class OnehotCategoricalProjectionNetwork(input_size, action_spec, logits_init_output_factor=0.1, mode='st', gumbel_temperature=1.0, name='OnehotCategoricalProjectionNetwork')[source]#
Bases: alf.networks.network.Network
Creates a onehot categorical projection network that outputs a discrete distribution over a number of classes.
An option to use the straight-through estimator is provided for this network, as proposed by Bengio et al., “Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation”, 2013.
- Parameters
input_size (int) – the input vector size
action_spec (BoundedTensorSpec) – a tensor spec containing the information of the output distribution.
logits_init_output_factor (float) – the gain factor to initialize the FC layer for predicting the logits
mode (str) – one of (‘st’, ‘gumbel’, ‘st-gumbel’, ‘plain’). All modes other than ‘plain’ enable gradient backprop through the samples. ‘st’ uses the straight-through grad estimator; ‘gumbel’ uses the Gumbel-softmax distribution to sample soft onehot vectors; ‘st-gumbel’ additionally takes argmax on the soft vectors and applies the straight-through grad estimator. Generally, ‘st-gumbel’ should have a lower grad variance than ‘st’.
gumbel_temperature (float) – the temperature of the Gumbel-softmax distribution. Only used by ‘gumbel’ and ‘st-gumbel’ modes. A higher temperature leads to a more uniform sample (less like one-hot).
name (str) –
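The effect of gumbel_temperature can be seen in a small plain-Python version of Gumbel-softmax sampling. This sketch is not the network's code: gumbel_softmax and sample_gumbel are hypothetical helpers, and the noise is passed in explicitly so the temperature comparison is deterministic.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample; u is clamped away from 0 to avoid log(0)."""
    u = max(rng.random(), 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, gumbel_noise, temperature=1.0):
    """Soft one-hot vector: softmax((logits + noise) / temperature)."""
    scaled = [(l + g) / temperature for l, g in zip(logits, gumbel_noise)]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
rng = random.Random(0)
noise = [sample_gumbel(rng) for _ in logits]
print(gumbel_softmax(logits, noise, temperature=0.1))   # close to one-hot
print(gumbel_softmax(logits, noise, temperature=10.0))  # close to uniform
```

In an 'st-gumbel' style setup, the argmax of the soft vector would be used in the forward pass while gradients flow through the soft vector.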
- forward(inputs, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class ParallelCategoricalProjectionNetwork(input_size, action_spec, n, fc_ctor=<class 'alf.layers.FC'>, logits_init_output_factor=0.1, name='ParallelCategoricalProjectionNetwork')[source]#
Bases: alf.networks.network.Network
Creates an instance of ParallelCategoricalProjectionNetwork.
- Parameters
input_size (int) – input vector dimension
action_spec (TensorSpec) – a tensor spec containing the information of the output distribution.
n (int) – number of the parallel networks
fc_ctor – must be alf.layers.FC
name (str) – name of this network.
- forward(inputs, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class StableNormalProjectionNetwork(input_size, action_spec, parallelism=None, activation=<function identity>, projection_output_init_gain=1e-05, squash_mean=True, state_dependent_std=False, inverse_std_transform='softplus', scale_distribution=False, init_std=1.0, min_std=0.0, max_std=None, dist_squashing_transform=StableTanh(), name='StableNormalProjectionNetwork')[source]#
Bases: alf.networks.projection_networks.NormalProjectionNetwork
Generates a multivariate normal by predicting a mean and std.
It parameterizes the normal distributions as \(\sigma=c_0+\frac{1}{c_1+\mathrm{softplus}(b)}\) and \(\mu=a\cdot\sigma\), where a and b are outputs from means_projection_layer and stds_projection_layer respectively. \(c_0\) and \(c_1\) are chosen so that \(\sigma_{min} \le \sigma \le \sigma_{max}\). The advantage of this parameterization is that its second order derivatives with respect to a and b are bounded even when the standard deviations become very small, so that the optimization is more stable. See docs/stable_gradient_descent_for_gaussian_distribution.py for details.
Creates an instance of StableNormalProjectionNetwork.
Currently there seems to be no need for this class to handle nested inputs; if necessary, extend the argument list to support it in the future.
- Parameters
input_size (int) – input vector dimension
action_spec (TensorSpec) – a tensor spec containing the information of the output distribution.
activation (Callable) – activation function to use in dense layers.
parallelism (Optional[int]) – when specified, this network will be parallelized. As a result, a batch dimension of parallelism will be appended to the batch shape of the output distribution, while the event shape remains the same. This is useful when you are creating a mixture of policies.
projection_output_init_gain (float) – Output gain for initializing action means and std weights.
squash_mean (bool) – If True, squash the output mean to fit the action spec. If scale_distribution is also True, this value will be ignored.
state_dependent_std (bool) – If True, std will be generated depending on the current state; otherwise a global std will be generated regardless of the current state.
inverse_std_transform (str) – Currently supports “exp” and “softplus”. Transformation to obtain inverse std. The transformed values are further transformed according to min_std and max_std.
scale_distribution (bool) – Whether or not to scale the output distribution to ensure that the output action fits within the action_spec. Note that this is different from ‘mean_transform’ which merely squashes the mean to fit within the spec.
init_std (float) – Initial value for standard deviation.
min_std (float) – Minimum value for standard deviation.
max_std (float) – Maximum value for standard deviation. If None, no maximum is enforced.
dist_squashing_transform (td.Transform) – A distribution Transform which transforms values into \((-1, 1)\). Defaults to dist_utils.StableTanh().
name (str) – name of this network.
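The bound \(\sigma_{min} \le \sigma \le \sigma_{max}\) follows from the range of softplus, which is \((0, \infty)\): as softplus(b) grows, \(\sigma\) approaches \(c_0\), and as softplus(b) approaches 0, \(\sigma\) approaches \(c_0 + 1/c_1\). One choice of constants consistent with this is \(c_0 = \sigma_{min}\) and \(c_1 = 1/(\sigma_{max} - \sigma_{min})\), with \(c_1 = 0\) when no maximum is given. The sketch below uses that choice; it is an assumption derived from the stated constraint, not a transcription of the class, and stable_std is a hypothetical helper (init_std is not modeled).

```python
import math


def softplus(x):
    """Numerically stable softplus."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def stable_std(b, min_std=0.0, max_std=None):
    """sigma = c0 + 1 / (c1 + softplus(b)) with assumed c0 and c1."""
    c0 = min_std
    c1 = 0.0 if max_std is None else 1.0 / (max_std - min_std)
    # Note: with max_std=None, extremely negative b makes the denominator
    # underflow to 0; that corner case is not handled in this sketch.
    return c0 + 1.0 / (c1 + softplus(b))


for b in (-50.0, -1.0, 0.0, 1.0, 50.0):
    print(b, stable_std(b, min_std=0.1, max_std=2.0))
```

The printed values decrease monotonically in b and stay between min_std and max_std, up to floating-point rounding at the upper end.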
- forward(inputs, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
- class TruncatedProjectionNetwork(input_size, action_spec, activation=<function identity>, projection_output_init_gain=0.3, scale_bias_initializer_value=0.0, state_dependent_scale=False, loc_transform=<built-in method tanh of type object>, scale_transform=<built-in function softplus>, min_scale=None, max_scale=None, dist_ctor=<class 'alf.utils.distributions.TruncatedNormal'>, name='TruncatedProjectionNetwork')[source]#
Bases: alf.networks.network.Network
Creates an instance of TruncatedProjectionNetwork.
Its output is a TruncatedDistribution with bounds given by the action bounds specified in action_spec.
- Parameters
input_size (int) – input vector dimension
action_spec (TensorSpec) – a tensor spec containing the information of the output distribution.
activation (Callable) – activation function to use in dense layers.
projection_output_init_gain (float) – Output gain for initializing action means and std weights.
scale_bias_initializer_value (float) – Initial value for the bias of the scale projection layer.
state_dependent_scale (bool) – If True, scale will be generated depending on the current state (i.e. inputs); otherwise a global scale will be generated regardless of the current state.
loc_transform (Callable) – Transform to apply to the loc, on top of activation, to make it fall within [-1, 1].
scale_transform (Callable) – Transform to apply to the scale, on top of activation, to make it positive.
min_scale (float) – Minimum value for scale. If None, no minimum is enforced.
max_scale (float) – Maximum value for scale. If None, no maximum is enforced.
dist_ctor (Callable) – constructor for the distribution called as: dist_ctor(loc=loc, scale=scale, lower_bound=lower_bound, upper_bound=upper_bound).
name (str) – name of this network.
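The keyword-only calling convention for dist_ctor can be illustrated with a stand-in constructor. In the sketch below, BoundedDistParams is a hypothetical placeholder for a real distribution class, project is a hypothetical helper, tanh and softplus mirror the default loc_transform and scale_transform, and clamping the scale to [min_scale, max_scale] is an assumption about how the limits are enforced.

```python
import math
from dataclasses import dataclass


@dataclass
class BoundedDistParams:
    """Hypothetical stand-in that accepts the documented dist_ctor keywords."""
    loc: float
    scale: float
    lower_bound: float
    upper_bound: float


def softplus(x):
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def project(raw_loc, raw_scale, lower_bound, upper_bound,
            min_scale=None, max_scale=None, dist_ctor=BoundedDistParams):
    """Transform raw outputs and build the distribution via dist_ctor."""
    loc = math.tanh(raw_loc)        # keeps loc within [-1, 1]
    scale = softplus(raw_scale)     # keeps scale positive
    if min_scale is not None:
        scale = max(scale, min_scale)   # assumed clamping
    if max_scale is not None:
        scale = min(scale, max_scale)   # assumed clamping
    return dist_ctor(loc=loc, scale=scale,
                     lower_bound=lower_bound, upper_bound=upper_bound)


print(project(0.0, 0.0, -1.0, 1.0, max_scale=0.5))
```

Any callable that accepts loc, scale, lower_bound and upper_bound as keywords could be substituted for the stand-in.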
- forward(inputs, state=())[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool#
alf.networks.q_networks#
QNetworks
- class ParallelQNetwork(q_network, n, name='ParallelQNetwork')[source]#
Bases: alf.networks.network.Network
Perform n Q-value computations in parallel.
- Parameters
q_network (QNetwork) – non-parallelized q network
n (int) – make n replicas from q_network with different parameter initializations.
name (str) –
- forward(inputs, state=())[source]#
Compute action values given an observation.
- Parameters
inputs (nest) – consistent with input_tensor_spec.
state – empty, for API consistency with QRNNNetwork.
- Returns
action_value (Tensor): a tensor of shape \([B,n,k]\), where \(B\) is the batch size, \(n\) is the num of replicas, and \(k\) is the number of actions.
state: empty
- Return type
tuple
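The \([B,n,k]\) output layout can be pictured with nested lists: one row per batch element, one entry per replica, and one value per action. The sketch below uses toy linear Q heads in plain Python; make_q_head and parallel_q_values are hypothetical helpers and the looped evaluation is for illustration only.

```python
import random


def make_q_head(obs_dim, num_actions, rng):
    """One replica: a randomly initialized linear map from obs to Q-values."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(obs_dim)]
               for _ in range(num_actions)]

    def q_head(obs):
        return [sum(w * o for w, o in zip(row, obs)) for row in weights]

    return q_head


def parallel_q_values(heads, batch):
    """Return a nested list indexed as [batch][replica][action]."""
    return [[head(obs) for head in heads] for obs in batch]


rng = random.Random(0)
B, n, k, obs_dim = 4, 3, 2, 5
heads = [make_q_head(obs_dim, k, rng) for _ in range(n)]  # different inits
batch = [[rng.uniform(-1.0, 1.0) for _ in range(obs_dim)] for _ in range(B)]
q = parallel_q_values(heads, batch)
print(len(q), len(q[0]), len(q[0][0]))  # B, n, k
```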
- property state_spec#
Return the state spec of the q network. It is simply the state spec of the encoding network.
- training: bool#
- class QNetwork(input_tensor_spec, action_spec, input_preprocessors=None, preprocessing_combiner=None, conv_layer_params=None, fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, use_naive_parallel_network=False, name='QNetwork')[source]#
Bases: alf.networks.q_networks.QNetworkBase
Create an instance of QNetwork.
Creates an instance of QNetwork for estimating action-value of discrete actions. The action-value is defined as the expected return starting from the given input observation and taking the given action. It takes observation as input and outputs an action-value tensor with the shape of [batch_size, num_of_actions].
- Parameters
input_tensor_spec (TensorSpec) – the tensor spec of the input
action_spec (TensorSpec) – the tensor spec of the action
input_preprocessors (nested Network|nn.Module|None) – a nest of input preprocessors, each of which will be applied to the corresponding input. If not None, then it must have the same structure as input_tensor_spec (after reshaping). If any element is None, then it will be treated as math_ops.identity. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes the format (filters, kernel_size, strides, padding), where padding is optional.
fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes.
activation (nn.functional) – activation used for hidden layers. The last layer will not be activated.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If none is provided, a default variance_scaling_initializer will be used.
use_naive_parallel_network (bool) – if True, will use NaiveParallelNetwork when make_parallel is called. This might be useful in cases when the NaiveParallelNetwork has an advantage in terms of speed over ParallelNetwork. You have to test to see which way is faster for your particular situation.
- training: bool#
- class QNetworkBase(input_tensor_spec, action_spec, encoding_network_ctor, use_naive_parallel_network=False, name='QNetworkBase', **encoder_kwargs)[source]#
Bases: alf.networks.network.Network
A base class for QNetwork and QRNNNetwork.
Can also be used to create customized value networks by providing different encoding network creators.
- Parameters
input_tensor_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – the tensor spec of the input
action_spec (BoundedTensorSpec) – the tensor spec of the action
encoding_network_ctor (Callable) – the creator of the encoding network that does the heavy lifting of the q network.
use_naive_parallel_network (bool) – if True, will use NaiveParallelNetwork when make_parallel is called. This might be useful in cases when the NaiveParallelNetwork has an advantage in terms of speed over ParallelNetwork. You have to test to see which way is faster for your particular situation.
name (str) – name of the network
encoder_kwargs – the extra keyword arguments to the encoding network
- forward(observation, state=())[source]#
Computes action values given an observation.
- Parameters
observation (nest) – consistent with input_tensor_spec
state – empty, for API consistency with QRNNNetwork
- Returns
action_value (torch.Tensor): a tensor of the size [batch_size, num_actions]
state: empty
- Return type
tuple
- make_parallel(n)[source]#
Create a ParallelQNetwork using n replicas of self. The initialized network parameters will be different. If use_naive_parallel_network is True, use NaiveParallelNetwork to create the parallel network.
- property state_spec#
Return the state spec of the q network. It is simply the state spec of the encoding network.
- training: bool#
- class QRNNNetwork(input_tensor_spec, action_spec, input_preprocessors=None, preprocessing_combiner=None, conv_layer_params=None, fc_layer_params=None, lstm_hidden_size=100, value_fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, use_naive_parallel_network=False, name='QRNNNetwork')[source]#
Bases: alf.networks.q_networks.QNetworkBase
Create an RNN-based network that outputs temporally correlated q-values.
Creates an instance of QRNNNetwork for estimating action-value of discrete actions. The action-value is defined as the expected return starting from the given inputs (observation and state) and taking the given action. It takes observation and state as input and outputs an action-value tensor with the shape of [batch_size, num_of_actions].
- Parameters
input_tensor_spec (TensorSpec) – the tensor spec of the input
action_spec (BoundedTensorSpec) – the tensor spec of the action
input_preprocessors – a nest of input preprocessors, each of which will be applied to the corresponding input. If not None, then it must have the same structure as input_tensor_spec (after reshaping). If any element is None, then it will be treated as math_ops.identity. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional.
fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layers for encoding the observation.
lstm_hidden_size (int or tuple[int]) – the hidden size(s) of the LSTM cell(s). Each size corresponds to a cell. If there are multiple sizes, then lstm cells are stacked.
value_fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layers that are applied after the lstm cell’s output.
activation (nn.functional) – activation used for hidden layers. The last layer will not be activated.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If none is provided a default variance_scaling_initializer will be used.
use_naive_parallel_network (bool) – if True, will use NaiveParallelNetwork when make_parallel is called. This might be useful in cases when the NaiveParallelNetwork has an advantage in terms of speed over ParallelNetwork. You have to test to see which way is faster for your particular situation.
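To make the "int or tuple[int]" convention for lstm_hidden_size concrete, the following sketch normalizes the argument and lists the (input_size, hidden_size) pair of each stacked cell. It assumes the usual meaning of stacking, where each cell consumes the output of the one below it. The helper names are hypothetical and not part of ALF.

```python
def normalize_hidden_sizes(lstm_hidden_size):
    """Accept an int or a tuple of ints and always return a tuple."""
    if isinstance(lstm_hidden_size, int):
        return (lstm_hidden_size,)
    return tuple(lstm_hidden_size)


def stacked_lstm_layout(input_size, lstm_hidden_size):
    """Return (input_size, hidden_size) for each stacked cell, bottom to top."""
    layout = []
    for hidden in normalize_hidden_sizes(lstm_hidden_size):
        layout.append((input_size, hidden))
        input_size = hidden  # the next cell consumes this cell's output
    return layout


single = stacked_lstm_layout(64, 100)
stacked = stacked_lstm_layout(64, (128, 32))
```

With a single int there is one cell. With (128, 32) there are two cells, and the second one sees a 128-dimensional input.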
- training: bool#
alf.networks.relu_mlp#
- class ReluMLP(input_tensor_spec, output_size=None, hidden_layers=(64, 64), name='ReluMLP')[source]#
Bases: alf.networks.network.Network
An MLP with ReLU activations. Diagonals of the input-output Jacobian can be computed directly without calling autograd.
Create a ReluMLP.
- Parameters
input_tensor_spec (TensorSpec) –
output_size (int) – output dimension.
hidden_layers (tuple) – size of hidden layers.
name (str) –
- compute_jac(inputs, output_partial_idx=None)[source]#
Compute the input-output Jacobian, support partial output.
- Parameters
inputs (Tensor) – size (self._input_size) or (B, self._input_size)
output_partial_idx (list) – list of output indices for taking partial output-input Jacobian. Default is None, where standard full output-input Jacobian will be used.
- Returns
shape (out_size, in_size) or (B, out_size, in_size), where out_size is self._output_size if output_partial_idx is None, len(output_partial_idx) otherwise.
- Return type
Jacobian (Tensor)
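Why no autograd is needed can be seen from a small, dependency-free sketch. For a ReLU MLP the Jacobian at a given input is the product of the weight matrices with the rows of inactive units zeroed out. The sketch assumes ReLU on hidden layers and a linear last layer, uses nested lists instead of tensors, handles a single unbatched input, and is not ALF's implementation.

```python
def matvec(w, x):
    return [sum(a * b for a, b in zip(row, x)) for row in w]


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def forward(weights, biases, x):
    """ReLU on hidden layers, identity on the last layer; also returns masks."""
    masks = []
    for k, (w, b) in enumerate(zip(weights, biases)):
        pre = [p + q for p, q in zip(matvec(w, x), b)]
        if k < len(weights) - 1:
            masks.append([1.0 if p > 0 else 0.0 for p in pre])
            x = [max(p, 0.0) for p in pre]
        else:
            x = pre
    return x, masks


def jacobian(weights, biases, x, output_partial_idx=None):
    """J = W_L diag(m_{L-1}) W_{L-1} ... diag(m_1) W_1, where m_k marks active units."""
    _, masks = forward(weights, biases, x)
    jac = weights[0]
    for w, mask in zip(weights[1:], masks):
        masked = [[m * v for v in row] for m, row in zip(mask, jac)]
        jac = matmul(w, masked)
    if output_partial_idx is not None:
        jac = [jac[i] for i in output_partial_idx]  # keep selected output rows
    return jac


# Example weights: 2 inputs -> 3 hidden units -> 2 outputs.
weights = [[[1.0, -2.0], [0.5, 1.0], [-1.0, 1.0]],
           [[1.0, 2.0, 3.0], [0.0, -1.0, 1.0]]]
biases = [[0.1, -0.2, 0.3], [0.0, 0.5]]
x = [0.4, 0.3]
jac = jacobian(weights, biases, x)
partial = jacobian(weights, biases, x, output_partial_idx=[1])
```

Selecting rows with output_partial_idx mirrors the partial-output behavior described above: the result has len(output_partial_idx) rows instead of out_size.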
- compute_jvp(inputs, vec, output_partial_idx=None)[source]#
Compute Jacobian-vector product, support partial output-input Jacobian.
- Parameters
inputs (Tensor) – size (self._input_size) or (B, self._input_size)
vec (Tensor) – the vector for which the Jacobian-vector product is computed. Must be of size (self._input_size) or (B, self._input_size).
output_partial_idx (list) – list of output indices for taking partial output-input Jacobian. Default is None, where standard full output-input Jacobian will be used.
- Returns
shape (out_size) or (B, out_size), where out_size is self._output_size if output_partial_idx is None, len(output_partial_idx) otherwise.
outputs (Tensor): outputs of the ReluMLP
- Return type
jvp (Tensor)
- compute_vjp(inputs, vec, output_partial_idx=None)[source]#
Compute vector-Jacobian product, support partial output-input Jacobian.
- Parameters
inputs (Tensor) – size (self._input_size) or (B, self._input_size)
vec (Tensor) – the vector for which the vector-Jacobian product is computed. Must be of size (self._output_size) or (B, self._output_size).
output_partial_idx (list) – list of output indices for taking partial output-input Jacobian. Default is None, where standard full output-input Jacobian will be used.
- Returns
shape (self._input_size) or (B, self._input_size).
outputs (Tensor): outputs of the ReluMLP
- Return type
vjp (Tensor)
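The two products differ only in the direction in which the vector travels through the layers, and neither needs the full Jacobian. The sketch below shows this for a single unbatched input, under the same assumptions as a plain ReLU MLP with a linear last layer. It uses nested lists, hypothetical helper names, and is not ALF's implementation. The shapes match the descriptions above: the jvp input has input size, the vjp input has output size.

```python
def matvec(w, x):
    return [sum(a * b for a, b in zip(row, x)) for row in w]


def vecmat(v, w):
    # Row vector times matrix: v^T W.
    return [sum(a * b for a, b in zip(v, col)) for col in zip(*w)]


def relu_masks(weights, biases, x):
    """Run the hidden layers and record which ReLU units are active."""
    masks = []
    for w, b in zip(weights[:-1], biases[:-1]):
        pre = [p + q for p, q in zip(matvec(w, x), b)]
        masks.append([1.0 if p > 0 else 0.0 for p in pre])
        x = [max(p, 0.0) for p in pre]
    return masks


def jvp(weights, biases, x, vec):
    """J @ vec: push an input-sized vector forward through the layers."""
    masks = relu_masks(weights, biases, x)
    for w, mask in zip(weights[:-1], masks):
        vec = [m * t for m, t in zip(mask, matvec(w, vec))]
    return matvec(weights[-1], vec)


def vjp(weights, biases, x, vec):
    """vec^T @ J: pull an output-sized vector backward through the layers."""
    masks = relu_masks(weights, biases, x)
    vec = vecmat(vec, weights[-1])
    for w, mask in zip(reversed(weights[:-1]), reversed(masks)):
        vec = vecmat([m * t for m, t in zip(mask, vec)], w)
    return vec


# Example weights: 2 inputs -> 3 hidden units -> 2 outputs.
weights = [[[1.0, -2.0], [0.5, 1.0], [-1.0, 1.0]],
           [[1.0, 2.0, 3.0], [0.0, -1.0, 1.0]]]
biases = [[0.1, -0.2, 0.3], [0.0, 0.5]]
x = [0.4, 0.3]
v, u = [1.0, 2.0], [1.0, -1.0]
forward_product = jvp(weights, biases, x, v)
backward_product = vjp(weights, biases, x, u)
# Both sides equal u^T J v, which gives a cheap consistency check.
lhs = sum(a * b for a, b in zip(u, forward_product))
rhs = sum(a * b for a, b in zip(backward_product, v))
```

Comparing lhs and rhs is a common way to test a hand-written jvp/vjp pair against each other.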
- forward(inputs, state=(), requires_jac=False, requires_jac_diag=False)[source]#
- Parameters
inputs (torch.Tensor) –
state – not used
requires_jac (bool) – whether to output the input-output Jacobian.
requires_jac_diag (bool) – whether to output the diagonals of the Jacobian.
- training: bool#
- class SimpleFC(input_size, output_size, activation=<function identity>)[source]#
Bases: torch.nn.modules.linear.Linear
A simple FC layer that records its output before activation. It is used in the ReluMLP to enable explicit computation of diagonals of the input-output Jacobian.
Initialize a SimpleFC layer.
- Parameters
input_size (int) – input dimension.
output_size (int) – output dimension.
activation (nn.functional) – activation used for this layer. Default is math_ops.identity.
- forward(inputs)[source]#
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- in_features: int#
- out_features: int#
- weight: torch.Tensor#
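The reason for recording the pre-activation output can be sketched in a few lines: once the value before the activation is kept, the ReLU derivative (a 0/1 mask) is available without autograd. RecordingFC and its pre_activation attribute are hypothetical names used only for this dependency-free illustration. The real class subclasses torch.nn.Linear and may store the value differently.

```python
class RecordingFC:
    """Hypothetical analogue of an FC layer that keeps its pre-activation."""

    def __init__(self, weight, bias, activation=lambda z: z):
        self.weight = weight          # [output_size][input_size]
        self.bias = bias              # [output_size]
        self.activation = activation  # identity by default
        self.pre_activation = None    # filled in by each call

    def __call__(self, inputs):
        self.pre_activation = [
            sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(self.weight, self.bias)
        ]
        return [self.activation(z) for z in self.pre_activation]


layer = RecordingFC(
    weight=[[1.0, -1.0], [2.0, 0.5]],
    bias=[0.0, -1.0],
    activation=lambda z: max(z, 0.0),  # ReLU
)
output = layer([1.0, 2.0])
# Derivative of ReLU at the recorded pre-activations.
relu_grad = [1.0 if z > 0 else 0.0 for z in layer.pre_activation]
```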
alf.networks.transformer_networks#
- class SocialAttentionNetwork(input_tensor_spec, input_preprocessors=None, preprocessing_combiner=None, fc_layer_params=(128, 128), activation=<built-in method relu_ of type object>, kernel_initializer=None, use_fc_bn=False, num_of_heads=1, last_layer_size=None, last_activation=None, last_kernel_initializer=None, name='SocialAttentionNetwork')[source]#
Bases: alf.networks.preprocessor_networks.PreprocessorNetwork
Simple graph encoding network, which takes as input a set of objects and outputs one encoded feature vector.
Reference: Leurent et al “Social Attention for Autonomous Decision-Making in Dense Traffic”, arXiv:1911.12250
- Parameters
input_tensor_spec (nested TensorSpec) – the (nested) tensor spec of the input. If nested, then preprocessing_combiner must not be None.
input_preprocessors (nested InputPreprocessor) – a nest of InputPreprocessor, each of which will be applied to the corresponding input. If not None, then it must have the same structure with input_tensor_spec. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
fc_layer_params (tuple[int]) – a tuple of integers representing FC layer sizes for generating embeddings.
activation (nn.functional) – activation used for all the layers but the last layer.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If None, a variance_scaling_initializer will be used.
use_fc_bn (bool) – whether to use Batch Normalization for fc layers.
num_of_heads (int) – number of heads for the multi-head attention
last_layer_size (None) – not used; for interface compatibility
last_activation (None) – not used; for interface compatibility
last_kernel_initializer (None) – not used; for interface compatibility
last_use_fc_bn (None) – not used; for interface compatibility
name (str) –
- forward(inputs, state=())[source]#
- Parameters
inputs (Tensor) – with the shape of [B, N, d], where B denotes batch size, N the number of entities, and d the feature dimension
state (nested Tensor) – states
- Returns
shape is [B, d’], where d’ denotes the output dimension of the last layer specified by fc_layer_params (i.e. fc_layer_params[-1])
- Return type
Tensor
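How a set of N entity embeddings can collapse into a single vector is easiest to see in a stripped-down attention sketch. The version below assumes that entity 0 plays the role of the ego entity and supplies the only query, and that the embeddings double as keys and values (no learned projections, one head). These are simplifications for illustration and are not taken from this page or from ALF's code.

```python
import math


def ego_attention(entities):
    """Pool [N, d] entity embeddings into one [d] vector.

    Assumption: entity 0 is the ego entity and supplies the only query.
    """
    query = entities[0]
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in entities]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * value[i] for w, value in zip(weights, entities))
            for i in range(d)]


def batched_ego_attention(inputs):
    # inputs: [B, N, d] -> [B, d]
    return [ego_attention(sample) for sample in inputs]


batch = [
    [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]],
    [[0.5, 0.5], [1.0, -1.0], [0.0, 0.0]],
]
pooled = batched_ego_attention(batch)
```

Because the output is a softmax-weighted sum, reordering the non-ego entities leaves the result unchanged up to rounding, which is the property that makes this kind of pooling suitable for sets of objects.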
- training: bool#
- class TransformerNetwork(input_tensor_spec, num_prememory_layers, num_attention_heads, d_ff=None, core_size=1, use_core_embedding=True, memory_size=0, num_memory_layers=0, return_core_only=True, centralized_memory=True, input_preprocessors=None, name='TransformerNetwork')[source]#
Bases: alf.networks.preprocessor_networks.PreprocessorNetwork
A Network composed of Memory and TransformerBlock.
The following is the pseudocode for the computation:
    for i in range(num_prememory_layers):
        core, inputs = T_i([core, inputs], [core, inputs])
    for j in range(num_memory_layers):
        new_core, inputs = TM_j([memory_j, core, inputs], [core, inputs])
        memory_j.write(core)
        core = new_core
    return core, new_memory_state
where T_i denotes the TransformerBlock for the i-th prememory layer and TM_j denotes the TransformerBlock for the j-th memory layer. memory_j is an FIFOMemory object (not to be confused with the memory argument of the TransformerBlock.forward() function).
The core embedding serves the same purpose as [CLS] in the BERT model in [1], which is to generate a fixed dimensional representation for downstream tasks. Different from BERT, which only has one [CLS] embedding, we allow the option of having multiple core embeddings. In addition to generating a fixed dimensional representation, the core embedding is also used to update the memory.
[1]. Devlin et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Parameters
input_tensor_spec (nested TensorSpec) – the (nested) tensor spec of the input. If input_tensor_spec is not nested, it should represent a rank-2 tensor of shape [input_size, d_model], where input_size is the length of the input sequence, and d_model is the dimension of embedding.
num_prememory_layers (int) – number of TransformerBlock calculations without using memory
num_attention_heads (int) – number of attention heads for each TransformerBlock
d_ff (int) – the size of the hidden layer of the feedforward network in each TransformerBlock. If None, TransformerBlock will calculate it as 4*d_model.
memory_size (int) – size of memory.
num_memory_layers (int) – number of TransformerBlock calculations using memory
return_core_only (bool) – If True, only return the core embedding. Otherwise, return all embeddings
core_size (int) – size of core (i.e. number of embeddings of core)
use_core_embedding (bool) – whether to use learnable core embedding. If True, will use additional learnable core embedding to augment the input. If False, the first core_size embeddings of the input are treated as core.
centralized_memory (bool) – if False, there will be a separate memory for each memory layer. If True, there will be a single memory for all the memory layers and it is updated using the last core embeddings.
input_preprocessors (nested Network|nn.Module) – a nest of stateless preprocessor networks, each of which will be applied to the corresponding input. If not None, then it must have the same structure with input_tensor_spec. If any element is None, then it will be treated as math_ops.identity. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector. The output_spec of each input preprocessor i should be [input_size_i, d_model]. The result of all the preprocessors will be concatenated as a Tensor of shape [batch_size, input_size, d_model], where input_size = sum_i input_size_i.
- forward(inputs, state=())[source]#
- Parameters
inputs (nested Tensor) – consistent with input_tensor_spec provided at __init__()
state (nested Tensor) – states
- Returns
- shape is [B, core_size * d_model] if return_core_only, and [B, core_size + input_size, d_model] if not return_core_only, where input_size is the number of embeddings from the (processed) input.
- nested Tensor: network states.
- Return type
Tensor
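The two output shapes described above are easy to mix up, so here is the shape rule written out as a small helper. The function name is hypothetical and it only restates the documented shapes; it does not call into ALF.

```python
def transformer_output_shape(batch_size, core_size, input_size, d_model,
                             return_core_only):
    """Shape of the first returned element, as documented for forward()."""
    if return_core_only:
        # Core embeddings are flattened into one vector per sample.
        return (batch_size, core_size * d_model)
    # Core and input embeddings are kept as a sequence.
    return (batch_size, core_size + input_size, d_model)


core_only = transformer_output_shape(32, 2, 10, 64, return_core_only=True)
everything = transformer_output_shape(32, 2, 10, 64, return_core_only=False)
```

Note the asymmetry: the core-only output is rank 2 (flattened), while the full output stays rank 3.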
- property state_spec#
Return the state spec to be used by an Algorithm.
Subclass should override this to return the correct state_spec.
- training: bool#
alf.networks.value_networks#
ValueNetwork and ValueRNNNetwork.
- class ParallelValueNetwork(value_network, n, name='ParallelValueNetwork')[source]#
Bases: alf.networks.network.Network
Perform n value computations in parallel.
It creates a parallelized version of value_network.
- Parameters
value_network (ValueNetwork) – non-parallelized value network
n (int) – make n replicas from value_network with different initialization.
name (str) –
- forward(observation, state=())[source]#
Computes values given a batch of observations.
- Parameters
inputs (tuple) – A tuple of Tensors consistent with input_tensor_spec.
state (tuple) – Empty, for API consistency with ValueRNNNetwork.
- property state_spec#
Return the state spec of the value network. It is simply the state spec of the encoding network.
- training: bool#
- class ValueNetwork(input_tensor_spec, output_tensor_spec=TensorSpec(shape=(), dtype=torch.float32), input_preprocessors=None, preprocessing_combiner=None, conv_layer_params=None, fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, use_fc_bn=False, name='ValueNetwork')[source]#
Bases: alf.networks.value_networks.ValueNetworkBase
Output temporally uncorrelated values.
Creates a value network that estimates the expected return.
- Parameters
input_tensor_spec (TensorSpec) – the tensor spec of the input
output_tensor_spec (TensorSpec) – spec for the output
input_preprocessors (nested Network|nn.Module|None) – a nest of input preprocessors, each of which will be applied to the corresponding input. If not None, then it must have the same structure with input_tensor_spec (after reshaping). If any element is None, then it will be treated as math_ops.identity. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional.
fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layer sizes.
activation (nn.functional) – activation used for hidden layers. The last layer will not be activated.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If none is provided a default xavier_uniform initializer will be used.
use_fc_bn (bool) – whether to use Batch Normalization for the internal FC layers (i.e. FC layers besides the last one).
name (str) –
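To illustrate the conv_layer_params format, the sketch below walks a (channels, height, width) shape through a stack of layers given as (filters, kernel_size, strides, padding) tuples. It assumes integer kernel sizes and strides, a padding of 0 when the fourth entry is omitted, and the standard convolution output-size formula. The helper name is hypothetical.

```python
def conv_output_shape(input_shape, conv_layer_params):
    """Propagate a (channels, height, width) shape through conv layers."""
    channels, height, width = input_shape
    for params in conv_layer_params:
        filters, kernel_size, strides = params[:3]
        padding = params[3] if len(params) > 3 else 0  # assumed default
        height = (height + 2 * padding - kernel_size) // strides + 1
        width = (width + 2 * padding - kernel_size) // strides + 1
        channels = filters
    return channels, height, width


# Three layers without padding, then one layer with explicit padding.
no_padding = conv_output_shape((4, 84, 84), ((32, 8, 4), (64, 4, 2), (64, 3, 1)))
with_padding = conv_output_shape((3, 32, 32), ((16, 3, 1, 1),))
```

The flattened size of the result (channels * height * width) is what the first FC layer would receive.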
- training: bool#
- class ValueNetworkBase(input_tensor_spec, output_tensor_spec, encoding_network_ctor, name='ValueNetworkBase', **encoder_kwargs)[source]#
Bases: alf.networks.network.Network
A base class for ValueNetwork and ValueRNNNetwork.
Can also be used to create customized value networks by providing different encoding network creators.
- Parameters
input_tensor_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – the tensor spec of the input.
output_tensor_spec (Union[TensorSpec,List[ForwardRef],Tuple[()],Tuple[ForwardRef, …],Dict[str,ForwardRef]]) – spec for the value output.
encoding_network_ctor (Callable) – the creator of the encoding network that does the heavy lifting of the value network.
name – name of the network
encoder_kwargs – the extra keyword arguments to the encoding network
- forward(observation, state=())[source]#
Computes a value given an observation.
- Parameters
observation (torch.Tensor) – consistent with input_tensor_spec
state – empty, for API consistency with ValueRNNNetwork
- Returns
a 1D tensor
state: empty
- Return type
value (torch.Tensor)
- make_parallel(n)[source]#
Create a ParallelValueNetwork using n replicas of self. The initialized network parameters will be different.
- property state_spec#
Return the state spec of the value network. It is simply the state spec of the encoding network.
- training: bool#
- class ValueRNNNetwork(input_tensor_spec, output_tensor_spec=TensorSpec(shape=(), dtype=torch.float32), input_preprocessors=None, preprocessing_combiner=None, conv_layer_params=None, fc_layer_params=None, lstm_hidden_size=100, value_fc_layer_params=None, activation=<built-in method relu_ of type object>, kernel_initializer=None, name='ValueRNNNetwork')[source]#
Bases: alf.networks.value_networks.ValueNetworkBase
Outputs temporally correlated values.
Creates an instance of ValueRNNNetwork.
- Parameters
input_tensor_spec (TensorSpec) – the tensor spec of the input
output_tensor_spec (TensorSpec) – spec for the output
input_preprocessors (nested Network|nn.Module|None) – a nest of input preprocessors, each of which will be applied to the corresponding input. If not None, then it must have the same structure with input_tensor_spec (after reshaping). If any element is None, then it will be treated as math_ops.identity. This arg is helpful if you want to have separate preprocessings for different inputs by configuring a gin file without changing the code. For example, embedding a discrete input before concatenating it to another continuous vector.
preprocessing_combiner (NestCombiner) – preprocessing called on complex inputs. Note that this combiner must also accept input_tensor_spec as the input to compute the processed tensor spec. For example, see alf.nest.utils.NestConcat. This arg is helpful if you want to combine inputs by configuring a gin file without changing the code.
conv_layer_params (tuple[tuple]) – a tuple of tuples where each tuple takes a format (filters, kernel_size, strides, padding), where padding is optional.
fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layers for encoding the observation.
lstm_hidden_size (int or tuple[int]) – the hidden size(s) of the LSTM cell(s). Each size corresponds to a cell. If there are multiple sizes, then lstm cells are stacked.
value_fc_layer_params (tuple[int]) – a tuple of integers representing hidden FC layers that are applied after the lstm cell’s output.
activation (nn.functional) – activation used for hidden layers. The last layer will not be activated.
kernel_initializer (Callable) – initializer for all the layers but the last layer. If none is provided a default xavier_uniform initializer will be used.
name (str) –
- training: bool#