Training#
Configuration reference for jaqmc hall train.
This page shows the effective defaults for the train workflow preset. Use
--dry-run to see the resolved config for your run, or add
workflow.config.verbose=true to include field descriptions. Keys use the same
dot notation as CLI overrides, such as train.run.iterations=5000. Defaults
are resolved in this order, with each later source overriding the earlier ones:
schema defaults, workflow preset, YAML config, then CLI overrides. For
evaluation config, see Evaluation.
Root-level runtime keys such as logging.*, jax.*, and distributed.* are
shared by all commands. See Runtime Configuration.
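The resolution order and dot notation can be illustrated with a small sketch. This is not jaqmc's implementation: the helper names (deep_merge, set_by_dotted_key, resolve_config) and the sample values are hypothetical, and the sketch assumes that each layer is a nested mapping and that later layers win.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a new dict in which values from `override` win, recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def set_by_dotted_key(config, dotted_key, value):
    """Set config["a"]["b"]["c"] = value given the dotted key "a.b.c"."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value


def resolve_config(schema_defaults, preset, yaml_config, cli_overrides):
    """Apply the four layers from lowest to highest precedence."""
    resolved = deep_merge(schema_defaults, preset)
    resolved = deep_merge(resolved, yaml_config)
    for dotted_key, value in cli_overrides.items():
        set_by_dotted_key(resolved, dotted_key, value)
    return resolved


# Illustrative values only; these are not jaqmc's real defaults.
schema_defaults = {"train": {"run": {"iterations": 100, "burn_in": 10}}}
preset = {"train": {"run": {"iterations": 1000}}}
yaml_config = {"train": {"run": {"burn_in": 50}}}
cli_overrides = {"train.run.iterations": 5000}

resolved = resolve_config(schema_defaults, preset, yaml_config, cli_overrides)
print(resolved)
```

In this sketch the CLI value for train.run.iterations replaces the preset value, while train.run.burn_in keeps the YAML value because no later layer sets it.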
Workflow (workflow.*)#
These keys control workflow-level settings shared across all stages.
workflow.seed
Fixed random seed.
workflow.batch_size
Number of walkers (samples) to use in each iteration.
workflow.save_path
Path to save checkpoints and logs.
workflow.restore_path
Path to restore checkpoints from.
workflow.config.ignore_extra
If True, silently ignore unrecognized config keys.
workflow.config.verbose
If True, print the fully resolved config with field descriptions at startup.
System (system.*)#
Defines the quantum Hall system on the Haldane sphere.
See Quantum Hall for physics background and usage examples.
system.flux
Magnetic flux \(2Q\) (positive integer).
system.nspins
(n_up, n_down) electron counts.
system.radius
Sphere radius.
system.interaction_type
Interaction potential form.
system.interaction_strength
Scaling factor for the potential energy.
system.lz_center
Target \(L_z\) for the penalty method.
system.lz_penalty
Penalty strength for \((L_z - L_{z,0})^2\).
system.l2_penalty
Penalty strength for \(L^2\).
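Read literally, the three penalty keys suggest a loss of the following shape. This is a sketch under assumptions: the function name penalized_loss is hypothetical, and whether jaqmc applies the penalties to expectation values or per sample is not stated here, so scalar inputs are used for simplicity.

```python
def penalized_loss(total_energy, lz, l2, lz_center, lz_penalty, l2_penalty):
    """Energy plus a quadratic L_z penalty and an L^2 penalty (assumed form)."""
    lz_term = lz_penalty * (lz - lz_center) ** 2  # pulls L_z toward lz_center
    l2_term = l2_penalty * l2                     # pushes L^2 toward zero
    return total_energy + lz_term + l2_term


# Illustrative numbers only.
loss = penalized_loss(
    total_energy=-3.5, lz=0.5, l2=2.0,
    lz_center=0.0, lz_penalty=1.0, l2_penalty=0.1,
)
print(loss)
```

With both penalty strengths set to zero the function returns the plain energy, which matches the case where no penalty is applied.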
Wavefunction (wf.*)#
Selects and configures the neural network ansatz.
Default module selection: mhpo. Effective defaults for the built-in architectures are listed below. Built-in choices are mhpo, laughlin, and free.
See Quantum Hall for background on each architecture.
MHPO options (wf.*)#
wf.ndets
Number of determinants.
wf.num_heads
Number of attention heads.
wf.heads_dim
Dimension of each attention head.
wf.num_layers
Number of Psiformer layers.
wf.flux_per_elec
Flux quanta attached per electron for composite fermions.
Laughlin options (wf.*)#
wf.cf_flux
Composite fermion flux attachment parameter \(p\).
Free options (wf.*)#
Train Stage (train.*)#
The VMC optimization loop. Samples electron configurations on the Haldane sphere, computes energy (and optional angular momentum penalties), and updates wavefunction parameters.
Run options (train.run.*)#
train.run.check_vma
Enable JAX validity checks during shard_map.
train.run.iterations
Total number of iterations to run.
train.run.burn_in
Sampling iterations to discard before the main loop for MCMC equilibration.
train.run.save_time_interval
Minimum wall-clock seconds between checkpoint saves.
train.run.save_step_interval
Save checkpoints only at steps that are multiples of this value.
train.run.stop_on_nan
Abort training when NaN is detected in step statistics. True checks all stat keys, False disables the check, or pass a comma-separated string of specific keys to monitor (e.g. "loss").
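The three accepted forms of train.run.stop_on_nan can be sketched as follows. The function names and the stat keys in the example are hypothetical, and the sketch assumes step statistics arrive as a flat mapping of name to float; jaqmc's actual handling may differ.

```python
import math


def keys_to_check(stop_on_nan, stats):
    """Translate the stop_on_nan setting into a list of stat keys to monitor."""
    if stop_on_nan is True:
        return list(stats)  # check every stat key
    if stop_on_nan is False or stop_on_nan is None:
        return []           # check disabled
    # Otherwise treat the value as a comma-separated string of key names.
    return [key.strip() for key in str(stop_on_nan).split(",") if key.strip()]


def should_abort(stop_on_nan, stats):
    """Return True if any monitored stat present in `stats` is NaN."""
    monitored = keys_to_check(stop_on_nan, stats)
    return any(math.isnan(stats[key]) for key in monitored if key in stats)


# Hypothetical step statistics with one NaN entry.
step_stats = {"loss": -3.2, "pmove": float("nan")}
print(should_abort(True, step_stats))
print(should_abort("loss", step_stats))
```

Restricting the check to specific keys is useful when an auxiliary statistic can legitimately be NaN without invalidating the run.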
Optimizer (train.optim.*)#
Default optimizer module: kfac. Effective defaults for the built-in optimizers are listed below.
KFAC options#
train.optim.learning_rate
The learning rate. Swappable component; the nested keys below are the options for the current module Standard and change when train.optim.learning_rate.module changes.
train.optim.learning_rate.module
Select the implementation used for this component.
train.optim.learning_rate.rate
Initial learning rate.
train.optim.learning_rate.delay
Delay in steps before decay starts.
train.optim.learning_rate.decay
Decay rate exponent.
train.optim.norm_constraint
The update is scaled down so that its approximate squared Fisher norm \(v^T F v\) is at most the specified value.
train.optim.curvature_ema
Decay factor used when calculating the covariance estimate moving averages.
train.optim.l2_reg
The L2 regularization coefficient you are using, so that the optimizer can account for it.
train.optim.inverse_update_period
Number of steps in between updating the inverse curvature approximation.
train.optim.damping
Fixed damping parameter.
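The effect of train.optim.norm_constraint can be sketched with plain lists. This is a simplified illustration rather than the optimizer's code: the function name is hypothetical, the Fisher-vector product is passed in precomputed, and any learning-rate factor the real optimizer may fold into the norm is omitted.

```python
import math


def constrain_update(update, fisher_vector_product, norm_constraint):
    """Scale `update` so that its squared Fisher norm v^T F v is at most the constraint.

    `fisher_vector_product` is F v for the same v, supplied by the caller.
    """
    sq_norm = sum(v * fv for v, fv in zip(update, fisher_vector_product))
    if sq_norm <= norm_constraint:
        return list(update)  # already within the constraint
    scale = math.sqrt(norm_constraint / sq_norm)
    return [scale * v for v in update]


# With F equal to the identity, F v = v and the Fisher norm is the Euclidean norm.
update = [3.0, 4.0]
constrained = constrain_update(update, update, norm_constraint=1.0)
print(constrained)
```

Because the squared norm is quadratic in the update, scaling by the square root of the ratio brings it exactly to the constraint value.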
SR options#
train.optim.learning_rate
Step size (scalar or schedule). Swappable component; the nested keys below are the options for the current module Standard and change when train.optim.learning_rate.module changes.
train.optim.learning_rate.module
Select the implementation used for this component.
train.optim.learning_rate.rate
Initial learning rate.
train.optim.learning_rate.delay
Delay in steps before decay starts.
train.optim.learning_rate.decay
Decay rate exponent.
train.optim.max_norm
Constrained update norm C (scalar or schedule). Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.max_norm.module changes.
train.optim.max_norm.module
Select the implementation used for this component.
train.optim.max_norm.rate
The constant rate.
train.optim.damping
Damping lambda (scalar or schedule). Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.damping.module changes.
train.optim.damping.module
Select the implementation used for this component.
train.optim.damping.rate
The constant rate.
train.optim.max_cond_num
Maximum condition number for adaptive damping.
train.optim.spring_mu
SPRING momentum coefficient mu (scalar or schedule). Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.spring_mu.module changes.
train.optim.spring_mu.module
Select the implementation used for this component.
train.optim.spring_mu.rate
The constant rate.
train.optim.march_beta
Decay factor for the MARCH variance accumulator (scalar or schedule). Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.march_beta.module changes.
train.optim.march_beta.module
Select the implementation used for this component.
train.optim.march_beta.rate
The constant rate.
train.optim.march_mode
MARCH variance mode. "diff" uses update differences and "var" uses score variance along the batch axis.
train.optim.eps
Small numerical constant for stability.
train.optim.mixed_precision
Whether to use mixed precision for Gram factorization.
train.optim.score_chunk_size
Chunk size for score computation.
train.optim.score_norm_clip
Optional clip value for the mean absolute score per batch row.
train.optim.gram_num_chunks
Number of chunks for Gram matrix computation.
train.optim.gram_dot_prec
Precision mode for Gram matrix dot products.
train.optim.prune_inactive
Whether to structurally prune inactive parameter leaves when forming the SR system.
Adam options#
train.optim.learning_rate
A global scaling factor, either fixed or evolving along iterations with a scheduler, see optax.scale_by_learning_rate(). Swappable component; the nested keys below are the options for the current module Standard and change when train.optim.learning_rate.module changes.
train.optim.learning_rate.module
Select the implementation used for this component.
train.optim.learning_rate.rate
Initial learning rate.
train.optim.learning_rate.delay
Delay in steps before decay starts.
train.optim.learning_rate.decay
Decay rate exponent.
train.optim.b1
Exponential decay rate to track the first moment of past gradients.
train.optim.b2
Exponential decay rate to track the second moment of past gradients.
train.optim.eps
A small constant applied to denominator outside of the square root (as in the Adam paper) to avoid dividing by zero when rescaling.
train.optim.eps_root
A small constant applied to denominator inside the square root (as in RMSProp), to avoid dividing by zero when rescaling.
LAMB options#
train.optim.learning_rate
A global scaling factor, either fixed or evolving along iterations with a scheduler, see optax.scale_by_learning_rate(). Swappable component; the nested keys below are the options for the current module Standard and change when train.optim.learning_rate.module changes.
train.optim.learning_rate.module
Select the implementation used for this component.
train.optim.learning_rate.rate
Initial learning rate.
train.optim.learning_rate.delay
Delay in steps before decay starts.
train.optim.learning_rate.decay
Decay rate exponent.
train.optim.b1
Exponential decay rate to track the first moment of past gradients.
train.optim.b2
Exponential decay rate to track the second moment of past gradients.
train.optim.eps
A small constant applied to denominator outside of the square root (as in the Adam paper) to avoid dividing by zero when rescaling.
train.optim.eps_root
A small constant applied to denominator inside the square root (as in RMSProp), to avoid dividing by zero when rescaling.
train.optim.weight_decay
Strength of the weight decay regularization. Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.weight_decay.module changes.
train.optim.weight_decay.module
Select the implementation used for this component.
train.optim.weight_decay.rate
The constant rate.
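The rate, delay, and decay keys that appear under learning_rate for every optimizer suggest an inverse-time decay schedule. The formula below is an assumption borrowed from a form common in neural-network VMC codes, not a statement of what the Standard module computes; the function name and sample values are hypothetical.

```python
def standard_learning_rate(step, rate, delay, decay):
    """Assumed schedule: rate * (1 / (1 + step / delay)) ** decay."""
    return rate * (1.0 / (1.0 + step / delay)) ** decay


# Illustrative values only; these are not jaqmc's defaults.
for step in (0, 10_000, 100_000):
    print(step, standard_learning_rate(step, rate=0.05, delay=10_000, decay=1.0))
```

Under this assumed form the rate is essentially flat while step is much smaller than delay, and it falls off as a power law with exponent decay afterwards. A Constant module, by contrast, would return its rate at every step.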
Sampler (train.sampler.*)#
Default sampler module: mcmc. Its effective keys are listed below.
train.sampler.steps
Number of Metropolis-Hastings updates per sample draw.
train.sampler.initial_width
Initial width (stddev) of the Gaussian proposal.
train.sampler.adapt_frequency
Frequency of adaptive width updates.
train.sampler.pmove_range
Target range for acceptance rate.
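How initial_width, adapt_frequency, and pmove_range might interact can be sketched as below. The update rule and the factor of 1.1 are assumptions chosen for illustration, and adapt_width is a hypothetical name; the mcmc sampler's actual rule is not described on this page.

```python
def adapt_width(width, mean_pmove, pmove_range, factor=1.1):
    """Nudge the proposal width so the acceptance rate moves toward pmove_range.

    Low acceptance means proposals are too bold, so the width shrinks.
    High acceptance means proposals are too timid, so the width grows.
    """
    low, high = pmove_range
    if mean_pmove < low:
        return width / factor
    if mean_pmove > high:
        return width * factor
    return width


# Illustrative loop: adapt once every `adapt_frequency` iterations.
width = 0.1
adapt_frequency = 100
acceptance_history = [0.30] * 100 + [0.52] * 100  # hypothetical acceptance rates
for start in range(0, len(acceptance_history), adapt_frequency):
    window = acceptance_history[start:start + adapt_frequency]
    width = adapt_width(width, sum(window) / len(window), pmove_range=(0.5, 0.55))
print(width)
```

In this run the first window has low acceptance, so the width shrinks once; the second window falls inside the target range and leaves it unchanged.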
Writers (train.writers.*)#
The train stage enables console, csv, and hdf5 writers by default.
Console writer (train.writers.console.*)#
train.writers.console.interval
Step interval for logging.
train.writers.console.fields
Comma-separated list of field specs.
CSV writer (train.writers.csv.*)#
train.writers.csv.path_template
Output path template.
HDF5 writer (train.writers.hdf5.*)#
train.writers.hdf5.path_template
Output path template.
Loss gradients#
The workflow wires LossAndGrad
automatically. When angular momentum penalties are enabled
(system.lz_penalty or system.l2_penalty), the loss key is set to
penalized_loss; otherwise it defaults to total_energy. There is no
user-facing train.grads.* schema for hall workflows.
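The loss key selection described above amounts to a single condition. The sketch below uses a hypothetical function name and treats a penalty as enabled when its strength is nonzero.

```python
def select_loss_key(lz_penalty, l2_penalty):
    """Pick the stat key that the gradient is taken with respect to."""
    penalties_enabled = lz_penalty != 0 or l2_penalty != 0
    return "penalized_loss" if penalties_enabled else "total_energy"


print(select_loss_key(lz_penalty=0.0, l2_penalty=0.0))
print(select_loss_key(lz_penalty=0.1, l2_penalty=0.0))
```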
Estimators (estimators.*)#
Energy estimators are configured programmatically by the workflow and are not typically overridden via config. The same definitions are used by Evaluation. For physics and derivations, see How Estimators Work. For the API, see Estimators.
TotalEnergy automatically sums all energy:-prefixed components. When
system.lz_penalty or system.l2_penalty is nonzero, a PenalizedLoss
estimator is added automatically. Neither estimator is configurable via a config key.
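The prefix-based summation can be sketched as follows. The component names energy:kinetic and energy:potential and the helper name are hypothetical; only the energy: prefix convention comes from the description above.

```python
def total_energy(components):
    """Sum every entry whose key starts with the 'energy:' prefix."""
    return sum(
        value for key, value in components.items() if key.startswith("energy:")
    )


# Hypothetical per-step statistics; non-energy entries are ignored.
step_stats = {"energy:kinetic": 1.5, "energy:potential": -4.0, "pmove": 0.52}
print(total_energy(step_stats))
```

A consequence of this convention is that a new energy term contributes to the total as soon as it is reported under an energy:-prefixed key, with no change to the summation itself.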
Kinetic energy (estimators.energy.kinetic.*)#
Kinetic energy estimator on the Haldane sphere using the covariant Laplacian. See Kinetic energy for physics details and Laplacian mode trade-offs.
estimators.energy.kinetic.vmap_chunk_size
Number of walkers to evaluate per vmap chunk.
estimators.energy.kinetic.mode
Laplacian computation mode. scan and fori_loop use a Hessian-based approach; forward_laplacian uses the forward Laplacian.
estimators.energy.kinetic.monopole_strength
Half the magnetic flux (\(Q = \text{flux}/2\)).
estimators.energy.kinetic.radius
Sphere radius.
Coulomb potential (estimators.energy.potential.*)#
Coulomb repulsion on the Haldane sphere.
estimators.energy.potential.vmap_chunk_size
Number of walkers to evaluate per vmap chunk.
estimators.energy.potential.interaction_type
Interaction potential form.
estimators.energy.potential.monopole_strength
\(Q = \text{flux}/2\).
estimators.energy.potential.radius
Sphere radius.
estimators.energy.potential.interaction_strength
Overall scaling factor.