Training#

Configuration reference for jaqmc hall train. This page shows the effective defaults for the train workflow preset. Use --dry-run to see the resolved config for your run, or add workflow.config.verbose=true to include field descriptions. Keys use the same dot notation as CLI overrides, such as train.run.iterations=5000. Defaults are resolved in this order: schema defaults, workflow preset, YAML config, then CLI overrides. For evaluation config, see Evaluation.
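The resolution order described above can be pictured as a series of dictionary merges followed by dot-notation overrides. The sketch below is illustrative only: deep_merge, apply_override, and the sample values in each layer are hypothetical and are not taken from the jaqmc source.

```python
from copy import deepcopy


def deep_merge(base: dict, update: dict) -> dict:
    """Return a copy of `base` with `update` merged in recursively."""
    merged = deepcopy(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def apply_override(config: dict, override: str) -> dict:
    """Apply one `a.b.c=value` override; integer-looking values become ints."""
    dotted_key, raw_value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    result = deepcopy(config)
    node = result
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = int(raw_value) if raw_value.lstrip("-").isdigit() else raw_value
    return result


# Layers from lowest to highest precedence (all values are illustrative).
schema_defaults = {"train": {"run": {"iterations": 100, "burn_in": 100}}}
workflow_preset = {"train": {"run": {"iterations": 200000}}}
yaml_config = {"train": {"run": {"burn_in": 500}}}
cli_overrides = ["train.run.iterations=5000"]

resolved = deep_merge(deep_merge(schema_defaults, workflow_preset), yaml_config)
for item in cli_overrides:
    resolved = apply_override(resolved, item)

print(resolved)
```

In this toy setup the CLI override wins for iterations, while burn_in keeps the value from the YAML layer because nothing later in the chain touches it.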

Root-level runtime keys such as logging.*, jax.*, and distributed.* are shared by all commands. See Runtime Configuration.

Workflow (`workflow.*`)#

These keys control workflow-level settings shared across all stages.

workflow.seed

Default: None · Type: int | None

Fixed random seed.

workflow.batch_size

Default: 4096 · Type: int

Number of walkers (samples) to use in each iteration.

workflow.save_path

Default: '' · Type: str

Path to save checkpoints and logs.

workflow.restore_path

Default: '' · Type: str

Path to restore checkpoints from.

workflow.config.ignore_extra

Default: False · Type: bool

If True, silently ignore unrecognized config keys.

workflow.config.verbose

Default: False · Type: bool

If True, print the fully resolved config with field descriptions at startup.

System (`system.*`)#

Defines the quantum Hall system on the Haldane sphere.

See Quantum Hall for physics background and usage examples.

system.flux

Default: 2 · Type: int

Magnetic flux \(2Q\) (positive integer).

system.nspins

Default: (3, 0) · Type: tuple[int, int]

(n_up, n_down) electron counts.

system.radius

Default: None · Type: float | None

Sphere radius.

system.interaction_type

Default: coulomb · Type: InteractionType

Interaction potential form.

system.interaction_strength

Default: 1.0 · Type: float

Scaling factor for the potential energy.

system.lz_center

Default: 0.0 · Type: float

Target \(L_z\) for the penalty method.

system.lz_penalty

Default: 0.0 · Type: float

Penalty strength for \((L_z - L_{z,0})^2\).

system.l2_penalty

Default: 0.0 · Type: float

Penalty strength for \(L^2\).
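One plausible reading of the three penalty keys is an energy augmented by a quadratic \(L_z\) term and a linear \(L^2\) term. The sketch below assumes exactly that form; the function name is hypothetical, and whether the real workflow applies the penalty per walker or to batch averages is not stated on this page.

```python
def penalized_loss(
    energy: float,
    lz: float,
    l2: float,
    lz_center: float = 0.0,
    lz_penalty: float = 0.0,
    l2_penalty: float = 0.0,
) -> float:
    """Energy plus assumed penalty terms for L_z and L^2."""
    return energy + lz_penalty * (lz - lz_center) ** 2 + l2_penalty * l2


# With both penalties at their default of 0.0 the loss is just the energy.
print(penalized_loss(energy=-1.5, lz=0.3, l2=2.0))

# Nonzero penalties push the optimizer toward L_z = lz_center and small L^2.
print(penalized_loss(energy=-1.5, lz=0.5, l2=2.0, lz_penalty=2.0, l2_penalty=0.25))
```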

Wavefunction (`wf.*`)#

Selects and configures the neural network ansatz.

Default module selection: mhpo. Effective defaults for the built-in architectures are listed below. Built-in choices are mhpo, laughlin, and free.

See Quantum Hall for background on each architecture.

MHPO options (`wf.*`)#

wf.ndets

Default: 1 · Type: int

Number of determinants.

wf.num_heads

Default: 4 · Type: int

Number of attention heads.

wf.heads_dim

Default: 64 · Type: int

Dimension of each attention head.

wf.num_layers

Default: 2 · Type: int

Number of Psiformer layers.

wf.flux_per_elec

Default: 0 · Type: int

Flux quanta attached per electron for composite fermions.

Laughlin options (`wf.*`)#

wf.cf_flux

Default: 1 · Type: int

Composite fermion flux attachment parameter \(p\).

Free options (`wf.*`)#

Train Stage (`train.*`)#

The VMC optimization loop. Samples electron configurations on the Haldane sphere, computes energy (and optional angular momentum penalties), and updates wavefunction parameters.

Run options (`train.run.*`)#

train.run.check_vma

Default: True · Type: bool

Enable JAX validity checks during shard_map.

train.run.iterations

Default: 200000 · Type: int

Total number of iterations to run.

train.run.burn_in

Default: 100 · Type: int

Sampling iterations to discard before the main loop for MCMC equilibration.

train.run.save_time_interval

Default: 600 · Type: int

Minimum wall-clock seconds between checkpoint saves.

train.run.save_step_interval

Default: 1000 · Type: int

Save checkpoints only at steps that are multiples of this value.
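Read together, the two save intervals suggest that a checkpoint is written only when both conditions hold. The sketch below encodes that interpretation as an assumption; should_save is a hypothetical helper, and any special handling of the final step is not covered.

```python
def should_save(
    step: int,
    now: float,
    last_save_time: float,
    save_time_interval: int = 600,
    save_step_interval: int = 1000,
) -> bool:
    """True when the step is a multiple of the step interval AND enough time has passed."""
    on_step_boundary = step % save_step_interval == 0
    enough_time_elapsed = (now - last_save_time) >= save_time_interval
    return on_step_boundary and enough_time_elapsed


# Step 2000 is a multiple of 1000 and 700 s have elapsed: save.
print(should_save(step=2000, now=700.0, last_save_time=0.0))
# Same step, but only 300 s since the last save: skip.
print(should_save(step=2000, now=300.0, last_save_time=0.0))
```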

train.run.stop_on_nan

Default: 'loss' · Type: bool | str

Abort training when NaN is detected in step statistics. True checks all stat keys and False disables the check; alternatively, pass a comma-separated string of specific keys to monitor (e.g. "loss").
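The three accepted forms of train.run.stop_on_nan can be illustrated with a small helper. This is a sketch of the described semantics, not the library's code; keys_to_check, has_nan, and the sample stat names are hypothetical.

```python
from __future__ import annotations

import math


def keys_to_check(stop_on_nan: bool | str, stats: dict[str, float]) -> list[str]:
    """Map the option value to the list of stat keys that should be monitored."""
    if stop_on_nan is True:
        return list(stats)
    if stop_on_nan is False:
        return []
    # Otherwise treat the value as a comma-separated list of key names.
    return [key.strip() for key in stop_on_nan.split(",") if key.strip()]


def has_nan(stop_on_nan: bool | str, stats: dict[str, float]) -> bool:
    """True if any monitored key that is present in `stats` holds NaN."""
    monitored = keys_to_check(stop_on_nan, stats)
    return any(math.isnan(stats[key]) for key in monitored if key in stats)


stats = {"loss": 1.0, "pmove": float("nan")}
print(has_nan("loss", stats))   # only "loss" is monitored
print(has_nan(True, stats))     # every key is monitored
print(has_nan(False, stats))    # the check is disabled
```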

Optimizer (`train.optim.*`)#

Default optimizer module: kfac. Effective defaults for the built-in optimizers are listed below.

KFAC options#

train.optim.learning_rate

Default: Standard · Type: swappable

The learning rate. Swappable component; the nested keys below are the options for the current module Standard and change when train.optim.learning_rate.module changes.

train.optim.learning_rate.module

Default: Standard · Type: module path

Select the implementation used for this component.

train.optim.learning_rate.rate

Default: 0.05 · Type: float

Initial learning rate.

train.optim.learning_rate.delay

Default: 2000 · Type: float

Delay in steps before decay starts.

train.optim.learning_rate.decay

Default: 1 · Type: float

Decay rate exponent.
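This page does not give the formula that combines rate, delay, and decay. The sketch below assumes the inverse-power schedule used in FermiNet-style codes, rate / (1 + t / delay)^decay; treat both the formula and the function name as assumptions rather than the documented behavior of the Standard module.

```python
def standard_learning_rate(
    step: int,
    rate: float = 0.05,
    delay: float = 2000,
    decay: float = 1,
) -> float:
    """Assumed schedule: rate * (1 / (1 + step / delay)) ** decay."""
    return rate * (1.0 / (1.0 + step / delay)) ** decay


for step in (0, 2000, 20000):
    print(step, standard_learning_rate(step))
```

Under this assumed form the learning rate halves once step equals delay (for decay = 1), so delay acts as a time scale for the decay rather than a hard cutoff.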

train.optim.norm_constraint

Default: 0.001 · Type: float

The update is scaled down so that its approximate squared Fisher norm \(v^T F v\) is at most the specified value.
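The norm constraint amounts to rescaling the update whenever \(v^T F v\) exceeds the limit. The sketch below uses an explicit dense matrix as a stand-in for the Fisher approximation, which is an assumption made for illustration; a real KFAC implementation works with Kronecker-factored blocks and may also fold the learning rate into the constraint.

```python
import math


def constrain_update(
    update: list[float],
    fisher: list[list[float]],
    norm_constraint: float = 0.001,
) -> list[float]:
    """Scale `update` down so that v^T F v <= norm_constraint."""
    sq_norm = fisher_sq_norm(update, fisher)
    if sq_norm <= norm_constraint:
        return list(update)
    scale = math.sqrt(norm_constraint / sq_norm)
    return [scale * v_i for v_i in update]


def fisher_sq_norm(update: list[float], fisher: list[list[float]]) -> float:
    """Compute v^T F v for a dense matrix F."""
    fv = [sum(f_ij * v_j for f_ij, v_j in zip(row, update)) for row in fisher]
    return sum(v_i * fv_i for v_i, fv_i in zip(update, fv))


identity = [[1.0, 0.0], [0.0, 1.0]]
clipped = constrain_update([1.0, 0.0], identity)
print(fisher_sq_norm(clipped, identity))
```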

train.optim.curvature_ema

Default: 0.95 · Type: float

Decay factor used when calculating the covariance estimate moving averages.

train.optim.l2_reg

Default: 0.0 · Type: float

The L2 regularization coefficient in use, passed so that the optimizer can account for it.

train.optim.inverse_update_period

Default: 1 · Type: int

Number of steps in between updating the inverse curvature approximation.

train.optim.damping

Default: 0.001 · Type: float

Fixed damping parameter.

SR options#

train.optim.learning_rate

Default: Standard · Type: swappable

Step size (scalar or schedule). Swappable component; the nested keys below are the options for the current module Standard and change when train.optim.learning_rate.module changes.

train.optim.learning_rate.module

Default: Standard · Type: module path

Select the implementation used for this component.

train.optim.learning_rate.rate

Default: 0.05 · Type: float

Initial learning rate.

train.optim.learning_rate.delay

Default: 2000 · Type: float

Delay in steps before decay starts.

train.optim.learning_rate.decay

Default: 1 · Type: float

Decay rate exponent.

train.optim.max_norm

Default: Constant · Type: swappable

Constrained update norm C (scalar or schedule). Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.max_norm.module changes.

train.optim.max_norm.module

Default: Constant · Type: module path

Select the implementation used for this component.

train.optim.max_norm.rate

Default: 0.05 · Type: float

The constant rate.

train.optim.damping

Default: Constant · Type: swappable

Damping lambda (scalar or schedule). Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.damping.module changes.

train.optim.damping.module

Default: Constant · Type: module path

Select the implementation used for this component.

train.optim.damping.rate

Default: 0.05 · Type: float

The constant rate.

train.optim.max_cond_num

Default: 10000000.0 · Type: float | None

Maximum condition number for adaptive damping.

train.optim.spring_mu

Default: Constant · Type: swappable

SPRING momentum coefficient mu (scalar or schedule). Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.spring_mu.module changes.

train.optim.spring_mu.module

Default: Constant · Type: module path

Select the implementation used for this component.

train.optim.spring_mu.rate

Default: 0.05 · Type: float

The constant rate.

train.optim.march_beta

Default: Constant · Type: swappable

Decay factor for the MARCH variance accumulator (scalar or schedule). Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.march_beta.module changes.

train.optim.march_beta.module

Default: Constant · Type: module path

Select the implementation used for this component.

train.optim.march_beta.rate

Default: 0.05 · Type: float

The constant rate.

train.optim.march_mode

Default: 'var' · Type: Literal['var', 'diff']

MARCH variance mode. "diff" uses update differences and "var" uses score variance along the batch axis.

train.optim.eps

Default: 1e-08 · Type: float

Small numerical constant for stability.

train.optim.mixed_precision

Default: True · Type: bool

Whether to use mixed precision for Gram factorization.

train.optim.score_chunk_size

Default: 128 · Type: int | None

Chunk size for score computation.

train.optim.score_norm_clip

Default: None · Type: float | None

Optional clip value for the mean absolute score per batch row.

train.optim.gram_num_chunks

Default: 4 · Type: int | None

Number of chunks for Gram matrix computation.

train.optim.gram_dot_prec

Default: 'F64' · Type: str | None

Precision mode for Gram matrix dot products.

train.optim.prune_inactive

Default: False · Type: bool

Whether to structurally prune inactive parameter leaves when forming the SR system.

Adam options#

train.optim.learning_rate

Default: Standard · Type: swappable

A global scaling factor, either fixed or evolving over iterations with a scheduler; see optax.scale_by_learning_rate(). Swappable component; the nested keys below are the options for the current module Standard and change when train.optim.learning_rate.module changes.

train.optim.learning_rate.module

Default: Standard · Type: module path

Select the implementation used for this component.

train.optim.learning_rate.rate

Default: 0.05 · Type: float

Initial learning rate.

train.optim.learning_rate.delay

Default: 2000 · Type: float

Delay in steps before decay starts.

train.optim.learning_rate.decay

Default: 1 · Type: float

Decay rate exponent.

train.optim.b1

Default: 0.9 · Type: float

Exponential decay rate to track the first moment of past gradients.

train.optim.b2

Default: 0.999 · Type: float

Exponential decay rate to track the second moment of past gradients.

train.optim.eps

Default: 1e-08 · Type: float

A small constant applied to the denominator outside of the square root (as in the Adam paper) to avoid dividing by zero when rescaling.

train.optim.eps_root

Default: 0.0 · Type: float

A small constant applied to the denominator inside the square root (as in RMSProp) to avoid dividing by zero when rescaling.
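The difference between eps and eps_root is easiest to see in the update formula. The scalar sketch below follows the bias-corrected Adam update as implemented in optax (eps outside the square root, eps_root inside); the function name is hypothetical, and a fixed learning rate replaces the schedule for brevity.

```python
import math


def adam_step(
    grad: float,
    m: float,
    v: float,
    step: int,
    learning_rate: float = 0.05,
    b1: float = 0.9,
    b2: float = 0.999,
    eps: float = 1e-8,
    eps_root: float = 0.0,
) -> tuple[float, float, float]:
    """One scalar Adam step; `step` counts from 1. Returns (update, m, v)."""
    m = b1 * m + (1 - b1) * grad          # first-moment moving average
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment moving average
    m_hat = m / (1 - b1 ** step)          # bias correction
    v_hat = v / (1 - b2 ** step)
    update = -learning_rate * m_hat / (math.sqrt(v_hat + eps_root) + eps)
    return update, m, v


update, m, v = adam_step(grad=2.0, m=0.0, v=0.0, step=1)
print(update)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the update magnitude is approximately the learning rate.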

LAMB options#

train.optim.learning_rate

Default: Standard · Type: swappable

A global scaling factor, either fixed or evolving over iterations with a scheduler; see optax.scale_by_learning_rate(). Swappable component; the nested keys below are the options for the current module Standard and change when train.optim.learning_rate.module changes.

train.optim.learning_rate.module

Default: Standard · Type: module path

Select the implementation used for this component.

train.optim.learning_rate.rate

Default: 0.05 · Type: float

Initial learning rate.

train.optim.learning_rate.delay

Default: 2000 · Type: float

Delay in steps before decay starts.

train.optim.learning_rate.decay

Default: 1 · Type: float

Decay rate exponent.

train.optim.b1

Default: 0.9 · Type: float

Exponential decay rate to track the first moment of past gradients.

train.optim.b2

Default: 0.999 · Type: float

Exponential decay rate to track the second moment of past gradients.

train.optim.eps

Default: 1e-06 · Type: float

A small constant applied to the denominator outside of the square root (as in the Adam paper) to avoid dividing by zero when rescaling.

train.optim.eps_root

Default: 0.0 · Type: float

A small constant applied to the denominator inside the square root (as in RMSProp) to avoid dividing by zero when rescaling.

train.optim.weight_decay

Default: Constant · Type: swappable

Strength of the weight decay regularization. Swappable component; the nested keys below are the options for the current module Constant and change when train.optim.weight_decay.module changes.

train.optim.weight_decay.module

Default: Constant · Type: module path

Select the implementation used for this component.

train.optim.weight_decay.rate

Default: 0.05 · Type: float

The constant rate.

Sampler (`train.sampler.*`)#

Default sampler module: mcmc. Its effective defaults are listed below.

train.sampler.steps

Default: 10 · Type: int

Number of Metropolis-Hastings updates per sample draw.

train.sampler.initial_width

Default: 0.1 · Type: float

Initial width (stddev) of the Gaussian proposal.

train.sampler.adapt_frequency

Default: 100 · Type: int

Frequency of adaptive width updates.

train.sampler.pmove_range

Default: (0.5, 0.55) · Type: tuple[float, float]

Target range for acceptance rate.
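The sampler keys describe a proposal width that is adjusted to keep the acceptance rate inside pmove_range. The sketch below assumes a FermiNet-style rule that is applied once every adapt_frequency steps; the multiplicative factor of 1.1 and the function name are assumptions, not values documented here.

```python
def adapt_width(
    width: float,
    mean_pmove: float,
    pmove_range: tuple[float, float] = (0.5, 0.55),
    factor: float = 1.1,
) -> float:
    """Widen the proposal if acceptance is too high, narrow it if too low."""
    lower, upper = pmove_range
    if mean_pmove > upper:
        return width * factor   # moves accepted too often: take bigger steps
    if mean_pmove < lower:
        return width / factor   # moves rejected too often: take smaller steps
    return width                # acceptance within the target range


width = 0.1  # train.sampler.initial_width
for mean_pmove in (0.9, 0.9, 0.52, 0.2):
    width = adapt_width(width, mean_pmove)
    print(mean_pmove, width)
```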

Writers (`train.writers.*`)#

The train stage enables console, csv, and hdf5 writers by default.

Console writer (`train.writers.console.*`)#

train.writers.console.interval

Default: 1 · Type: int

Step interval for logging.

train.writers.console.fields

Default: 'pmove:.2f,energy=total_energy_real:.4f,variance=total_energy_var:.4f,Lz=angular_momentum_z:+.4f,L_square=angular_momentum_square:.4f' · Type: str

Comma-separated list of field specs.
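Judging from the default value, each field spec appears to have the shape [label=]key[:format], where format is a Python format spec. The parser below is a sketch under that assumption; the function names and the rendered line layout are hypothetical.

```python
def parse_field_specs(fields: str) -> list[tuple[str, str, str]]:
    """Split a spec string into (label, stat_key, format_spec) triples."""
    specs = []
    for item in fields.split(","):
        name, _, fmt = item.strip().partition(":")
        label, sep, key = name.partition("=")
        if not sep:
            key = label  # no explicit label: the stat key doubles as the label
        specs.append((label, key, fmt))
    return specs


def format_line(fields: str, stats: dict) -> str:
    """Render one console line from a spec string and a stats mapping."""
    parts = [
        f"{label}={format(stats[key], fmt)}"
        for label, key, fmt in parse_field_specs(fields)
    ]
    return ", ".join(parts)


spec = "pmove:.2f,energy=total_energy_real:.4f,Lz=angular_momentum_z:+.4f"
stats = {"pmove": 0.5234, "total_energy_real": -1.23456, "angular_momentum_z": 0.0}
print(parse_field_specs(spec))
print(format_line(spec, stats))
```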

CSV writer (`train.writers.csv.*`)#

train.writers.csv.path_template

Default: '{stage}_stats.csv' · Type: str

Output path template.

HDF5 writer (`train.writers.hdf5.*`)#

train.writers.hdf5.path_template

Default: '{stage}_stats.h5' · Type: str

Output path template.

Loss gradients#

The workflow wires LossAndGrad automatically. When angular momentum penalties are enabled (system.lz_penalty or system.l2_penalty), the loss key is set to penalized_loss; otherwise it defaults to total_energy. There is no user-facing train.grads.* schema for hall workflows.
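The loss-key selection described above reduces to a single conditional. The helper below is a hypothetical sketch of that rule and assumes that "enabled" means a nonzero penalty strength.

```python
def select_loss_key(lz_penalty: float = 0.0, l2_penalty: float = 0.0) -> str:
    """Pick the stat key that the gradient is taken with respect to."""
    if lz_penalty != 0.0 or l2_penalty != 0.0:
        return "penalized_loss"
    return "total_energy"


print(select_loss_key())                 # defaults: no penalties
print(select_loss_key(lz_penalty=0.1))   # any nonzero penalty switches the key
```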

Estimators (`estimators.*`)#

Energy estimators are configured programmatically by the workflow and are not typically overridden via config. The same definitions are used by Evaluation. For physics and derivations, see How Estimators Work. For the API, see Estimators.

TotalEnergy automatically sums all energy:-prefixed components. When system.lz_penalty or system.l2_penalty is nonzero, a PenalizedLoss estimator is added automatically. Neither estimator is configurable via a config key.
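Summing the energy:-prefixed components can be sketched as a filter over a stats mapping. The key names below (energy:kinetic, energy:potential) and the helper itself are illustrative assumptions; only the prefix convention comes from this page.

```python
from __future__ import annotations


def total_energy(stats: dict[str, float], prefix: str = "energy:") -> float:
    """Sum every entry whose key starts with the energy prefix."""
    return sum(value for key, value in stats.items() if key.startswith(prefix))


stats = {"energy:kinetic": 1.5, "energy:potential": -2.0, "pmove": 0.52}
print(total_energy(stats))
```

Entries without the prefix, such as pmove here, are left out of the sum.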

Kinetic energy (`estimators.energy.kinetic.*`)#

Kinetic energy estimator on the Haldane sphere using the covariant Laplacian. See Kinetic energy for physics details and Laplacian mode trade-offs.

estimators.energy.kinetic.vmap_chunk_size

Default: None · Type: int | None

Number of walkers to evaluate per vmap chunk.

estimators.energy.kinetic.mode

Default: scan · Type: LaplacianMode

Laplacian computation mode. scan and fori_loop use a Hessian-based approach; forward_laplacian uses the forward Laplacian.

estimators.energy.kinetic.monopole_strength

Default: 1.0 · Type: float

Half the magnetic flux (\(Q = \text{flux}/2\)).

estimators.energy.kinetic.radius

Default: None · Type: float | None

Sphere radius.

Coulomb potential (`estimators.energy.potential.*`)#

Coulomb repulsion on the Haldane sphere.

estimators.energy.potential.vmap_chunk_size

Default: None · Type: int | None

Number of walkers to evaluate per vmap chunk.

estimators.energy.potential.interaction_type

Default: coulomb · Type: InteractionType

Interaction potential form.

estimators.energy.potential.monopole_strength

Default: 1.0 · Type: float

Half the magnetic flux (\(Q = \text{flux}/2\)).

estimators.energy.potential.radius

Default: 1.0 · Type: float

Sphere radius.

estimators.energy.potential.interaction_strength

Default: 1.0 · Type: float

Overall scaling factor.
