Command line reference#

Entry point: python -m src.tabstruct.experiment.run_experiment

Core arguments#

  • --pipeline: prediction | generation

  • --model: see the Models section

  • --task: classification | regression (prediction pipeline); the generation pipeline infers the task from the dataset

  • --dataset: dataset name (as supported by tabcamel)

  • --test_size, --valid_size: float in (0, 1] (fraction of the data) or an integer count of samples

  • --split_mode: stratified | random (regression supports random only)

  • --seed: int

  • --device: cpu | cuda
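
A minimal prediction run combining the core arguments might look like the sketch below; the model and dataset names are placeholders, so substitute any model from the Models section and any dataset supported by tabcamel:

    python -m src.tabstruct.experiment.run_experiment \
        --pipeline prediction \
        --model <model-name> \
        --task classification \
        --dataset <dataset-name> \
        --test_size 0.2 \
        --valid_size 0.1 \
        --split_mode stratified \
        --seed 42 \
        --device cuda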

Data curation#

  • --curate_mode: sharing

  • --curate_ratio: float, number of curated samples per real sample

  • --generator, --generator_tags: reference a past generation run logged in W&B

  • --synthetic_data_path: explicit path to synthetic data
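
As a sketch, a prediction run trained with curated data that references a previously logged generation run in W&B (the model, dataset, generator, and tag values are placeholders, and the curate ratio is illustrative):

    python -m src.tabstruct.experiment.run_experiment \
        --pipeline prediction \
        --model <model-name> \
        --task classification \
        --dataset <dataset-name> \
        --curate_mode sharing \
        --curate_ratio 1.0 \
        --generator <generator-name> \
        --generator_tags <wandb-tag>

If the synthetic data is stored locally instead, --synthetic_data_path can point the run at it in place of the --generator/--generator_tags pair.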

Lightning training#

  • --max_steps_tentative, --batch_size_tentative, --full_batch_training

  • --optimizer [adam|adamw|sgd], --gradient_clip_val

  • --lr_scheduler [none|plateau|cosine_warm_restart|linear|lambda]

  • --metric_model_selection, --patience_early_stopping

  • --log_every_n_steps_tentative, --check_val_every_n_epoch_tentative
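
The training-loop flags combine with the core arguments; a sketch with illustrative values (model, dataset, and metric names are placeholders):

    python -m src.tabstruct.experiment.run_experiment \
        --pipeline prediction \
        --model <model-name> \
        --task classification \
        --dataset <dataset-name> \
        --max_steps_tentative 10000 \
        --batch_size_tentative 256 \
        --optimizer adamw \
        --gradient_clip_val 1.0 \
        --lr_scheduler cosine_warm_restart \
        --metric_model_selection <metric-name> \
        --patience_early_stopping 20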

Evaluation toggles#

  • --eval_only, --disable_eval_density, --disable_eval_privacy, --enable_eval_structure
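
Assuming these toggles are plain boolean switches, an eval-only generation run that skips the privacy evaluation might look like the sketch below (the model, dataset, and path values are placeholders; see also the Notes section):

    python -m src.tabstruct.experiment.run_experiment \
        --pipeline generation \
        --model <model-name> \
        --dataset <dataset-name> \
        --eval_only \
        --disable_eval_privacy \
        --synthetic_data_path <path/to/synthetic-data>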

Tuning#

  • --enable_optuna, --optuna_trial, --disable_optuna_pruning, --tune_reduction, --tune_max_workers
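
A tuning sketch, assuming --enable_optuna is a boolean switch and --optuna_trial takes the number of trials (the counts shown are illustrative):

    python -m src.tabstruct.experiment.run_experiment \
        --pipeline prediction \
        --model <model-name> \
        --task classification \
        --dataset <dataset-name> \
        --enable_optuna \
        --optuna_trial 50 \
        --tune_max_workers 4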

W&B#

  • --tags, --wandb_log_model, --disable_wandb, --checkpoint_tags
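
A sketch of the logging flags (tag values are placeholders, and --wandb_log_model is assumed to be a boolean switch):

    python -m src.tabstruct.experiment.run_experiment \
        --pipeline prediction \
        --model <model-name> \
        --dataset <dataset-name> \
        --tags <experiment-tag> \
        --wandb_log_model

For runs without W&B logging, e.g. local debugging, pass --disable_wandb instead.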

Notes#

  • For eval-only runs of the generation pipeline, either provide --synthetic_data_path or ensure a matching generation run is retrievable via --generator_tags.

  • Regression tasks require --split_mode random.