calculator_model_test

This model is a fine-tuned version of an unnamed base model on an unnamed dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7719
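
Assuming the reported loss is a mean token-level cross-entropy (the usual case for Transformers language-model heads, though this card does not say so), the corresponding perplexity follows directly:

```python
import math

eval_loss = 0.7719  # final validation loss reported above

# Perplexity is exp(cross-entropy); this is only meaningful if the
# loss really is a mean token-level cross-entropy (an assumption here).
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.4f}")  # ~2.16
```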

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: fused AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
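
With a linear scheduler, the learning rate decays from 0.001 to 0 over the 560 optimizer steps shown in the results table (40 epochs × 14 steps/epoch). A minimal sketch of that schedule, assuming zero warmup steps (warmup is not listed in this card):

```python
LEARNING_RATE = 0.001
TOTAL_STEPS = 40 * 14  # 40 epochs x 14 steps/epoch = 560 (from the results table)

def linear_lr(step: int) -> float:
    """Linearly decayed learning rate, assuming no warmup phase."""
    return LEARNING_RATE * max(0.0, 1.0 - step / TOTAL_STEPS)

print(linear_lr(0))    # 0.001 at the first step
print(linear_lr(280))  # 0.0005 halfway through (end of epoch 20)
print(linear_lr(560))  # 0.0 at the final step
```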

Training results

Training Loss   Epoch   Step   Validation Loss
3.1715           1.0      14   2.5138
2.4606           2.0      28   2.4244
2.4161           3.0      42   2.4069
2.3712           4.0      56   2.3128
2.3084           5.0      70   2.3046
2.3103           6.0      84   2.3062
2.3059           7.0      98   2.3058
2.3035           8.0     112   2.3015
2.2891           9.0     126   2.2507
2.1651          10.0     140   2.0143
1.8401          11.0     154   1.4818
1.2616          12.0     168   1.0189
0.9410          13.0     182   0.8415
0.8222          14.0     196   0.7897
0.7887          15.0     210   0.7789
0.7803          16.0     224   0.7753
0.7768          17.0     238   0.7738
0.7749          18.0     252   0.7726
0.7736          19.0     266   0.7723
0.7727          20.0     280   0.7719
0.7720          21.0     294   0.7714
0.7714          22.0     308   0.7713
0.7709          23.0     322   0.7710
0.7705          24.0     336   0.7709
0.7701          25.0     350   0.7710
0.7697          26.0     364   0.7708
0.7692          27.0     378   0.7709
0.7689          28.0     392   0.7710
0.7684          29.0     406   0.7711
0.7682          30.0     420   0.7710
0.7677          31.0     434   0.7712
0.7676          32.0     448   0.7712
0.7673          33.0     462   0.7717
0.7670          34.0     476   0.7715
0.7666          35.0     490   0.7716
0.7664          36.0     504   0.7716
0.7661          37.0     518   0.7718
0.7658          38.0     532   0.7719
0.7657          39.0     546   0.7719
0.7656          40.0     560   0.7719
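
The validation loss bottoms out at epoch 26 (0.7708) and drifts up slightly afterwards, so an early-stopping or load-best-checkpoint policy (not indicated in this card) would have selected an earlier epoch than the final one. A quick scan of the table confirms this:

```python
# Per-epoch validation losses, copied from the results table (epochs 1-40).
val_losses = [
    2.5138, 2.4244, 2.4069, 2.3128, 2.3046, 2.3062, 2.3058, 2.3015,
    2.2507, 2.0143, 1.4818, 1.0189, 0.8415, 0.7897, 0.7789, 0.7753,
    0.7738, 0.7726, 0.7723, 0.7719, 0.7714, 0.7713, 0.7710, 0.7709,
    0.7710, 0.7708, 0.7709, 0.7710, 0.7711, 0.7710, 0.7712, 0.7712,
    0.7717, 0.7715, 0.7716, 0.7716, 0.7718, 0.7719, 0.7719, 0.7719,
]

# Epochs are 1-indexed; find the first epoch with the lowest loss.
best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__) + 1
print(best_epoch, val_losses[best_epoch - 1])  # 26 0.7708
```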

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model size

7.82M params (Safetensors, F32)