calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6688

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
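The linear schedule above, combined with the 240 total steps visible in the results table below (6 steps per epoch × 40 epochs), implies a learning rate that decays from 0.001 to 0 over training. A minimal sketch, assuming the usual Transformers convention of no warmup and decay to zero (the `linear_lr` helper is illustrative, not from the card):

```python
# Hypothetical sketch (not from the card): the schedule implied by
# lr_scheduler_type=linear with zero warmup, decaying to 0 over all steps.
STEPS_PER_EPOCH = 6      # from the results table: 6 steps per epoch
NUM_EPOCHS = 40
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 240, the table's final step
BASE_LR = 1e-3           # learning_rate above

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps under linear decay."""
    return BASE_LR * max(0.0, 1.0 - step / TOTAL_STEPS)

print(linear_lr(0))    # 0.001 at the start of training
print(linear_lr(120))  # halfway through: 0.0005
```

With a batch size of 512 and 6 steps per epoch, the training set holds at most about 3,072 examples, which is consistent with a small arithmetic ("calculator") toy task.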

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 3.4045        | 1.0   | 6    | 2.7587          |
| 2.3917        | 2.0   | 12   | 1.9900          |
| 1.8734        | 3.0   | 18   | 1.6958          |
| 1.6324        | 4.0   | 24   | 1.6081          |
| 1.5676        | 5.0   | 30   | 1.5619          |
| 1.5436        | 6.0   | 36   | 1.6197          |
| 1.5139        | 7.0   | 42   | 1.4991          |
| 1.4614        | 8.0   | 48   | 1.4779          |
| 1.4407        | 9.0   | 54   | 1.4234          |
| 1.3644        | 10.0  | 60   | 1.3460          |
| 1.3096        | 11.0  | 66   | 1.3823          |
| 1.2634        | 12.0  | 72   | 1.2711          |
| 1.1912        | 13.0  | 78   | 1.2382          |
| 1.1856        | 14.0  | 84   | 1.1337          |
| 1.1019        | 15.0  | 90   | 1.2100          |
| 1.1441        | 16.0  | 96   | 1.1382          |
| 1.0611        | 17.0  | 102  | 1.0282          |
| 0.9967        | 18.0  | 108  | 0.9920          |
| 0.9765        | 19.0  | 114  | 0.9946          |
| 0.9517        | 20.0  | 120  | 0.9478          |
| 0.9374        | 21.0  | 126  | 0.9441          |
| 0.8931        | 22.0  | 132  | 0.9748          |
| 0.8756        | 23.0  | 138  | 0.8511          |
| 0.8523        | 24.0  | 144  | 0.8759          |
| 0.8757        | 25.0  | 150  | 0.8253          |
| 0.8209        | 26.0  | 156  | 0.8182          |
| 0.8190        | 27.0  | 162  | 0.7820          |
| 0.7795        | 28.0  | 168  | 0.7740          |
| 0.8097        | 29.0  | 174  | 0.7571          |
| 0.7626        | 30.0  | 180  | 0.7584          |
| 0.7491        | 31.0  | 186  | 0.7444          |
| 0.7320        | 32.0  | 192  | 0.7177          |
| 0.7235        | 33.0  | 198  | 0.7124          |
| 0.7145        | 34.0  | 204  | 0.7032          |
| 0.7085        | 35.0  | 210  | 0.6888          |
| 0.7138        | 36.0  | 216  | 0.6866          |
| 0.6910        | 37.0  | 222  | 0.6789          |
| 0.6801        | 38.0  | 228  | 0.6731          |
| 0.6819        | 39.0  | 234  | 0.6715          |
| 0.6750        | 40.0  | 240  | 0.6688          |
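If the reported validation loss is mean per-token cross-entropy in nats (the usual convention for Transformers language-model training, though the card does not state it), it converts to perplexity via exp(loss). A sketch under that assumption:

```python
import math

# Assumption (not stated in the card): validation loss is mean
# cross-entropy in nats, so perplexity = exp(loss).
first_val_loss = 2.7587   # epoch 1 from the results table
final_val_loss = 0.6688   # epoch 40 from the results table

print(round(math.exp(first_val_loss), 2))  # ≈ 15.78 at epoch 1
print(round(math.exp(final_val_loss), 2))  # ≈ 1.95 at epoch 40
```

Under this reading, perplexity falls from roughly 15.8 to below 2 over the 40 epochs, and the steadily decreasing validation loss suggests the model had not yet overfit when training stopped.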

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Safetensors

  • Model size: 7.8M params
  • Tensor type: F32