calculator_model_test

This model is a fine-tuned version of an unnamed base model on an unnamed dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7719
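
Assuming the reported loss is a mean token-level cross-entropy (the usual case for Transformers language-model heads, though this card does not say so), the corresponding perplexity follows directly:

```python
import math

eval_loss = 0.7719  # final validation loss reported above

# Perplexity is exp(cross-entropy); this is only meaningful if the
# loss really is a mean token-level cross-entropy (an assumption here).
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.4f}")  # ~2.16
```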

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: fused AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
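
With a linear scheduler, the learning rate decays from 0.001 to 0 over the 560 optimizer steps shown in the results table (40 epochs × 14 steps/epoch). A minimal sketch of that schedule, assuming zero warmup steps (warmup is not listed in this card):

```python
LEARNING_RATE = 0.001
TOTAL_STEPS = 40 * 14  # 40 epochs x 14 steps/epoch = 560 (from the results table)

def linear_lr(step: int) -> float:
    """Linearly decayed learning rate, assuming no warmup phase."""
    return LEARNING_RATE * max(0.0, 1.0 - step / TOTAL_STEPS)

print(linear_lr(0))    # 0.001 at the first step
print(linear_lr(280))  # 0.0005 halfway through (end of epoch 20)
print(linear_lr(560))  # 0.0 at the final step
```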

Training results

Training Loss   Epoch   Step   Validation Loss
3.1715           1.0      14   2.5138
2.4606           2.0      28   2.4244
2.4161           3.0      42   2.4069
2.3712           4.0      56   2.3128
2.3084           5.0      70   2.3046
2.3103           6.0      84   2.3062
2.3059           7.0      98   2.3058
2.3035           8.0     112   2.3015
2.2891           9.0     126   2.2507
2.1651          10.0     140   2.0143
1.8401          11.0     154   1.4818
1.2616          12.0     168   1.0189
0.9410          13.0     182   0.8415
0.8222          14.0     196   0.7897
0.7887          15.0     210   0.7789
0.7803          16.0     224   0.7753
0.7768          17.0     238   0.7738
0.7749          18.0     252   0.7726
0.7736          19.0     266   0.7723
0.7727          20.0     280   0.7719
0.7720          21.0     294   0.7714
0.7714          22.0     308   0.7713
0.7709          23.0     322   0.7710
0.7705          24.0     336   0.7709
0.7701          25.0     350   0.7710
0.7697          26.0     364   0.7708
0.7692          27.0     378   0.7709
0.7689          28.0     392   0.7710
0.7684          29.0     406   0.7711
0.7682          30.0     420   0.7710
0.7677          31.0     434   0.7712
0.7676          32.0     448   0.7712
0.7673          33.0     462   0.7717
0.7670          34.0     476   0.7715
0.7666          35.0     490   0.7716
0.7664          36.0     504   0.7716
0.7661          37.0     518   0.7718
0.7658          38.0     532   0.7719
0.7657          39.0     546   0.7719
0.7656          40.0     560   0.7719
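
The validation loss bottoms out at epoch 26 (0.7708) and drifts up slightly afterwards, so an early-stopping or load-best-checkpoint policy (not indicated in this card) would have selected an earlier epoch than the final one. A quick scan of the table confirms this:

```python
# Per-epoch validation losses, copied from the results table (epochs 1-40).
val_losses = [
    2.5138, 2.4244, 2.4069, 2.3128, 2.3046, 2.3062, 2.3058, 2.3015,
    2.2507, 2.0143, 1.4818, 1.0189, 0.8415, 0.7897, 0.7789, 0.7753,
    0.7738, 0.7726, 0.7723, 0.7719, 0.7714, 0.7713, 0.7710, 0.7709,
    0.7710, 0.7708, 0.7709, 0.7710, 0.7711, 0.7710, 0.7712, 0.7712,
    0.7717, 0.7715, 0.7716, 0.7716, 0.7718, 0.7719, 0.7719, 0.7719,
]

# Epochs are 1-indexed; find the first epoch with the lowest loss.
best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__) + 1
print(best_epoch, val_losses[best_epoch - 1])  # 26 0.7708
```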

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model size

7.82M params (Safetensors, F32)