calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6688

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
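The linear schedule above, combined with the 240 total steps visible in the results table below (6 steps per epoch × 40 epochs), implies a learning rate that decays from 0.001 to 0 over training. A minimal sketch, assuming the usual Transformers convention of no warmup and decay to zero (the `linear_lr` helper is illustrative, not from the card):

```python
# Hypothetical sketch (not from the card): the schedule implied by
# lr_scheduler_type=linear with zero warmup, decaying to 0 over all steps.
STEPS_PER_EPOCH = 6      # from the results table: 6 steps per epoch
NUM_EPOCHS = 40
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 240, the table's final step
BASE_LR = 1e-3           # learning_rate above

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps under linear decay."""
    return BASE_LR * max(0.0, 1.0 - step / TOTAL_STEPS)

print(linear_lr(0))    # 0.001 at the start of training
print(linear_lr(120))  # halfway through: 0.0005
```

With a batch size of 512 and 6 steps per epoch, the training set holds at most about 3,072 examples, which is consistent with a small arithmetic ("calculator") toy task.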

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 3.4045        | 1.0   | 6    | 2.7587          |
| 2.3917        | 2.0   | 12   | 1.9900          |
| 1.8734        | 3.0   | 18   | 1.6958          |
| 1.6324        | 4.0   | 24   | 1.6081          |
| 1.5676        | 5.0   | 30   | 1.5619          |
| 1.5436        | 6.0   | 36   | 1.6197          |
| 1.5139        | 7.0   | 42   | 1.4991          |
| 1.4614        | 8.0   | 48   | 1.4779          |
| 1.4407        | 9.0   | 54   | 1.4234          |
| 1.3644        | 10.0  | 60   | 1.3460          |
| 1.3096        | 11.0  | 66   | 1.3823          |
| 1.2634        | 12.0  | 72   | 1.2711          |
| 1.1912        | 13.0  | 78   | 1.2382          |
| 1.1856        | 14.0  | 84   | 1.1337          |
| 1.1019        | 15.0  | 90   | 1.2100          |
| 1.1441        | 16.0  | 96   | 1.1382          |
| 1.0611        | 17.0  | 102  | 1.0282          |
| 0.9967        | 18.0  | 108  | 0.9920          |
| 0.9765        | 19.0  | 114  | 0.9946          |
| 0.9517        | 20.0  | 120  | 0.9478          |
| 0.9374        | 21.0  | 126  | 0.9441          |
| 0.8931        | 22.0  | 132  | 0.9748          |
| 0.8756        | 23.0  | 138  | 0.8511          |
| 0.8523        | 24.0  | 144  | 0.8759          |
| 0.8757        | 25.0  | 150  | 0.8253          |
| 0.8209        | 26.0  | 156  | 0.8182          |
| 0.8190        | 27.0  | 162  | 0.7820          |
| 0.7795        | 28.0  | 168  | 0.7740          |
| 0.8097        | 29.0  | 174  | 0.7571          |
| 0.7626        | 30.0  | 180  | 0.7584          |
| 0.7491        | 31.0  | 186  | 0.7444          |
| 0.7320        | 32.0  | 192  | 0.7177          |
| 0.7235        | 33.0  | 198  | 0.7124          |
| 0.7145        | 34.0  | 204  | 0.7032          |
| 0.7085        | 35.0  | 210  | 0.6888          |
| 0.7138        | 36.0  | 216  | 0.6866          |
| 0.6910        | 37.0  | 222  | 0.6789          |
| 0.6801        | 38.0  | 228  | 0.6731          |
| 0.6819        | 39.0  | 234  | 0.6715          |
| 0.6750        | 40.0  | 240  | 0.6688          |
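If the reported validation loss is mean per-token cross-entropy in nats (the usual convention for Transformers language-model training, though the card does not state it), it converts to perplexity via exp(loss). A sketch under that assumption:

```python
import math

# Assumption (not stated in the card): validation loss is mean
# cross-entropy in nats, so perplexity = exp(loss).
first_val_loss = 2.7587   # epoch 1 from the results table
final_val_loss = 0.6688   # epoch 40 from the results table

print(round(math.exp(first_val_loss), 2))  # ≈ 15.78 at epoch 1
print(round(math.exp(final_val_loss), 2))  # ≈ 1.95 at epoch 40
```

Under this reading, perplexity falls from roughly 15.8 to below 2 over the 40 epochs, and the steadily decreasing validation loss suggests the model had not yet overfit when training stopped.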

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Safetensors

  • Model size: 7.8M params
  • Tensor type: F32