
Re-run tinygrad examples with HIP

main
Jeff Moe 2024-02-06 14:16:19 -07:00
parent b90b70840d
commit 8ab214b0b5
26 changed files with 1224 additions and 1237 deletions

File diff suppressed because one or more lines are too long



@@ -1,12 +1,12 @@
NUM:2 BS:8 CNT:10
0%| | 0/10 [00:00<?, ?it/s] 10%|█ | 1/10 [00:01<00:10, 1.13s/it] 20%|██ | 2/10 [00:01<00:04, 1.83it/s] 30%|███ | 3/10 [00:01<00:02, 2.65it/s] 40%|████ | 4/10 [00:01<00:01, 3.56it/s] 50%|█████ | 5/10 [00:01<00:01, 4.41it/s] 60%|██████ | 6/10 [00:01<00:00, 5.15it/s] 70%|███████ | 7/10 [00:02<00:00, 5.34it/s] 80%|████████ | 8/10 [00:02<00:00, 5.89it/s] 90%|█████████ | 9/10 [00:02<00:00, 6.34it/s]
15.51 ms cpy, 1089.50 ms run, 61.56 ms build, 1025.81 ms realize, 2.13 ms CL, 0.01 loss, 421 tensors, 0.04 GB used, 10.57 GFLOPS
12.29 ms cpy, 101.05 ms run, 54.71 ms build, 45.90 ms realize, 0.44 ms CL, 0.00 loss, 421 tensors, 0.04 GB used, 114.01 GFLOPS
8.48 ms cpy, 148.56 ms run, 53.64 ms build, 94.47 ms realize, 0.45 ms CL, -0.02 loss, 421 tensors, 0.04 GB used, 77.55 GFLOPS
8.89 ms cpy, 101.05 ms run, 53.11 ms build, 47.45 ms realize, 0.49 ms CL, -0.03 loss, 421 tensors, 0.04 GB used, 114.01 GFLOPS
8.85 ms cpy, 100.58 ms run, 53.40 ms build, 46.75 ms realize, 0.43 ms CL, -0.08 loss, 421 tensors, 0.04 GB used, 114.54 GFLOPS
8.54 ms cpy, 100.98 ms run, 53.75 ms build, 46.78 ms realize, 0.44 ms CL, -0.01 loss, 421 tensors, 0.04 GB used, 114.09 GFLOPS
8.55 ms cpy, 143.57 ms run, 53.24 ms build, 89.89 ms realize, 0.44 ms CL, 0.05 loss, 421 tensors, 0.04 GB used, 80.24 GFLOPS
8.49 ms cpy, 101.42 ms run, 53.58 ms build, 47.41 ms realize, 0.43 ms CL, 0.02 loss, 421 tensors, 0.04 GB used, 113.59 GFLOPS
8.48 ms cpy, 101.23 ms run, 53.99 ms build, 46.82 ms realize, 0.42 ms CL, 0.18 loss, 421 tensors, 0.04 GB used, 113.80 GFLOPS
8.50 ms cpy, 101.92 ms run, 54.60 ms build, 46.83 ms realize, 0.49 ms CL, 0.02 loss, 421 tensors, 0.04 GB used, 113.03 GFLOPS
0%| | 0/10 [00:00<?, ?it/s] 100%|██████████| 10/10 [00:10<00:00, 1.09s/it]
175.27 ms cpy, 9470.44 ms run, 62.31 ms build, 9347.02 ms realize, 61.11 ms CL, 0.06 loss, 421 tensors, 0.04 GB used, 1.22 GFLOPS
12.54 ms cpy, 103.18 ms run, 55.31 ms build, 45.06 ms realize, 2.82 ms CL, -0.02 loss, 421 tensors, 0.04 GB used, 111.65 GFLOPS
11.01 ms cpy, 142.94 ms run, 53.91 ms build, 86.17 ms realize, 2.85 ms CL, 0.07 loss, 421 tensors, 0.04 GB used, 80.60 GFLOPS
11.05 ms cpy, 102.45 ms run, 53.68 ms build, 45.98 ms realize, 2.79 ms CL, 0.03 loss, 421 tensors, 0.04 GB used, 112.45 GFLOPS
11.07 ms cpy, 102.35 ms run, 53.75 ms build, 45.86 ms realize, 2.74 ms CL, 0.07 loss, 421 tensors, 0.04 GB used, 112.56 GFLOPS
11.14 ms cpy, 101.95 ms run, 53.89 ms build, 45.27 ms realize, 2.78 ms CL, -0.00 loss, 421 tensors, 0.04 GB used, 113.01 GFLOPS
11.14 ms cpy, 143.39 ms run, 54.09 ms build, 86.44 ms realize, 2.86 ms CL, 0.03 loss, 421 tensors, 0.04 GB used, 80.34 GFLOPS
11.97 ms cpy, 103.14 ms run, 54.24 ms build, 46.12 ms realize, 2.78 ms CL, -0.04 loss, 421 tensors, 0.04 GB used, 111.70 GFLOPS
11.29 ms cpy, 102.81 ms run, 54.46 ms build, 45.58 ms realize, 2.77 ms CL, 0.04 loss, 421 tensors, 0.04 GB used, 112.06 GFLOPS
11.15 ms cpy, 103.26 ms run, 54.59 ms build, 45.89 ms realize, 2.77 ms CL, -0.05 loss, 421 tensors, 0.04 GB used, 111.57 GFLOPS
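
The per-step lines above are easier to compare once the warm-up step is set aside, since the first step of each run is dominated by realize time. The following is a minimal sketch, not part of the commit: `parse_steps` and `steady_state_mean` are hypothetical helper names, and the sample text is copied from three of the log lines above.

```python
import re
from statistics import mean

# Matches the per-step timing lines printed in the log above.
STEP_RE = re.compile(
    r"(?P<cpy>[\d.]+) ms cpy, (?P<run>[\d.]+) ms run, "
    r"(?P<build>[\d.]+) ms build, (?P<realize>[\d.]+) ms realize, "
    r"(?P<cl>[\d.]+) ms CL, .*?(?P<gflops>[\d.]+) GFLOPS"
)

def parse_steps(log_text):
    """Return one dict of floats per timing line found in log_text."""
    steps = []
    for line in log_text.splitlines():
        m = STEP_RE.search(line)
        if m:
            steps.append({k: float(v) for k, v in m.groupdict().items()})
    return steps

def steady_state_mean(steps, field, warmup=1):
    """Average one field, skipping the first `warmup` steps."""
    return mean(s[field] for s in steps[warmup:])

# Three lines copied from the second run in the log above.
SAMPLE = """\
175.27 ms cpy, 9470.44 ms run, 62.31 ms build, 9347.02 ms realize, 61.11 ms CL, 0.06 loss, 421 tensors, 0.04 GB used, 1.22 GFLOPS
12.54 ms cpy, 103.18 ms run, 55.31 ms build, 45.06 ms realize, 2.82 ms CL, -0.02 loss, 421 tensors, 0.04 GB used, 111.65 GFLOPS
11.05 ms cpy, 102.45 ms run, 53.68 ms build, 45.98 ms realize, 2.79 ms CL, 0.03 loss, 421 tensors, 0.04 GB used, 112.45 GFLOPS
"""

steps = parse_steps(SAMPLE)
print(len(steps), steady_state_mean(steps, "run"))
```

Skipping one warm-up step is an assumption; a different `warmup` value may suit runs where compilation spans more than one step.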

File diff suppressed because one or more lines are too long


@@ -0,0 +1,8 @@
Traceback (most recent call last):
File "/home/jebba/devel/tinygrad/tinygrad/examples/compile_efficientnet.py", line 13, in <module>
prg, inp_sizes, out_sizes, state = export_model(model, mode, Tensor.randn(1,3,224,224))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jebba/devel/tinygrad/tinygrad/extra/export_model.py", line 313, in export_model
assert Device.DEFAULT in EXPORT_SUPPORTED_DEVICE, "only WEBGPU, WEBGL, CLANG, CUDA, GPU, METAL are supported"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: only WEBGPU, WEBGL, CLANG, CUDA, GPU, METAL are supported
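
The assertion above fires because the default device for this run is not in the export list. A minimal sketch of the same guard as a standalone check follows; `EXPORT_SUPPORTED` and `check_export_device` are hypothetical names for illustration, not tinygrad's API, and the device names are copied from the assertion message.

```python
# Device names copied from the assertion message in the traceback above.
EXPORT_SUPPORTED = ("WEBGPU", "WEBGL", "CLANG", "CUDA", "GPU", "METAL")

def check_export_device(device, supported=EXPORT_SUPPORTED):
    """Return None if `device` can be exported, else a readable message."""
    if device in supported:
        return None
    return f"{device} is not exportable; choose one of: {', '.join(supported)}"

print(check_export_device("HIP"))
```

Returning a message instead of asserting lets a wrapper script skip this example on unsupported backends rather than record a traceback.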


@@ -1,18 +1,18 @@
2024-02-06 11:09:02.997488: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-06 11:09:03.035957: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-06 11:09:03.036006: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-06 11:09:03.036946: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-02-06 11:09:03.042522: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-06 11:09:03.042694: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
2024-02-06 13:09:00.382257: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-06 13:09:00.420457: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-06 13:09:00.420503: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-06 13:09:00.421410: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-02-06 13:09:00.426991: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2024-02-06 13:09:00.427163: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-06 11:09:03.830560: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-02-06 11:09:05.325306: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2024-02-06 11:09:05.325453: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
2024-02-06 11:09:05.420786: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2024-02-06 11:09:05.421124: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
tinygrad: [0.270704448223114, 0.6882184743881226, 0.8074522614479065, 0.5307921767234802]
compiled: [0.270704, 0.688218, 0.807452, 0.530792]
keras: [0.2707044 0.6882185 0.8074523 0.5307921]
2024-02-06 13:09:01.185053: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-02-06 13:09:02.593911: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2024-02-06 13:09:02.594059: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
2024-02-06 13:09:02.691872: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2024-02-06 13:09:02.692025: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
tinygrad: [0.29635584354400635, 0.5070338845252991, 0.6352834105491638, 0.15874029695987701]
compiled: [0.296356, 0.507034, 0.635283, 0.15874]
keras: [0.29635587 0.5070339 0.6352834 0.15874033]
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
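
The tinygrad, compiled, and keras vectors printed in the log above agree to the precision shown. A small sketch of how that agreement could be checked numerically follows; the values are copied from the second run, while `all_close`, `max_abs_diff`, and the 1e-6 tolerance are assumptions chosen because the compiled output is printed with six significant digits.

```python
import math

# Values copied from the second run in the log above.
tinygrad = [0.29635584354400635, 0.5070338845252991, 0.6352834105491638, 0.15874029695987701]
compiled = [0.296356, 0.507034, 0.635283, 0.15874]
keras = [0.29635587, 0.5070339, 0.6352834, 0.15874033]

def all_close(a, b, abs_tol=1e-6):
    """True when both vectors have equal length and match element-wise."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=0.0, abs_tol=abs_tol) for x, y in zip(a, b)
    )

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))

print(all_close(tinygrad, compiled), all_close(tinygrad, keras))
print(max_abs_diff(tinygrad, compiled))
```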

File diff suppressed because one or more lines are too long


@@ -1,2 +1,2 @@
281 8.961814 tabby, tabby cat
did inference in 1105.86 ms
281 8.961816 tabby, tabby cat
did inference in 5905.02 ms

File diff suppressed because one or more lines are too long


@@ -1,61 +1,61 @@
optimizing for GPU
*** 2.00 ms : kernel 0 r_64_8_7_7_2_16_4_3_7_4_4_7 [49, 8, 64] [4, 16, 2] takes 2.00 ms, 7772 GFLOPS
*** 2.31 ms : kernel 1 r_2048_7_7_2_8_8_3_3 [7, 7, 2048] [8, 8, 2] takes 0.31 ms, 370 GFLOPS
*** 2.49 ms : kernel 2 r_64_2_49_8_16_16_4_4_4 [49, 2, 64] [16, 8] takes 0.18 ms, 9548 GFLOPS
*** 3.46 ms : kernel 3 r_64_2_7_7_8_8_2_64_4_4_3_3 [49, 2, 64] [2, 8, 8] takes 0.96 ms, 15461 GFLOPS
*** 4.09 ms : kernel 4 r_64_8_49_8_16_16_4_4_4 [49, 8, 64] [16, 8] takes 0.64 ms, 10300 GFLOPS
*** 4.79 ms : kernel 5 r_64_8_49_8_16_16_4_4_4n1 [49, 8, 64] [16, 8] takes 0.69 ms, 10432 GFLOPS
*** 5.40 ms : kernel 6 r_64_2_49_8_16_64_4_4_4 [49, 2, 64] [16, 8] takes 0.61 ms, 10976 GFLOPS
*** 6.36 ms : kernel 7 r_64_2_7_7_8_8_2_64_4_4_3_3n1 [49, 2, 64] [2, 8, 8] takes 0.96 ms, 15461 GFLOPS
*** 7.04 ms : kernel 8 r_64_8_49_8_16_16_4_4_4n2 [49, 8, 64] [16, 8] takes 0.68 ms, 10321 GFLOPS
*** 7.65 ms : kernel 9 r_64_2_49_8_16_64_4_4_4n1 [49, 2, 64] [16, 8] takes 0.61 ms, 10976 GFLOPS
*** 8.62 ms : kernel 10 r_64_2_7_7_8_8_2_64_4_4_3_3n2 [49, 2, 64] [2, 8, 8] takes 0.96 ms, 15461 GFLOPS
*** 9.30 ms : kernel 11 r_64_8_49_8_16_16_4_4_4n3 [49, 8, 64] [16, 8] takes 0.68 ms, 10321 GFLOPS
*** 10.15 ms : kernel 12 r_64_4_49_8_16_64_4_4_4 [49, 4, 64] [16, 8] takes 0.85 ms, 15668 GFLOPS
*** 11.66 ms : kernel 13 r_32_2_7_7_2_16_4_128_4_4_3_3 [49, 2, 32] [4, 16, 2] takes 1.51 ms, 9851 GFLOPS
*** 13.29 ms : kernel 14 r_32_8_7_7_2_16_4_64_4_4_4 [49, 8, 32] [4, 16, 2] takes 1.63 ms, 8048 GFLOPS
*** 14.12 ms : kernel 15 r_32_8_49_2_16_4_32_4_4_4 [49, 8, 32] [4, 16, 2] takes 0.83 ms, 8304 GFLOPS
*** 15.00 ms : kernel 16 r_32_2_49_2_16_4_128_4_4_4 [49, 2, 32] [4, 16, 2] takes 0.88 ms, 7567 GFLOPS
*** 16.28 ms : kernel 17 r_32_2_7_7_2_16_4_128_4_4_3_3n1 [49, 2, 32] [4, 16, 2] takes 1.28 ms, 11633 GFLOPS
*** 17.11 ms : kernel 18 r_32_8_49_2_16_4_32_4_4_4n1 [49, 8, 32] [4, 16, 2] takes 0.83 ms, 8156 GFLOPS
*** 17.99 ms : kernel 19 r_32_2_49_2_16_4_128_4_4_4n1 [49, 2, 32] [4, 16, 2] takes 0.88 ms, 7567 GFLOPS
*** 19.26 ms : kernel 20 r_32_2_7_7_2_16_4_128_4_4_3_3n2 [49, 2, 32] [4, 16, 2] takes 1.28 ms, 11633 GFLOPS
*** 20.10 ms : kernel 21 r_32_8_49_2_16_4_32_4_4_4n2 [49, 8, 32] [4, 16, 2] takes 0.83 ms, 8156 GFLOPS
*** 20.97 ms : kernel 22 r_32_2_49_2_16_4_128_4_4_4n2 [49, 2, 32] [4, 16, 2] takes 0.88 ms, 7567 GFLOPS
*** 22.25 ms : kernel 23 r_32_2_7_7_2_16_4_128_4_4_3_3n3 [49, 2, 32] [4, 16, 2] takes 1.28 ms, 11633 GFLOPS
*** 23.09 ms : kernel 24 r_32_8_49_2_16_4_32_4_4_4n3 [49, 8, 32] [4, 16, 2] takes 0.83 ms, 8156 GFLOPS
*** 24.42 ms : kernel 25 r_32_4_49_2_16_4_128_4_4_4 [49, 4, 32] [4, 16, 2] takes 1.33 ms, 9942 GFLOPS
*** 26.54 ms : kernel 26 r_16_4_7_7_16_2_2_256_4_4_3_3 [49, 4, 16] [2, 2, 16] takes 2.12 ms, 6986 GFLOPS
*** 29.37 ms : kernel 27 r_16_16_7_7_16_2_2_128_4_4_4 [49, 16, 16] [2, 2, 16] takes 2.83 ms, 4642 GFLOPS
*** 30.70 ms : kernel 28 r_8_16_49_8_16_64_4_4_4 [49, 16, 8] [16, 8] takes 1.32 ms, 5090 GFLOPS
*** 32.19 ms : kernel 29 r_8_4_49_8_16_256_4_4_4 [49, 4, 8] [16, 8] takes 1.49 ms, 4426 GFLOPS
*** 33.73 ms : kernel 30 r_16_4_7_7_16_2_2_256_4_4_3_3n1 [49, 4, 16] [2, 2, 16] takes 1.54 ms, 9654 GFLOPS
*** 35.10 ms : kernel 31 r_8_16_49_8_16_64_4_4_4n1 [49, 16, 8] [16, 8] takes 1.38 ms, 4860 GFLOPS
*** 36.59 ms : kernel 32 r_8_4_49_8_16_256_4_4_4n1 [49, 4, 8] [16, 8] takes 1.49 ms, 4426 GFLOPS
*** 38.13 ms : kernel 33 r_16_4_7_7_16_2_2_256_4_4_3_3n2 [49, 4, 16] [2, 2, 16] takes 1.54 ms, 9654 GFLOPS
*** 39.51 ms : kernel 34 r_8_16_49_8_16_64_4_4_4n2 [49, 16, 8] [16, 8] takes 1.38 ms, 4860 GFLOPS
*** 41.00 ms : kernel 35 r_8_4_49_8_16_256_4_4_4n2 [49, 4, 8] [16, 8] takes 1.49 ms, 4426 GFLOPS
*** 42.53 ms : kernel 36 r_16_4_7_7_16_2_2_256_4_4_3_3n3 [49, 4, 16] [2, 2, 16] takes 1.54 ms, 9654 GFLOPS
*** 43.91 ms : kernel 37 r_8_16_49_8_16_64_4_4_4n3 [49, 16, 8] [16, 8] takes 1.38 ms, 4860 GFLOPS
*** 45.40 ms : kernel 38 r_8_4_49_8_16_256_4_4_4n3 [49, 4, 8] [16, 8] takes 1.49 ms, 4426 GFLOPS
*** 46.94 ms : kernel 39 r_16_4_7_7_16_2_2_256_4_4_3_3n4 [49, 4, 16] [2, 2, 16] takes 1.54 ms, 9654 GFLOPS
*** 48.32 ms : kernel 40 r_8_16_49_8_16_64_4_4_4n4 [49, 16, 8] [16, 8] takes 1.38 ms, 4860 GFLOPS
*** 49.81 ms : kernel 41 r_8_4_49_8_16_256_4_4_4n4 [49, 4, 8] [16, 8] takes 1.49 ms, 4426 GFLOPS
*** 51.34 ms : kernel 42 r_16_4_7_7_16_2_2_256_4_4_3_3n5 [49, 4, 16] [2, 2, 16] takes 1.54 ms, 9654 GFLOPS
*** 52.72 ms : kernel 43 r_8_16_49_8_16_64_4_4_4n5 [49, 16, 8] [16, 8] takes 1.38 ms, 4860 GFLOPS
*** 55.75 ms : kernel 44 r_8_8_49_8_16_256_4_4_4 [49, 8, 8] [16, 8] takes 3.03 ms, 4363 GFLOPS
*** 57.27 ms : kernel 45 r_8_8_8_16_512_3_3_7_7_4 [8, 8] [16, 8] takes 1.52 ms, 9721 GFLOPS
*** 61.02 ms : kernel 46 r_2_32_7_7_8_16_256_4_4_4 [49, 32, 2] [16, 8] takes 3.75 ms, 3506 GFLOPS
*** 62.81 ms : kernel 47 r_2_32_49_8_16_128_4_4_4 [49, 32, 2] [16, 8] takes 1.78 ms, 3732 GFLOPS
*** 64.99 ms : kernel 48 r_2_8_49_8_16_512_4_4_4 [49, 8, 2] [16, 8] takes 2.18 ms, 3019 GFLOPS
*** 66.95 ms : kernel 49 r_8_8_8_16_512_3_3_7_7_4n1 [8, 8] [16, 8] takes 1.96 ms, 7570 GFLOPS
*** 68.78 ms : kernel 50 r_2_32_49_8_16_128_4_4_4n1 [49, 32, 2] [16, 8] takes 1.83 ms, 3619 GFLOPS
*** 70.96 ms : kernel 51 r_2_8_49_8_16_512_4_4_4n1 [49, 8, 2] [16, 8] takes 2.18 ms, 3019 GFLOPS
*** 72.92 ms : kernel 52 r_8_8_8_16_512_3_3_7_7_4n2 [8, 8] [16, 8] takes 1.96 ms, 7570 GFLOPS
*** 74.75 ms : kernel 53 r_2_32_49_8_16_128_4_4_4n2 [49, 32, 2] [16, 8] takes 1.83 ms, 3619 GFLOPS
*** 74.97 ms : kernel 54 r_1024_32_49_4 [1024] [32] takes 0.22 ms, 30 GFLOPS
*** 75.15 ms : kernel 55 r_125_16_2_512_4_4_4 [125] [2, 16] takes 0.17 ms, 1503 GFLOPS
*** 75.16 ms : kernel 56 r_2_32_250_4 [2] [32] takes 0.01 ms, 5 GFLOPS
*** 75.20 ms : kernel 57 r_2_32_250_4n1 [2] [32] takes 0.04 ms, 7 GFLOPS
*** 75.20 ms : kernel 58 E_2_125_32_2_4 [125, 2] [2, 32] takes 0.00 ms, 42 GFLOPS
******* total 75.20 ms, 7037 GFLOPS
optimizing for HIP
*** 2.25 ms : kernel 0 r_64_8_7_7_2_16_4_3_7_4_4_7 [49, 8, 64] [4, 16, 2] takes 2.25 ms, 6881 GFLOPS
*** 2.58 ms : kernel 1 r_2048_7_7_2_8_8_3_3 [7, 7, 2048] [8, 8, 2] takes 0.33 ms, 351 GFLOPS
*** 2.77 ms : kernel 2 r_64_2_49_8_16_16_4_4_4 [49, 2, 64] [16, 8] takes 0.19 ms, 9245 GFLOPS
*** 3.72 ms : kernel 3 r_64_2_7_7_8_8_2_64_4_4_3_3 [49, 2, 64] [2, 8, 8] takes 0.95 ms, 15765 GFLOPS
*** 4.38 ms : kernel 4 r_64_8_49_8_16_16_4_4_4 [49, 8, 64] [16, 8] takes 0.66 ms, 9984 GFLOPS
*** 5.09 ms : kernel 5 r_64_8_49_8_16_16_4_4_4n1 [49, 8, 64] [16, 8] takes 0.71 ms, 10228 GFLOPS
*** 5.66 ms : kernel 6 r_64_2_49_8_16_64_4_4_4 [49, 2, 64] [16, 8] takes 0.57 ms, 11626 GFLOPS
*** 6.60 ms : kernel 7 r_64_2_7_7_8_8_2_64_4_4_3_3n1 [49, 2, 64] [2, 8, 8] takes 0.95 ms, 15765 GFLOPS
*** 7.32 ms : kernel 8 r_64_8_49_8_16_16_4_4_4n2 [49, 8, 64] [16, 8] takes 0.71 ms, 9875 GFLOPS
*** 7.89 ms : kernel 9 r_64_2_49_8_16_64_4_4_4n1 [49, 2, 64] [16, 8] takes 0.57 ms, 11626 GFLOPS
*** 8.84 ms : kernel 10 r_64_2_7_7_8_8_2_64_4_4_3_3n2 [49, 2, 64] [2, 8, 8] takes 0.95 ms, 15765 GFLOPS
*** 9.55 ms : kernel 11 r_64_8_49_8_16_16_4_4_4n3 [49, 8, 64] [16, 8] takes 0.71 ms, 9875 GFLOPS
*** 10.38 ms : kernel 12 r_64_4_49_8_16_64_4_4_4 [49, 4, 64] [16, 8] takes 0.83 ms, 16069 GFLOPS
*** 11.95 ms : kernel 13 r_32_2_7_7_2_16_4_128_4_4_3_3 [49, 2, 32] [4, 16, 2] takes 1.57 ms, 9461 GFLOPS
*** 13.60 ms : kernel 14 r_32_8_7_7_2_16_4_64_4_4_4 [49, 8, 32] [4, 16, 2] takes 1.65 ms, 7984 GFLOPS
*** 14.37 ms : kernel 15 r_32_8_49_2_16_4_32_4_4_4 [49, 8, 32] [4, 16, 2] takes 0.77 ms, 8986 GFLOPS
*** 15.19 ms : kernel 16 r_32_2_49_2_16_4_128_4_4_4 [49, 2, 32] [4, 16, 2] takes 0.82 ms, 8049 GFLOPS
*** 16.47 ms : kernel 17 r_32_2_7_7_2_16_4_128_4_4_3_3n1 [49, 2, 32] [4, 16, 2] takes 1.28 ms, 11623 GFLOPS
*** 17.25 ms : kernel 18 r_32_8_49_2_16_4_32_4_4_4n1 [49, 8, 32] [4, 16, 2] takes 0.78 ms, 8731 GFLOPS
*** 18.07 ms : kernel 19 r_32_2_49_2_16_4_128_4_4_4n1 [49, 2, 32] [4, 16, 2] takes 0.82 ms, 8049 GFLOPS
*** 19.35 ms : kernel 20 r_32_2_7_7_2_16_4_128_4_4_3_3n2 [49, 2, 32] [4, 16, 2] takes 1.28 ms, 11623 GFLOPS
*** 20.13 ms : kernel 21 r_32_8_49_2_16_4_32_4_4_4n2 [49, 8, 32] [4, 16, 2] takes 0.78 ms, 8731 GFLOPS
*** 20.95 ms : kernel 22 r_32_2_49_2_16_4_128_4_4_4n2 [49, 2, 32] [4, 16, 2] takes 0.82 ms, 8049 GFLOPS
*** 22.23 ms : kernel 23 r_32_2_7_7_2_16_4_128_4_4_3_3n3 [49, 2, 32] [4, 16, 2] takes 1.28 ms, 11623 GFLOPS
*** 23.01 ms : kernel 24 r_32_8_49_2_16_4_32_4_4_4n3 [49, 8, 32] [4, 16, 2] takes 0.78 ms, 8731 GFLOPS
*** 24.26 ms : kernel 25 r_32_4_49_2_16_4_128_4_4_4 [49, 4, 32] [4, 16, 2] takes 1.25 ms, 10619 GFLOPS
*** 26.39 ms : kernel 26 r_16_4_7_7_16_2_2_256_4_4_3_3 [49, 4, 16] [2, 2, 16] takes 2.13 ms, 6957 GFLOPS
*** 29.24 ms : kernel 27 r_16_16_7_7_16_2_2_128_4_4_4 [49, 16, 16] [2, 2, 16] takes 2.85 ms, 4618 GFLOPS
*** 30.42 ms : kernel 28 r_8_16_49_8_16_64_4_4_4 [49, 16, 8] [16, 8] takes 1.18 ms, 5702 GFLOPS
*** 31.83 ms : kernel 29 r_8_4_49_8_16_256_4_4_4 [49, 4, 8] [16, 8] takes 1.41 ms, 4669 GFLOPS
*** 33.35 ms : kernel 30 r_16_4_7_7_16_2_2_256_4_4_3_3n1 [49, 4, 16] [2, 2, 16] takes 1.52 ms, 9776 GFLOPS
*** 34.55 ms : kernel 31 r_8_16_49_8_16_64_4_4_4n1 [49, 16, 8] [16, 8] takes 1.20 ms, 5569 GFLOPS
*** 35.97 ms : kernel 32 r_8_4_49_8_16_256_4_4_4n1 [49, 4, 8] [16, 8] takes 1.41 ms, 4669 GFLOPS
*** 37.48 ms : kernel 33 r_16_4_7_7_16_2_2_256_4_4_3_3n2 [49, 4, 16] [2, 2, 16] takes 1.52 ms, 9776 GFLOPS
*** 38.68 ms : kernel 34 r_8_16_49_8_16_64_4_4_4n2 [49, 16, 8] [16, 8] takes 1.20 ms, 5569 GFLOPS
*** 40.10 ms : kernel 35 r_8_4_49_8_16_256_4_4_4n2 [49, 4, 8] [16, 8] takes 1.41 ms, 4669 GFLOPS
*** 41.61 ms : kernel 36 r_16_4_7_7_16_2_2_256_4_4_3_3n3 [49, 4, 16] [2, 2, 16] takes 1.52 ms, 9776 GFLOPS
*** 42.82 ms : kernel 37 r_8_16_49_8_16_64_4_4_4n3 [49, 16, 8] [16, 8] takes 1.20 ms, 5569 GFLOPS
*** 44.23 ms : kernel 38 r_8_4_49_8_16_256_4_4_4n3 [49, 4, 8] [16, 8] takes 1.41 ms, 4669 GFLOPS
*** 45.75 ms : kernel 39 r_16_4_7_7_16_2_2_256_4_4_3_3n4 [49, 4, 16] [2, 2, 16] takes 1.52 ms, 9776 GFLOPS
*** 46.95 ms : kernel 40 r_8_16_49_8_16_64_4_4_4n4 [49, 16, 8] [16, 8] takes 1.20 ms, 5569 GFLOPS
*** 48.36 ms : kernel 41 r_8_4_49_8_16_256_4_4_4n4 [49, 4, 8] [16, 8] takes 1.41 ms, 4669 GFLOPS
*** 49.88 ms : kernel 42 r_16_4_7_7_16_2_2_256_4_4_3_3n5 [49, 4, 16] [2, 2, 16] takes 1.52 ms, 9776 GFLOPS
*** 51.08 ms : kernel 43 r_8_16_49_8_16_64_4_4_4n5 [49, 16, 8] [16, 8] takes 1.20 ms, 5569 GFLOPS
*** 53.78 ms : kernel 44 r_8_8_49_8_16_256_4_4_4 [49, 8, 8] [16, 8] takes 2.70 ms, 4896 GFLOPS
*** 100.26 ms : kernel 45 r_8_8_8_16_512_3_3_7_7_4 [8, 8] [16, 8] takes 46.48 ms, 319 GFLOPS
*** 104.56 ms : kernel 46 r_2_32_7_7_8_16_256_4_4_4 [49, 32, 2] [16, 8] takes 4.31 ms, 3055 GFLOPS
*** 106.54 ms : kernel 47 r_2_32_49_8_16_128_4_4_4 [49, 32, 2] [16, 8] takes 1.98 ms, 3367 GFLOPS
*** 108.96 ms : kernel 48 r_2_8_49_8_16_512_4_4_4 [49, 8, 2] [16, 8] takes 2.41 ms, 2731 GFLOPS
*** 126.40 ms : kernel 49 r_8_8_8_16_512_3_3_7_7_4n1 [8, 8] [16, 8] takes 17.44 ms, 849 GFLOPS
*** 128.35 ms : kernel 50 r_2_32_49_8_16_128_4_4_4n1 [49, 32, 2] [16, 8] takes 1.95 ms, 3401 GFLOPS
*** 130.76 ms : kernel 51 r_2_8_49_8_16_512_4_4_4n1 [49, 8, 2] [16, 8] takes 2.41 ms, 2731 GFLOPS
*** 148.20 ms : kernel 52 r_8_8_8_16_512_3_3_7_7_4n2 [8, 8] [16, 8] takes 17.44 ms, 849 GFLOPS
*** 150.15 ms : kernel 53 r_2_32_49_8_16_128_4_4_4n2 [49, 32, 2] [16, 8] takes 1.95 ms, 3401 GFLOPS
*** 150.24 ms : kernel 54 r_1024_32_49_4 [1024] [32] takes 0.08 ms, 79 GFLOPS
*** 150.43 ms : kernel 55 r_125_16_2_512_4_4_4 [125] [2, 16] takes 0.19 ms, 1382 GFLOPS
*** 150.45 ms : kernel 56 r_2_32_250_4 [2] [32] takes 0.03 ms, 2 GFLOPS
*** 150.53 ms : kernel 57 r_2_32_250_4n1 [2] [32] takes 0.07 ms, 3 GFLOPS
*** 150.54 ms : kernel 58 E_2_125_32_2_4 [125, 2] [2, 32] takes 0.01 ms, 9 GFLOPS
******* total 150.54 ms, 3515 GFLOPS
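
The total roughly doubles between the two listings above (75.20 ms to 150.54 ms), and most of the difference sits in a few kernels. The following is a minimal sketch for finding them; `parse_kernels` and `slowdowns` are hypothetical helper names, the 2x threshold is an assumption, and the sample lines are copied from kernels 44, 45, and 49 of each listing.

```python
import re

# Matches lines like "*** 57.27 ms : kernel 45 <name> [...] [...] takes 1.52 ms, 9721 GFLOPS".
KERNEL_RE = re.compile(
    r"kernel\s+(?P<idx>\d+)\s+(?P<name>\S+)\s.*?"
    r"takes\s+(?P<ms>[\d.]+) ms,\s+(?P<gflops>\d+) GFLOPS"
)

def parse_kernels(log_text):
    """Map kernel index -> (kernel name, time in ms)."""
    kernels = {}
    for line in log_text.splitlines():
        m = KERNEL_RE.search(line)
        if m:
            kernels[int(m["idx"])] = (m["name"], float(m["ms"]))
    return kernels

def slowdowns(old, new, threshold=2.0):
    """List kernels whose time grew by at least `threshold`x, worst first."""
    rows = []
    for idx, (name, old_ms) in old.items():
        # Skip kernels missing from the new run or reported as 0.00 ms.
        if idx not in new or old_ms == 0:
            continue
        new_ms = new[idx][1]
        ratio = new_ms / old_ms
        if ratio >= threshold:
            rows.append((idx, name, old_ms, new_ms, ratio))
    return sorted(rows, key=lambda r: r[4], reverse=True)

GPU_SAMPLE = """\
*** 55.75 ms : kernel 44 r_8_8_49_8_16_256_4_4_4 [49, 8, 8] [16, 8] takes 3.03 ms, 4363 GFLOPS
*** 57.27 ms : kernel 45 r_8_8_8_16_512_3_3_7_7_4 [8, 8] [16, 8] takes 1.52 ms, 9721 GFLOPS
*** 66.95 ms : kernel 49 r_8_8_8_16_512_3_3_7_7_4n1 [8, 8] [16, 8] takes 1.96 ms, 7570 GFLOPS
"""
HIP_SAMPLE = """\
*** 53.78 ms : kernel 44 r_8_8_49_8_16_256_4_4_4 [49, 8, 8] [16, 8] takes 2.70 ms, 4896 GFLOPS
*** 100.26 ms : kernel 45 r_8_8_8_16_512_3_3_7_7_4 [8, 8] [16, 8] takes 46.48 ms, 319 GFLOPS
*** 126.40 ms : kernel 49 r_8_8_8_16_512_3_3_7_7_4n1 [8, 8] [16, 8] takes 17.44 ms, 849 GFLOPS
"""

slow = slowdowns(parse_kernels(GPU_SAMPLE), parse_kernels(HIP_SAMPLE))
for idx, name, old_ms, new_ms, ratio in slow:
    print(f"kernel {idx} {name}: {old_ms} ms -> {new_ms} ms ({ratio:.1f}x)")
```

Kernels reported as 0.00 ms in the older listing are skipped to avoid dividing by zero; their timings are below the printed resolution in any case.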

File diff suppressed because it is too large


@@ -1,4 +1,4 @@
using GPU backend
using HIP backend
using LLaMA-7B model
Traceback (most recent call last):
File "/home/jebba/devel/tinygrad/tinygrad/examples/llama.py", line 386, in <module>

File diff suppressed because one or more lines are too long



@@ -1,44 +1,41 @@
0%| | 0/50 [00:00<?, ?it/s] loss 2.35 accuracy 0.05: 0%| | 0/50 [00:07<?, ?it/s] loss 2.35 accuracy 0.05: 2%|▏ | 1/50 [00:07<06:05, 7.47s/it] loss 2.26 accuracy 0.17: 2%|▏ | 1/50 [00:09<06:05, 7.47s/it] loss 2.26 accuracy 0.17: 4%|▍ | 2/50 [00:09<03:24, 4.26s/it] loss 2.19 accuracy 0.18: 4%|▍ | 2/50 [00:09<03:24, 4.26s/it] loss 2.21 accuracy 0.20: 4%|▍ | 2/50 [00:09<03:24, 4.26s/it] loss 2.15 accuracy 0.22: 4%|▍ | 2/50 [00:09<03:24, 4.26s/it] loss 2.09 accuracy 0.27: 4%|▍ | 2/50 [00:09<03:24, 4.26s/it] loss 2.02 accuracy 0.31: 4%|▍ | 2/50 [00:09<03:24, 4.26s/it] loss 0.48 accuracy 0.87: 54%|█████▍ | 27/50 [00:09<00:03, 5.97it/s] loss 0.49 accuracy 0.86: 54%|█████▍ | 27/50 [00:09<00:03, 5.97it/s] loss 0.47 accuracy 0.86: 80%|████████ | 40/50 [00:09<00:00, 10.55it/s]
0%| | 0/16 [00:00<?, ?it/s] 6%|▋ | 1/16 [00:01<00:26, 1.77s/it] 62%|██████▎ | 10/16 [00:01<00:00, 7.23it/s] 100%|██████████| 16/16 [00:03<00:00, 4.55it/s] 100%|██████████| 16/16 [00:03<00:00, 4.27it/s]
test set accuracy is 0.868250
0%| | 0/50 [00:00<?, ?it/s] loss 2.35 accuracy 0.07: 0%| | 0/50 [00:04<?, ?it/s] loss 0.49 accuracy 0.87: 60%|██████ | 30/50 [00:06<00:02, 9.73it/s] loss 0.45 accuracy 0.89: 60%|██████ | 30/50 [00:06<00:02, 9.73it/s] loss 0.42 accuracy 0.87: 60%|██████ | 30/50 [00:06<00:02, 9.73it/s] loss 0.39 accuracy 0.86: 88%|████████▊ | 44/50 [00:06<00:00, 16.68it/s] loss 0.35 accuracy 0.89: 88%|████████▊ | 44/50 [00:06<00:00, 16.68it/s] loss 0.38 accuracy 0.87: 88%|████████▊ | 44/50 [00:06<00:00, 16.68it/s] loss 0.38 accuracy 0.87: 100%|██████████| 50/50 [00:06<00:00, 7.55it/s]
0%| | 0/16 [00:00<?, ?it/s] 6%|▋ | 1/16 [00:01<00:17, 1.15s/it] 62%|██████▎ | 10/16 [00:01<00:00, 10.61it/s] 100%|██████████| 16/16 [00:02<00:00, 6.73it/s] 100%|██████████| 16/16 [00:02<00:00, 6.35it/s]
test set accuracy is 0.867750
reducing lr to 0.0025
0%| | 0/50 [00:00<?, ?it/s] loss 0.39 accuracy 0.87: 0%| | 0/50 [00:00<?, ?it/s] loss 0.38 accuracy 0.87: 0%| | 0/50 [00:00<?, ?it/s] loss 0.38 accuracy 0.87: 4%|▍ | 2/50 [00:00<00:05, 9.08it/s] loss 0.53 accuracy 0.83: 4%|▍ | 2/50 [00:00<00:05, 9.08it/s] loss 0.45 accuracy 0.85: 4%|▍ | 2/50 [00:00<00:05, 9.08it/s] loss 0.44 accuracy 0.85: 4%|▍ | 2/50 [00:00<00:05, 9.08it/s] loss 0.43 accuracy 0.85: 4%|▍ | 2/50 [00:00<00:05, 9.08it/s] loss 0.40 accuracy 0.88: 4%|▍ | 2/50 [00:00<00:05, 9.08it/s] loss 0.38 accuracy 0.86: 4%|▍ | 2/50 [00:00<00:05, 9.08it/s] loss 0.41 accuracy 0.86: 4%|▍ | 2/50 [00:00<00:05, 9.08it/s] loss 0.43 accuracy 0.85: 4%|▍ | 2/50 [00:00<00:05, 9.08it/s] loss 0.37 accuracy 0.87: 30%|███ | 15/50 [00:00<00:00, 56.16it/s] loss 0.36 accuracy 0.88: 30%|███ | 15/50 [00:00<00:00, 56.16it/s] loss 0.35 accuracy 0.87: 56%|█████▌ | 28/50 [00:00<00:00, 82.18it/s] loss 0.37 accuracy 0.86: 56%|█████▌ | 28/50 [00:00<00:00, 82.18it/s] loss 0.34 accuracy 0.88: 56%|█████▌ | 28/50 [00:00<00:00, 82.18it/s] loss 0.35 accuracy 0.88: 56%|█████▌ | 28/50 [00:00<00:00, 82.18it/s] loss 0.34 accuracy 0.89: 56%|█████▌ | 28/50 [00:00<00:00, 82.18it/s] loss 0.32 accuracy 0.88: 56%|█████▌ | 28/50 [00:00<00:00, 82.18it/s] loss 0.31 accuracy 0.89: 56%|█████▌ | 28/50 [00:00<00:00, 82.18it/s] loss 0.31 accuracy 0.89: 82%|████████▏ | 41/50 [00:00<00:00, 98.03it/s] loss 0.33 accuracy 0.88: 82%|████████▏ | 41/50 [00:00<00:00, 98.03it/s] loss 0.32 accuracy 0.90: 82%|████████▏ | 41/50 [00:00<00:00, 98.03it/s] loss 0.31 accuracy 0.90: 82%|████████▏ | 41/50 [00:00<00:00, 98.03it/s] loss 0.33 accuracy 0.89: 100%|██████████| 50/50 [00:00<00:00, 84.11it/s]
0%| | 0/16 [00:00<?, ?it/s] 62%|██████▎ | 10/16 [00:00<00:00, 91.08it/s] 100%|██████████| 16/16 [00:00<00:00, 91.47it/s]
test set accuracy is 0.893917
0%| | 0/50 [00:00<?, ?it/s] loss 0.36 accuracy 0.88: 0%| | 0/50 [00:00<?, ?it/s] loss 0.40 accuracy 0.87: 4%|▍ | 2/50 [00:00<00:04, 10.36it/s] loss 0.38 accuracy 0.87: 4%|▍ | 2/50 [00:00<00:04, 10.36it/s] loss 0.36 accuracy 0.88: 4%|▍ | 2/50 [00:00<00:04, 10.36it/s] loss 0.34 accuracy 0.88: 4%|▍ | 2/50 [00:00<00:04, 10.36it/s] loss 0.38 accuracy 0.85: 32%|███▏ | 16/50 [00:00<00:00, 64.82it/s] loss 0.33 accuracy 0.88: 32%|███▏ | 16/50 [00:00<00:00, 64.82it/s] loss 0.36 accuracy 0.86: 32%|███▏ | 16/50 [00:00<00:00, 64.82it/s] loss 0.34 accuracy 0.88: 32%|███▏ | 16/50 [00:00<00:00, 64.82it/s] loss 0.33 accuracy 0.89: 60%|██████ | 30/50 [00:00<00:00, 92.64it/s] loss 0.35 accuracy 0.87: 60%|██████ | 30/50 [00:00<00:00, 92.64it/s] loss 0.29 accuracy 0.90: 60%|██████ | 30/50 [00:00<00:00, 92.64it/s] loss 0.30 accuracy 0.89: 60%|██████ | 30/50 [00:00<00:00, 92.64it/s] loss 0.29 accuracy 0.90: 60%|██████ | 30/50 [00:00<00:00, 92.64it/s] loss 0.30 accuracy 0.89: 60%|██████ | 30/50 [00:00<00:00, 92.64it/s] loss 0.31 accuracy 0.88: 60%|██████ | 30/50 [00:00<00:00, 92.64it/s] loss 0.28 accuracy 0.90: 88%|████████▊ | 44/50 [00:00<00:00, 108.77it/s] loss 0.27 accuracy 0.90: 88%|████████▊ | 44/50 [00:00<00:00, 108.77it/s] loss 0.25 accuracy 0.91: 100%|██████████| 50/50 [00:00<00:00, 92.45it/s]
0%| | 0/16 [00:00<?, ?it/s] 38%|███▊ | 6/16 [00:00<00:00, 56.07it/s] 94%|█████████▍| 15/16 [00:00<00:00, 74.99it/s] 100%|██████████| 16/16 [00:00<00:00, 72.89it/s]
test set accuracy is 0.898583
reducing lr to 0.0021
0%| | 0/50 [00:00<?, ?it/s] loss 0.32 accuracy 0.89: 0%| | 0/50 [00:00<?, ?it/s] loss 0.38 accuracy 0.87: 0%| | 0/50 [00:00<?, ?it/s] loss 0.38 accuracy 0.87: 4%|▍ | 2/50 [00:00<00:04, 11.41it/s] loss 0.39 accuracy 0.86: 4%|▍ | 2/50 [00:00<00:04, 11.41it/s] loss 0.34 accuracy 0.89: 4%|▍ | 2/50 [00:00<00:04, 11.41it/s] loss 0.31 accuracy 0.89: 4%|▍ | 2/50 [00:00<00:04, 11.41it/s] loss 0.33 accuracy 0.89: 4%|▍ | 2/50 [00:00<00:04, 11.41it/s] loss 0.31 accuracy 0.89: 4%|▍ | 2/50 [00:00<00:04, 11.41it/s] loss 0.34 accuracy 0.89: 4%|▍ | 2/50 [00:00<00:04, 11.41it/s] loss 0.34 accuracy 0.88: 4%|▍ | 2/50 [00:00<00:04, 11.41it/s] loss 0.35 accuracy 0.88: 4%|▍ | 2/50 [00:00<00:04, 11.41it/s] loss 0.29 accuracy 0.90: 30%|███ | 15/50 [00:00<00:00, 64.02it/s] loss 0.26 accuracy 0.91: 30%|███ | 15/50 [00:00<00:00, 64.02it/s] loss 0.26 accuracy 0.92: 30%|███ | 15/50 [00:00<00:00, 64.02it/s] loss 0.23 accuracy 0.92: 30%|███ | 15/50 [00:00<00:00, 64.02it/s] loss 0.23 accuracy 0.92: 56%|█████▌ | 28/50 [00:00<00:00, 89.01it/s] loss 0.22 accuracy 0.92: 56%|█████▌ | 28/50 [00:00<00:00, 89.01it/s] loss 0.23 accuracy 0.92: 56%|█████▌ | 28/50 [00:00<00:00, 89.01it/s] loss 0.18 accuracy 0.93: 82%|████████▏ | 41/50 [00:00<00:00, 102.92it/s] loss 0.26 accuracy 0.93: 82%|████████▏ | 41/50 [00:00<00:00, 102.92it/s] loss 0.20 accuracy 0.93: 82%|████████▏ | 41/50 [00:00<00:00, 102.92it/s]
0%| | 0/16 [00:00<?, ?it/s] 38%|███▊ | 6/16 [00:00<00:00, 59.78it/s] 100%|██████████| 16/16 [00:00<00:00, 79.86it/s] 100%|██████████| 16/16 [00:00<00:00, 76.92it/s]
test set accuracy is 0.941333
0%| | 0/50 [00:00<?, ?it/s] loss 0.32 accuracy 0.87: 0%| | 0/50 [00:00<?, ?it/s] loss 0.28 accuracy 0.90: 4%|▍ | 2/50 [00:00<00:04, 10.42it/s] loss 0.30 accuracy 0.90: 4%|▍ | 2/50 [00:00<00:04, 10.42it/s] loss 0.30 accuracy 0.89: 4%|▍ | 2/50 [00:00<00:04, 10.42it/s] loss 0.25 accuracy 0.90: 32%|███▏ | 16/50 [00:00<00:00, 65.34it/s] loss 0.23 accuracy 0.91: 32%|███▏ | 16/50 [00:00<00:00, 65.34it/s] loss 0.24 accuracy 0.91: 32%|███▏ | 16/50 [00:00<00:00, 65.34it/s] loss 0.22 accuracy 0.93: 32%|███▏ | 16/50 [00:00<00:00, 65.34it/s] loss 0.24 accuracy 0.90: 32%|███▏ | 16/50 [00:00<00:00, 65.34it/s] loss 0.23 accuracy 0.91: 32%|███▏ | 16/50 [00:00<00:00, 65.34it/s] loss 0.16 accuracy 0.93: 62%|██████▏ | 31/50 [00:00<00:00, 95.09it/s] loss 0.22 accuracy 0.91: 62%|██████▏ | 31/50 [00:00<00:00, 95.09it/s] loss 0.21 accuracy 0.92: 62%|██████▏ | 31/50 [00:00<00:00, 95.09it/s] loss 0.19 accuracy 0.92: 92%|█████████▏| 46/50 [00:00<00:00, 111.79it/s] loss 0.21 accuracy 0.92: 92%|█████████▏| 46/50 [00:00<00:00, 111.79it/s] loss 0.22 accuracy 0.92: 92%|█████████▏| 46/50 [00:00<00:00, 111.79it/s]
0%| | 0/16 [00:00<?, ?it/s] 56%|█████▋ | 9/16 [00:00<00:00, 86.95it/s] 100%|██████████| 16/16 [00:00<00:00, 73.27it/s]
test set accuracy is 0.929500
reducing lr to 0.0017
0%| | 0/50 [00:00<?, ?it/s] loss 0.21 accuracy 0.92: 0%| | 0/50 [00:00<?, ?it/s] loss 0.19 accuracy 0.93: 0%| | 0/50 [00:00<?, ?it/s] loss 0.19 accuracy 0.93: 4%|▍ | 2/50 [00:00<00:04, 11.49it/s] loss 0.15 accuracy 0.95: 4%|▍ | 2/50 [00:00<00:04, 11.49it/s] loss 0.13 accuracy 0.96: 4%|▍ | 2/50 [00:00<00:04, 11.49it/s] loss 0.18 accuracy 0.94: 4%|▍ | 2/50 [00:00<00:04, 11.49it/s] loss 0.21 accuracy 0.93: 4%|▍ | 2/50 [00:00<00:04, 11.49it/s] loss 0.18 accuracy 0.94: 4%|▍ | 2/50 [00:00<00:04, 11.49it/s] loss 0.16 accuracy 0.95: 4%|▍ | 2/50 [00:00<00:04, 11.49it/s] loss 0.16 accuracy 0.94: 4%|▍ | 2/50 [00:00<00:04, 11.49it/s] loss 0.14 accuracy 0.96: 4%|▍ | 2/50 [00:00<00:04, 11.49it/s] loss 0.17 accuracy 0.94: 4%|▍ | 2/50 [00:00<00:04, 11.49it/s] loss 0.10 accuracy 0.96: 30%|███ | 15/50 [00:00<00:00, 64.21it/s] loss 0.08 accuracy 0.98: 30%|███ | 15/50 [00:00<00:00, 64.21it/s] loss 0.12 accuracy 0.95: 30%|███ | 15/50 [00:00<00:00, 64.21it/s] loss 0.11 accuracy 0.97: 30%|███ | 15/50 [00:00<00:00, 64.21it/s] loss 0.12 accuracy 0.95: 30%|███ | 15/50 [00:00<00:00, 64.21it/s] loss 0.14 accuracy 0.94: 30%|███ | 15/50 [00:00<00:00, 64.21it/s] loss 0.07 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 89.16it/s] loss 0.08 accuracy 0.97: 56%|█████▌ | 28/50 [00:00<00:00, 89.16it/s] loss 0.08 accuracy 0.98: 56%|█████▌ | 28/50 [00:00<00:00, 89.16it/s] loss 0.06 accuracy 0.98: 82%|████████▏ | 41/50 [00:00<00:00, 103.00it/s]
0%| | 0/16 [00:00<?, ?it/s] 56%|█████▋ | 9/16 [00:00<00:00, 89.80it/s] 100%|██████████| 16/16 [00:00<00:00, 76.47it/s]
test set accuracy is 0.981167
0%| | 0/50 [00:00<?, ?it/s] loss 0.19 accuracy 0.92: 4%|▍ | 2/50 [00:00<00:04, 10.44it/s] loss 0.15 accuracy 0.95: 4%|▍ | 2/50 [00:00<00:04, 10.44it/s] loss 0.18 accuracy 0.93: 4%|▍ | 2/50 [00:00<00:04, 10.44it/s] loss 0.19 accuracy 0.92: 4%|▍ | 2/50 [00:00<00:04, 10.44it/s] loss 0.17 accuracy 0.92: 32%|███▏ | 16/50 [00:00<00:00, 65.19it/s] loss 0.18 accuracy 0.93: 32%|███▏ | 16/50 [00:00<00:00, 65.19it/s] loss 0.12 accuracy 0.96: 32%|███▏ | 16/50 [00:00<00:00, 65.19it/s] loss 0.12 accuracy 0.95: 32%|███▏ | 16/50 [00:00<00:00, 65.19it/s] loss 0.12 accuracy 0.95: 60%|██████ | 30/50 [00:00<00:00, 92.90it/s] loss 0.14 accuracy 0.93: 60%|██████ | 30/50 [00:00<00:00, 92.90it/s] loss 0.12 accuracy 0.95: 60%|██████ | 30/50 [00:00<00:00, 92.90it/s] loss 0.13 accuracy 0.95: 60%|██████ | 30/50 [00:00<00:00, 92.90it/s] loss 0.08 accuracy 0.98: 60%|██████ | 30/50 [00:00<00:00, 92.90it/s] loss 0.13 accuracy 0.95: 60%|██████ | 30/50 [00:00<00:00, 92.90it/s] loss 0.09 accuracy 0.97: 90%|█████████ | 45/50 [00:00<00:00, 110.13it/s] loss 0.07 accuracy 0.98: 90%|█████████ | 45/50 [00:00<00:00, 110.13it/s] loss 0.07 accuracy 0.98: 100%|██████████| 50/50 [00:00<00:00, 93.05it/s]
0%| | 0/16 [00:00<?, ?it/s] 56%|█████▋ | 9/16 [00:00<00:00, 87.34it/s] 100%|██████████| 16/16 [00:00<00:00, 73.32it/s]
test set accuracy is 0.979917
reducing lr to 0.0014
0%| | 0/50 [00:00<?, ?it/s] loss 0.03 accuracy 0.99: 100%|██████████| 50/50 [00:00<00:00, 91.01it/s]
0%| | 0/16 [00:00<?, ?it/s] 62%|██████▎ | 10/16 [00:00<00:00, 91.17it/s] 100%|██████████| 16/16 [00:00<00:00, 91.00it/s]
test set accuracy is 0.993833
0%| | 0/50 [00:00<?, ?it/s] loss 0.06 accuracy 0.98: 100%|██████████| 50/50 [00:00<00:00, 92.77it/s]
0%| | 0/16 [00:00<?, ?it/s] 56%|█████▋ | 9/16 [00:00<00:00, 86.92it/s] 100%|██████████| 16/16 [00:00<00:00, 87.32it/s]
test set accuracy is 0.988917
reducing lr to 0.0012
0%| | 0/50 [00:00<?, ?it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 90.27it/s]
0%| | 0/16 [00:00<?, ?it/s] 62%|██████▎ | 10/16 [00:00<00:00, 90.17it/s] 100%|██████████| 16/16 [00:00<00:00, 90.08it/s]
test set accuracy is 0.996000
0%| | 0/50 [00:00<?, ?it/s] loss 0.04 accuracy 0.99: 100%|██████████| 50/50 [00:00<00:00, 86.77it/s]
0%| | 0/16 [00:00<?, ?it/s] 56%|█████▋ | 9/16 [00:00<00:00, 87.70it/s] 100%|██████████| 16/16 [00:00<00:00, 87.74it/s]
test set accuracy is 0.995000
reducing lr to 0.0010
0%| | 0/50 [00:00<?, ?it/s] loss 0.03 accuracy 0.99: 0%| | 0/50 [00:00<?, ?it/s] loss 0.03 accuracy 0.99: 2%|▏ | 1/50 [00:00<00:05, 8.20it/s] loss 0.05 accuracy 0.98: 2%|▏ | 1/50 [00:00<00:05, 8.20it/s] loss 0.06 accuracy 0.99: 2%|▏ | 1/50 [00:00<00:05, 8.20it/s] loss 0.04 accuracy 1.00: 2%|▏ | 1/50 [00:00<00:05, 8.20it/s] loss 0.04 accuracy 1.00: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.03 accuracy 0.99: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.02 accuracy 0.99: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.03 accuracy 0.99: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.08 accuracy 0.97: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.02 accuracy 0.99: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.04 accuracy 0.99: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.03 accuracy 0.99: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.03 accuracy 0.99: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.02 accuracy 0.99: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.03 accuracy 0.99: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.02 accuracy 0.99: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.02 accuracy 0.99: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.03 accuracy 0.99: 8%|▊ | 4/50 [00:00<00:02, 19.49it/s] loss 0.03 accuracy 0.99: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.01 accuracy 1.00: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.02 accuracy 1.00: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.02 accuracy 0.99: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.02 accuracy 0.99: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.02 accuracy 0.99: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.01 accuracy 1.00: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.01 accuracy 1.00: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.01 accuracy 1.00: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.01 accuracy 1.00: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.01 accuracy 1.00: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.01 accuracy 1.00: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] 
loss 0.01 accuracy 1.00: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.01 accuracy 1.00: 34%|███▍ | 17/50 [00:00<00:00, 66.35it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.02 accuracy 0.99: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.03 accuracy 0.99: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.01 accuracy 0.99: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.01 accuracy 0.99: 60%|██████ | 30/50 [00:00<00:00, 89.87it/s] loss 0.01 accuracy 0.99: 86%|████████▌ | 43/50 [00:00<00:00, 103.28it/s] loss 0.00 accuracy 1.00: 86%|████████▌ | 43/50 [00:00<00:00, 103.28it/s] loss 0.02 accuracy 1.00: 86%|████████▌ | 43/50 [00:00<00:00, 103.28it/s] loss 0.01 accuracy 1.00: 86%|████████▌ | 43/50 [00:00<00:00, 103.28it/s] loss 0.01 accuracy 1.00: 86%|████████▌ | 43/50 [00:00<00:00, 103.28it/s] loss 0.01 accuracy 1.00: 86%|████████▌ | 43/50 [00:00<00:00, 103.28it/s] loss 0.01 accuracy 0.99: 86%|████████▌ | 43/50 [00:00<00:00, 103.28it/s] loss 0.02 accuracy 0.99: 86%|████████▌ | 43/50 [00:00<00:00, 103.28it/s] loss 0.02 accuracy 0.99: 100%|██████████| 50/50 [00:00<00:00, 85.26it/s]
0%| | 0/16 [00:00<?, ?it/s] 56%|█████▋ | 9/16 [00:00<00:00, 89.31it/s] 100%|██████████| 16/16 [00:00<00:00, 89.70it/s]
test set accuracy is 0.999917
reducing lr to 0.0008
0%| | 0/50 [00:00<?, ?it/s] loss 0.01 accuracy 0.99: 0%| | 0/50 [00:00<?, ?it/s] loss 0.03 accuracy 0.99: 0%| | 0/50 [00:00<?, ?it/s] loss 0.03 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.03 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.02 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.59it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.01 accuracy 0.99: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.04 accuracy 0.99: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.02 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.03 accuracy 0.99: 30%|███ | 15/50 [00:00<00:00, 57.90it/s] loss 0.03 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.01 accuracy 
1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.02 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.01 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.04 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.01 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.50it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.68it/s] loss 0.01 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.68it/s] loss 0.01 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.68it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.68it/s] loss 0.03 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.68it/s] loss 0.02 accuracy 0.99: 82%|████████▏ | 41/50 [00:00<00:00, 98.68it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.68it/s] loss 0.01 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.68it/s] loss 0.01 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.68it/s] loss 0.01 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.68it/s] loss 0.01 accuracy 1.00: 100%|██████████| 50/50 [00:00<00:00, 85.24it/s]
0%| | 0/16 [00:00<?, ?it/s] 62%|██████▎ | 10/16 [00:00<00:00, 90.34it/s] 100%|██████████| 16/16 [00:00<00:00, 91.04it/s]
0%| | 0/50 [00:00<?, ?it/s] loss 0.04 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.04 accuracy 1.00: 2%|▏ | 1/50 [00:00<00:06, 7.91it/s] loss 0.03 accuracy 0.99: 2%|▏ | 1/50 [00:00<00:06, 7.91it/s] loss 0.03 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.04 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.05 accuracy 0.98: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.04 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.06 accuracy 0.98: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.03 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.02 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.03 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.04 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.07 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.05 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.03 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.03 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.03 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.03 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.81it/s] loss 0.03 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.03 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.03 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.02 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.02 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.03 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.03 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.02 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.02 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.03 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.02 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.03 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.03 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.02 
accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.02 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 64.03it/s] loss 0.02 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.03 accuracy 0.99: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.02 accuracy 0.99: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.02 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.02 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.02 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 91.97it/s] loss 0.01 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s] loss 0.02 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s] loss 0.03 accuracy 0.99: 88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s] loss 0.01 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s] loss 0.02 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s] loss 0.02 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s] loss 0.01 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 108.20it/s] loss 0.01 accuracy 1.00: 100%|██████████| 50/50 [00:00<00:00, 86.26it/s]
0%| | 0/16 [00:00<?, ?it/s] 56%|█████▋ | 9/16 [00:00<00:00, 86.56it/s] 100%|██████████| 16/16 [00:00<00:00, 87.15it/s]
test set accuracy is 0.999583
reducing lr to 0.0008
0%| | 0/50 [00:00<?, ?it/s] loss 0.01 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.02 accuracy 0.99: 0%| | 0/50 [00:00<?, ?it/s] loss 0.02 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.03 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.02 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.02 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.02 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.02 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.02 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.02 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.03 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.02 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.69it/s] loss 0.02 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.02 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.02 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.03 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.03 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.02 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.02 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.53it/s] loss 0.02 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.02 
accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.02 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 76.19it/s] loss 0.01 accuracy 1.00: 84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s] loss 0.01 accuracy 1.00: 84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s] loss 0.01 accuracy 1.00: 84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s] loss 0.01 accuracy 1.00: 84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s] loss 0.02 accuracy 1.00: 84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s] loss 0.01 accuracy 1.00: 84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s] loss 0.02 accuracy 0.99: 84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s] loss 0.06 accuracy 0.99: 84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s] loss 0.01 accuracy 0.99: 84%|████████▍ | 42/50 [00:00<00:00, 96.43it/s] loss 0.01 accuracy 0.99: 100%|██████████| 50/50 [00:00<00:00, 82.55it/s]
0%| | 0/16 [00:00<?, ?it/s] 56%|█████▋ | 9/16 [00:00<00:00, 86.23it/s] 100%|██████████| 16/16 [00:00<00:00, 87.04it/s]
test set accuracy is 1.000000
reducing lr to 0.0007
0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.02 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.02 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.02 accuracy 0.99: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.02 accuracy 0.99: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.03 accuracy 0.99: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.03 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.01 accuracy 
1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.01 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.01 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.02 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 83.23it/s] loss 0.01 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.74it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.74it/s] loss 0.02 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.74it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.74it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.74it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.74it/s] loss 0.01 accuracy 0.99: 82%|████████▏ | 41/50 [00:00<00:00, 98.74it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.74it/s] loss 0.01 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.74it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 98.74it/s] loss 0.00 accuracy 1.00: 100%|██████████| 50/50 [00:00<00:00, 85.08it/s]
0%| | 0/16 [00:00<?, ?it/s] 62%|██████▎ | 10/16 [00:00<00:00, 91.26it/s] 100%|██████████| 16/16 [00:00<00:00, 91.49it/s]
test set accuracy is 0.999833
0%| | 0/50 [00:00<?, ?it/s] loss 0.0loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] 0 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.02 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.63it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.63it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.63it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.63it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.63it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.63it/s] loss 0.02 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.63it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.63it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.63it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.63it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.03 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 58.20it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.20it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.20it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.20it/s] loss 0.02 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 58.20it/s] loss 0.02 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 58.20it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.20it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.20it/s] loss 0.01 accuracy 1.00: 32%|███▏ 
| 16/50 [00:00<00:00, 58.20it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.20it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.20it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.01 accuracy 0.99: 32%|███▏ | 16/50 [00:00<00:00, 58.20it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.02 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.02 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.38it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.38it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.38it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.02 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.02 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.38it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.02 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.38it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.38it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.38it/s] loss 0.00 
accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.02 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 103.89it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.02 accuracy 0.99: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 57.33it/s] loss 0.02 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 9.42it/s]
0%| | 0/16 [00:00<?, ?it/s] 56%|█████▋ | 9/16 [00:00<00:00, 87.82it/s] 100%|██████████| 16/16 [00:00<00:00, 88.10it/s]
test set accuracy is 1.000000
reducing lr to 0.0006
0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.01 accuracy 0.99: 0%| | 0/50 [00:00<?, ?it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.02 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.01 accuracy 0.99: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.01 accuracy 0.99: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.02 accuracy 0.99: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] loss 
0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] 0.99: 0%| | 0/50 [00:00<?, ?it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.02 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] loss 0.02 accuracy 0.99: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] loss 0.02 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] loss 0.01 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] loss 0.02 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] loss 0.00 accuracy 1.00: 56%|█████▌ | 28/50 [00:00<00:00, 88.72it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 102.54it/s] 0.99: 0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] 0.99: 0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 102.54it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 102.54it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 102.54it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 102.54it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 102.54it/s] loss 0.00 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 102.54it/s] loss 0.02 accuracy 1.00: 82%|████████▏ | 41/50 [00:00<00:00, 102.54it/s] loss 0.02 accuracy 1.00: 100%|██████████| 50/50 [00:00<00:00, 90.37it/s]
0%| | 0/16 [00:00<?, ?it/s] 44%|████▍ | 7/16 [00:00<00:00, 63.65it/s] 100%|██████████| 16/16 [00:00<00:00, 77.29it/s]
test set accuracy is 0.999750
0%| | 0/50 [00:00<?, ?it/s] loss 0.00 accuracy 1.00: 0%| | 0/50 [00:00<?, ?it/s] loss 0.01 accuracy 0.99: 0%| | 0/50 [00:00<?, ?it/s] loss 0.01 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.75it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.75it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.75it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:05, 8.75it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.75it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.75it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.75it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.75it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.75it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.75it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.75it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] 0.99: 0%| | 0/50 [00:00<?, ?it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.01 accuracy 1.00: 4%|▍ | 2/50 [00:00<00:05, 8.75it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.00 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.00 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.00 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.00 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.00 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.00 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.00 accuracy 1.00: 32%|███▏ | 16/50 
[00:00<00:00, 58.80it/s] loss 0.00 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.01 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.00 accuracy 1.00: 32%|███▏ | 16/50 [00:00<00:00, 58.80it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.77it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.01 accuracy 0.99: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.77it/s] loss 0.01 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.77it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.77it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.77it/s] loss 0.02 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.77it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.77it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.77it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.77it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.77it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.77it/s] loss 0.00 accuracy 1.00: 60%|██████ | 30/50 [00:00<00:00, 86.77it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.01 accuracy 0.99: 4%|▍ | 2/50 [00:00<00:04, 11.42it/s] loss 0.00 accuracy 1.00: 30%|███ | 15/50 [00:00<00:00, 63.95it/s] loss 0.00 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s] loss 0.00 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s] loss 0.00 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s] loss 0.00 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s] loss 0.00 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s] loss 0.00 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s] loss 0.00 accuracy 1.00: 88%|████████▊ | 44/50 [00:00<00:00, 104.25it/s] loss 0.00 accuracy 1.00: 100%|██████████| 50/50 
[00:00<00:00, 86.80it/s]
0%| | 0/16 [00:00<?, ?it/s] 56%|█████▋ | 9/16 [00:00<00:00, 87.44it/s] 100%|██████████| 16/16 [00:00<00:00, 87.77it/s]
test set accuracy is 1.000000
reducing lr to 0.0005
04 + 04 = 004 (correct: 008)
09 + 09 = 019 (correct: 018)
00 + 10 = 000 (correct: 010)
Wrong predictions: 3, acc = 0.9998
Wrong predictions: 0, acc = 1.0000

@ -1 +1 @@
(1, 1000) 208 16.274181 Labrador retriever
(1, 1000) 208 16.274183 Labrador retriever

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

@ -1,18 +1,3 @@
Downloading https://github.com/ultralytics/assets/releases/download/v8.1.0/yolov8n-seg.pt to 'yolov8n-seg.pt'...
0%| | 0.00/6.73M [00:00<?, ?B/s] 7%|▋ | 512k/6.73M [00:00<00:01, 4.53MB/s] 20%|██ | 1.38M/6.73M [00:00<00:00, 6.83MB/s] 35%|███▌ | 2.38M/6.73M [00:00<00:00, 8.12MB/s] 50%|█████ | 3.38M/6.73M [00:00<00:00, 8.86MB/s] 69%|██████▊ | 4.62M/6.73M [00:00<00:00, 9.90MB/s] 87%|████████▋ | 5.88M/6.73M [00:00<00:00, 10.9MB/s] 100%|██████████| 6.73M/6.73M [00:00<00:00, 9.95MB/s]
Ultralytics YOLOv8.1.8 🚀 Python-3.11.2 torch-2.3.0.dev20240206+rocm5.7 CPU (AMD EPYC 7662 64-Core Processor)
YOLOv8n-seg summary (fused): 195 layers, 3404320 parameters, 0 gradients, 12.6 GFLOPs
PyTorch: starting from 'yolov8n-seg.pt' with input shape (1, 3, 480, 640) BCHW and output shape(s) ((1, 116, 6300), (1, 32, 120, 160)) (6.7 MB)
ONNX: starting export with onnx 1.15.0 opset 17...
ONNX: export success ✅ 0.7s, saved as 'yolov8n-seg.onnx' (13.2 MB)
Export complete (2.2s)
Results saved to /tmp
Predict: yolo predict task=segment model=yolov8n-seg.onnx imgsz=480,640
Validate: yolo val task=segment model=yolov8n-seg.onnx imgsz=480,640 data=coco.yaml WARNING ⚠️ non-PyTorch val requires square images, 'imgsz=[480, 640]' will not work. Use export 'imgsz=640' if val is required.
Visualize: https://netron.app
{'images': (1, 3, 480, 640)}
0: op Conv shape [(1, 3, 480, 640), (16, 3, 3, 3), (16,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (2, 2)}
1: op Sigmoid shape [(1, 16, 240, 320)] opt {}
@ -23,7 +8,7 @@ Visualize: https://netron.app
6: op Conv shape [(1, 32, 120, 160), (32, 32, 1, 1), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
7: op Sigmoid shape [(1, 32, 120, 160)] opt {}
8: op Mul shape [(1, 32, 120, 160), (1, 32, 120, 160)] opt {}
9: op Constant shape [] opt {'value': <Tensor <LB GPU (2,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
9: op Constant shape [] opt {'value': <Tensor <LB HIP (2,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
10: op Split shape [(1, 32, 120, 160), (2,)] opt {'axis': 1}
11: op Conv shape [(1, 16, 120, 160), (16, 16, 3, 3), (16,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
12: op Sigmoid shape [(1, 16, 120, 160)] opt {}
@ -42,7 +27,7 @@ Visualize: https://netron.app
25: op Conv shape [(1, 64, 60, 80), (64, 64, 1, 1), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
26: op Sigmoid shape [(1, 64, 60, 80)] opt {}
27: op Mul shape [(1, 64, 60, 80), (1, 64, 60, 80)] opt {}
28: op Constant shape [] opt {'value': <Tensor <LB GPU (2,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
28: op Constant shape [] opt {'value': <Tensor <LB HIP (2,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
29: op Split shape [(1, 64, 60, 80), (2,)] opt {'axis': 1}
30: op Conv shape [(1, 32, 60, 80), (32, 32, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
31: op Sigmoid shape [(1, 32, 60, 80)] opt {}
@ -68,7 +53,7 @@ Visualize: https://netron.app
51: op Conv shape [(1, 128, 30, 40), (128, 128, 1, 1), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
52: op Sigmoid shape [(1, 128, 30, 40)] opt {}
53: op Mul shape [(1, 128, 30, 40), (1, 128, 30, 40)] opt {}
54: op Constant shape [] opt {'value': <Tensor <LB GPU (2,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
54: op Constant shape [] opt {'value': <Tensor <LB HIP (2,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
55: op Split shape [(1, 128, 30, 40), (2,)] opt {'axis': 1}
56: op Conv shape [(1, 64, 30, 40), (64, 64, 3, 3), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
57: op Sigmoid shape [(1, 64, 30, 40)] opt {}
@ -94,7 +79,7 @@ Visualize: https://netron.app
77: op Conv shape [(1, 256, 15, 20), (256, 256, 1, 1), (256,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
78: op Sigmoid shape [(1, 256, 15, 20)] opt {}
79: op Mul shape [(1, 256, 15, 20), (1, 256, 15, 20)] opt {}
80: op Constant shape [] opt {'value': <Tensor <LB GPU (2,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
80: op Constant shape [] opt {'value': <Tensor <LB HIP (2,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
81: op Split shape [(1, 256, 15, 20), (2,)] opt {'axis': 1}
82: op Conv shape [(1, 128, 15, 20), (128, 128, 3, 3), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
83: op Sigmoid shape [(1, 128, 15, 20)] opt {}
@ -117,7 +102,7 @@ Visualize: https://netron.app
100: op Conv shape [(1, 512, 15, 20), (256, 512, 1, 1), (256,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
101: op Sigmoid shape [(1, 256, 15, 20)] opt {}
102: op Mul shape [(1, 256, 15, 20), (1, 256, 15, 20)] opt {}
103: op Constant shape [] opt {'value': <Tensor <LB GPU (4,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
103: op Constant shape [] opt {'value': <Tensor <LB HIP (4,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
104: op Resize shape [(1, 256, 15, 20), None, (4,)] opt {'coordinate_transformation_mode': 'asymmetric', 'cubic_coeff_a': -0.75, 'mode': 'nearest', 'nearest_mode': 'floor'}
105: op Concat shape [(1, 256, 30, 40), (1, 128, 30, 40)] opt {'axis': 1}
106: op Conv shape [(1, 384, 30, 40), (128, 384, 1, 1), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
@ -134,7 +119,7 @@ Visualize: https://netron.app
117: op Conv shape [(1, 192, 30, 40), (128, 192, 1, 1), (128,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
118: op Sigmoid shape [(1, 128, 30, 40)] opt {}
119: op Mul shape [(1, 128, 30, 40), (1, 128, 30, 40)] opt {}
120: op Constant shape [] opt {'value': <Tensor <LB GPU (4,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
120: op Constant shape [] opt {'value': <Tensor <LB HIP (4,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
121: op Resize shape [(1, 128, 30, 40), None, (4,)] opt {'coordinate_transformation_mode': 'asymmetric', 'cubic_coeff_a': -0.75, 'mode': 'nearest', 'nearest_mode': 'floor'}
122: op Concat shape [(1, 128, 60, 80), (1, 64, 60, 80)] opt {'axis': 1}
123: op Conv shape [(1, 192, 60, 80), (64, 192, 1, 1), (64,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
@ -204,9 +189,9 @@ Visualize: https://netron.app
187: op Sigmoid shape [(1, 32, 60, 80)] opt {}
188: op Mul shape [(1, 32, 60, 80), (1, 32, 60, 80)] opt {}
189: op Conv shape [(1, 32, 60, 80), (32, 32, 1, 1), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
190: op Constant shape [] opt {'value': <Tensor <LB GPU (3,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
191: op Constant shape [] opt {'value': <Tensor <LB GPU (3,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
192: op Constant shape [] opt {'value': <Tensor <LB GPU (3,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
190: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
191: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
192: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
193: op Reshape shape [(1, 32, 60, 80), (3,)] opt {'allowzero': 0}
194: op Conv shape [(1, 128, 30, 40), (32, 128, 3, 3), (32,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (3, 3), 'pads': (1, 1, 1, 1), 'strides': (1, 1)}
195: op Sigmoid shape [(1, 32, 30, 40)] opt {}
@ -270,37 +255,37 @@ Visualize: https://netron.app
253: op Mul shape [(1, 80, 15, 20), (1, 80, 15, 20)] opt {}
254: op Conv shape [(1, 80, 15, 20), (80, 80, 1, 1), (80,)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
255: op Concat shape [(1, 64, 15, 20), (1, 80, 15, 20)] opt {'axis': 1}
256: op Constant shape [] opt {'value': <Tensor <LB GPU (3,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
257: op Constant shape [] opt {'value': <Tensor <LB GPU (3,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
258: op Constant shape [] opt {'value': <Tensor <LB GPU (3,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
256: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
257: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
258: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
259: op Reshape shape [(1, 144, 60, 80), (3,)] opt {'allowzero': 0}
260: op Reshape shape [(1, 144, 30, 40), (3,)] opt {'allowzero': 0}
261: op Reshape shape [(1, 144, 15, 20), (3,)] opt {'allowzero': 0}
262: op Concat shape [(1, 144, 4800), (1, 144, 1200), (1, 144, 300)] opt {'axis': 2}
263: op Constant shape [] opt {'value': <Tensor <LB GPU (2,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
263: op Constant shape [] opt {'value': <Tensor <LB HIP (2,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
264: op Split shape [(1, 144, 6300), (2,)] opt {'axis': 1}
265: op Constant shape [] opt {'value': <Tensor <LB GPU (4,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
265: op Constant shape [] opt {'value': <Tensor <LB HIP (4,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
266: op Reshape shape [(1, 64, 6300), (4,)] opt {'allowzero': 0}
267: op Transpose shape [(1, 4, 16, 6300)] opt {'perm': (0, 2, 1, 3)}
268: op Softmax shape [(1, 16, 4, 6300)] opt {'axis': 1}
269: op Conv shape [(1, 16, 4, 6300), (1, 16, 1, 1)] opt {'dilations': (1, 1), 'group': 1, 'kernel_shape': (1, 1), 'pads': (0, 0, 0, 0), 'strides': (1, 1)}
270: op Constant shape [] opt {'value': <Tensor <LB GPU (3,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
270: op Constant shape [] opt {'value': <Tensor <LB HIP (3,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
271: op Reshape shape [(1, 1, 4, 6300), (3,)] opt {'allowzero': 0}
272: op Shape shape [(1, 4, 6300)] opt {}
273: op Constant shape [] opt {'value': <Tensor <LB GPU (1,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
273: op Constant shape [] opt {'value': <Tensor <LB HIP (1,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
274: op Gather shape [(3,), (1,)] opt {'axis': 0}
275: op Constant shape [] opt {'value': <Tensor <LB GPU (1,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
276: op Constant shape [] opt {'value': <Tensor <LB GPU (1,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
275: op Constant shape [] opt {'value': <Tensor <LB HIP (1,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
276: op Constant shape [] opt {'value': <Tensor <LB HIP (1,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
277: op Add shape [(1,), (1,)] opt {}
278: op Constant shape [] opt {'value': <Tensor <LB GPU (1,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
278: op Constant shape [] opt {'value': <Tensor <LB HIP (1,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
279: op Div shape [(1,), (1,)] opt {}
280: op Constant shape [] opt {'value': <Tensor <LB GPU (1,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
280: op Constant shape [] opt {'value': <Tensor <LB HIP (1,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
281: op Mul shape [(1,), (1,)] opt {}
282: op Slice shape [(1, 4, 6300), (1,), (1,), (1,)] opt {}
283: op Constant shape [] opt {'value': <Tensor <LB GPU (1,) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
283: op Constant shape [] opt {'value': <Tensor <LB HIP (1,) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
284: op Mul shape [(1,), (1,)] opt {}
285: op Slice shape [(1, 4, 6300), (1,), (1,), (1,)] opt {}
286: op Constant shape [] opt {'value': <Tensor <LB GPU (1, 2, 6300) contig:True (<LoadOps.COPY: 3>, None)> on GPU with grad None>}
286: op Constant shape [] opt {'value': <Tensor <LB HIP (1, 2, 6300) contig:True (<LoadOps.COPY: 3>, None)> on HIP with grad None>}
287: op Sub shape [(1, 2, 6300), (1, 3, 6300)] opt {}
Traceback (most recent call last):
File "/home/jebba/devel/tinygrad/tinygrad/examples/yolov8-onnx.py", line 18, in <module>

@@ -3,7 +3,6 @@ cd rocsolver/
git checkout rocm-6.0.2
rm -rf build
cmake -B build -G Ninja \
-DAMDGPU_TARGETS=gfx1100 \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_INSTALL_PREFIX=/opt/rocm \
@@ -17,7 +16,8 @@ cmake -B build -G Ninja \
-DCPACK_SOURCE_TBZ2=OFF \
-DCPACK_SOURCE_TGZ=OFF \
-DCPACK_SOURCE_TXZ=OFF \
-DCPACK_SOURCE_TZ=OFF
-DCPACK_SOURCE_TZ=OFF \
-DROCM_DEP_ROCMCORE=OFF
ninja -C build package
sudo dpkg -i build/rocsolver-dev_3.24.0-447a52f_amd64.deb \

@@ -8,7 +8,7 @@ msgid ""
msgstr ""
"Project-Id-Version: tinyrocs 0\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-02-06 13:04-0700\n"
"POT-Creation-Date: 2024-02-06 14:15-0700\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: en\n"
@@ -336,19 +336,23 @@ msgstr ""
msgid "``rocsolver`` for hipBLAS."
msgstr ""
#: ../../../_source/toolchain-6.0.2.rst:266
#: ../../../_source/toolchain-6.0.2.rst:261
msgid "Has option: ``-DAMDGPU_TARGETS=gfx1100``, but builds for all targets."
msgstr ""
#: ../../../_source/toolchain-6.0.2.rst:268
msgid "hipBLAS"
msgstr ""
#: ../../../_source/toolchain-6.0.2.rst:267
#: ../../../_source/toolchain-6.0.2.rst:269
msgid "``hipBLAS`` plz."
msgstr ""
#: ../../../_source/toolchain-6.0.2.rst:274
#: ../../../_source/toolchain-6.0.2.rst:276
msgid "LLVM Pass Two"
msgstr ""
#: ../../../_source/toolchain-6.0.2.rst:275
#: ../../../_source/toolchain-6.0.2.rst:277
msgid ""
"XXX Skip this for now ? XXX. Needed for flang (fortran). Needed for OpenMP."
msgstr ""

@@ -258,6 +258,8 @@ rocsolver
---------
``rocsolver`` for hipBLAS.
Has option: ``-DAMDGPU_TARGETS=gfx1100``, but builds for all targets.
.. literalinclude:: _static/toolchain/rocm-6.0.2/build-rocsolver.sh
:language: bash