tinygrab/tinygrad/shape
chenyu ac183568be
llama JIT python runtime speedup (#1633)
* no JIT call in TransformerBlock

* idea

* move 2 reshapes to jitted function

shrink inside jitted too, 6.3ms

remove back reshapes, 5.5ms

isinstance -> __class__, 4.99ms (see the sketch after this list)

* think

revert ops_gpu.py

revert symbolic.py too

PYOPENCL_COMPILER_OUTPUT=1

* cleanup

* fix cache shape for conversational model

only reshape if start_pos > 0

* small cleanup

* include var_vals.keys() in st.key

* add comments

* llama small update

* everything jitted again, similar structure to gpt2

* fix typing

* add TODO for in-place cache update
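
A minimal sketch of the "isinstance -> __class__" swap noted above (hypothetical micro-benchmark, not the actual tinygrad code): on a hot pure-Python dispatch path, an exact-type check via __class__ skips isinstance's subclass walk, which is the kind of per-call runtime shaving the timings above refer to.

    # Hypothetical example; class names are illustrative, not tinygrad's.
    import timeit

    class Node: pass
    class Variable(Node): pass

    v = Variable()

    # Note the semantic difference: isinstance also accepts subclasses,
    # while "__class__ is" is an exact-type check.
    t_isinstance = timeit.timeit(lambda: isinstance(v, Variable), number=1_000_000)
    t_class = timeit.timeit(lambda: v.__class__ is Variable, number=1_000_000)
    print(f"isinstance: {t_isinstance:.3f}s   __class__ is: {t_class:.3f}s")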
2023-08-30 07:51:05 -07:00
shapetracker.py    llama JIT python runtime speedup (#1633)    2023-08-30 07:51:05 -07:00
symbolic.py        llama JIT python runtime speedup (#1633)    2023-08-30 07:51:05 -07:00