tinygrab/tinygrad/shape
chenyu ac183568be
llama JIT python runtime speedup (#1633)
* no JIT call in TransformerBlock

* idea

* move 2 reshapes to jitted function

shrink inside jitted too, 6.3ms

remove back reshapes, 5.5ms

isinstance -> __class__, 4.99ms (see the sketch after this list)

* think

revert ops_gpu.py

revert symbolic.py too

PYOPENCL_COMPILER_OUTPUT=1

* cleanup

* fix cache shape for conversational model

only reshape if start_pos > 0

* small cleanup

* include var_vals.keys() in st.key

* add comments

* llama small update

* everything jitted again, similar structure to gpt2

* fix typing

* add TODO for in-place cache update
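
A minimal sketch of the "isinstance -> __class__" swap noted above (hypothetical micro-benchmark, not the actual tinygrad code): on a hot pure-Python dispatch path, an exact-type check via __class__ skips isinstance's subclass walk, which is the kind of per-call runtime shaving the timings above refer to.

    # Hypothetical example; class names are illustrative, not tinygrad's.
    import timeit

    class Node: pass
    class Variable(Node): pass

    v = Variable()

    # Note the semantic difference: isinstance also accepts subclasses,
    # while "__class__ is" is an exact-type check.
    t_isinstance = timeit.timeit(lambda: isinstance(v, Variable), number=1_000_000)
    t_class = timeit.timeit(lambda: v.__class__ is Variable, number=1_000_000)
    print(f"isinstance: {t_isinstance:.3f}s   __class__ is: {t_class:.3f}s")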
2023-08-30 07:51:05 -07:00
shapetracker.py    llama JIT python runtime speedup (#1633)    2023-08-30 07:51:05 -07:00
symbolic.py        llama JIT python runtime speedup (#1633)    2023-08-30 07:51:05 -07:00