Commit Graph

1455 Commits (7a7046f2643dddf63c4161b274b609045f4ef64c)

Author SHA1 Message Date
George Hotz 7a7046f264 sum_combine_num 2023-02-11 14:48:31 -08:00
Kirill a4f5f2ff8b Add missing packages to setup.py (#554) 2023-02-11 14:41:56 -08:00
George Hotz 20a351a3c6 hand optim CONVW 2023-02-11 14:41:08 -08:00
George Hotz 89499b303d oops, bad else. why didn't linter catch 2023-02-11 12:02:09 -08:00
George Hotz 7d33f2d659 CL.CACHE is over, GlobalCounters.cache is it 2023-02-11 12:00:14 -08:00
George Hotz b9eae94ae9 move Device back into lazy 2023-02-11 11:26:53 -08:00
George Hotz 9152bb5b4a momentum support in SGD 2023-02-11 10:22:37 -08:00
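A minimal sketch of what this adds, assuming the option landed as a `momentum` keyword on SGD in nn/optim.py (the actual signature may differ):

```python
# Sketch only: parameter name and import path assumed, not read off the diff.
from tinygrad.tensor import Tensor
from tinygrad.nn.optim import SGD

w = Tensor.uniform(784, 10)            # a trainable parameter
opt = SGD([w], lr=0.01, momentum=0.9)  # momentum is the new knob

loss = Tensor.randn(32, 784).matmul(w).mean()
loss.backward()
opt.step()  # classic momentum update: v = momentum*v + grad; w -= lr*v
```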
George Hotz 0a2035e015 oops, GPU isn't defined 2023-02-11 10:10:02 -08:00
George Hotz 3421d4af10 the jit has a test 2023-02-11 10:04:03 -08:00
George Hotz 031edd01e6 switch openpilot compile to TinyJit 2023-02-11 09:51:44 -08:00
jspieler 8f912c3966 added deep deterministic policy gradient example (#531) 2023-02-11 10:10:46 -06:00
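The heart of any DDPG implementation is the soft target-network update, θ' ← τθ + (1-τ)θ'. A sketch of that update rather than the example's actual code, using the in-place `assign` that the buffer-reuse PR further down builds on:

```python
# Illustrative soft update for DDPG target networks; tau is the mixing rate.
def soft_update(target_params, online_params, tau=0.001):
  for tp, p in zip(target_params, online_params):
    tp.assign(tp * (1 - tau) + p * tau)  # theta' <- tau*theta + (1-tau)*theta'
```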
George Hotz 608fd730d3 put the JIT in extra 2023-02-11 00:35:18 -06:00
George Hotz ed8ae7522a tinyjit 2023-02-11 00:22:36 -06:00
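These four JIT commits ("tinyjit", "put the JIT in extra", the test, and the openpilot switch) introduce TinyJit: decorate a tensor function, and after warm-up calls the recorded GPU kernels are replayed instead of being re-traced through Python. A minimal usage sketch, with the import path assumed from "put the JIT in extra":

```python
# Sketch: module path and capture details assumed from the commit messages.
from extra.jit import TinyJit
from tinygrad.tensor import Tensor

@TinyJit
def step(x: Tensor) -> Tensor:
  return x.relu().sum().realize()  # jitted functions return realized tensors

for _ in range(5):  # early calls trace and record; later calls replay kernels
  out = step(Tensor.randn(8, 8))
```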
George Hotz 4c90a15689 make the fake data actually learnable 2023-02-10 23:35:21 -06:00
George Hotz 07629d7476 fakedata and move to new cache 2023-02-10 23:32:31 -06:00
George Hotz 63fa7daf30 wrong place for CL 2023-02-10 23:22:24 -06:00
George Hotz 6f9b103878 fix opencl types 2023-02-10 23:18:39 -06:00
George Hotz fed95119dc CL.mem_used -> GlobalCounters.mem_used 2023-02-10 23:13:29 -06:00
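Paired with "CL.CACHE is over, GlobalCounters.cache is it" above, this moves bookkeeping off the OpenCL module onto a backend-agnostic class. The shape of the pattern, with field types assumed:

```python
# Class-level state shared by every backend; nothing OpenCL-specific remains.
class GlobalCounters:
  mem_used: int = 0  # bytes held by live device buffers (was CL.mem_used)
  cache = None       # when not None, executed kernels are recorded here (was CL.CACHE)
```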
George Hotz 51037815b9 add comment so we don't remove self.t tensor again 2023-02-10 23:07:07 -06:00
George Hotz c0ea538ba0 Revert "revert t as tensor, constant folding should be done better"
This reverts commit 1d800a94ad.
2023-02-10 23:06:00 -06:00
George Hotz 1d800a94ad revert t as tensor, constant folding should be done better 2023-02-10 22:58:39 -06:00
George Hotz 77988e3236 fix str() line count bug in scc 2023-02-10 22:53:30 -06:00
George Hotz 1fb5b8069b simpler processed check 2023-02-10 22:49:20 -06:00
George Hotz 609477656e clean up lazy processing_op 2023-02-10 22:41:52 -06:00
George Hotz a4cb161bd4 log_kernel 2023-02-10 21:51:53 -06:00
George Hotz b9f02671d3 oops, broke torch speed test 2023-02-10 16:13:53 -06:00
George Hotz 0efe1e435f no need to render to check valid 2023-02-10 15:35:12 -06:00
Kirill 27154db99a Downloads weights in examples/stable_diffusion.py (#537)
* Downloads weights in examples/stable_diffusion.py

* use download_file_if_not_exists in fetch

* make consistent with previous NOCACHE behavior
2023-02-10 14:37:04 -06:00
George Hotz 4459cde68b minor matmul location cleanup 2023-02-10 14:12:42 -06:00
George Hotz a007145ac4 oops, should be __itruediv__. we should add test for this 2023-02-10 14:04:40 -06:00
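Context for the fix: Python 3 dispatches `a /= b` to `__itruediv__`; define the wrong dunder name and the in-place operator silently never finds it. A self-contained illustration (not tinygrad code):

```python
class T:
  def __init__(self, v): self.v = v
  def __itruediv__(self, o):  # this exact name is what "a /= b" looks up
    self.v /= o
    return self

a = T(6.0)
a /= 2.0    # with a misnamed dunder this would raise TypeError instead
print(a.v)  # 3.0
```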
Jacky Lee 5c51ae8dbf Show where tinygrad is faster in speed test vs torch (#549)
* show where tinygrad is faster

* don't change text color
2023-02-10 14:01:07 -06:00
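A hypothetical rendering of what #549 reports, per its two bullets (flag whichever side won, without recoloring the output); not the actual speed-test code:

```python
def report(name: str, tiny_ms: float, torch_ms: float) -> str:
  winner = "tinygrad" if tiny_ms < torch_ms else "torch"
  return f"{name:16s} tinygrad {tiny_ms:7.2f} ms  torch {torch_ms:7.2f} ms  -> {winner} faster"

print(report("gemm 1024", 1.20, 1.45))
```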
George Hotz 87a7717222 LLVM backend uses shapetracker 2023-02-10 13:53:33 -06:00
George Hotz c3cf17c6d0 Symbolic render (#550)
* render symbolic

* valid

* fix shapetracker tests

* render_python is the default

* expr is gone

* remove legacy behavior
2023-02-10 13:22:26 -06:00
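What #550 does, sketched: ShapeTracker index expressions are symbolic node trees, and rather than carrying a prebuilt `expr` string they now render on demand, with python-style rendering as the default. Illustrative, assuming the Variable API in tinygrad/shape/symbolic.py:

```python
from tinygrad.shape.symbolic import Variable

idx = Variable("idx", 0, 15)  # a symbolic index with known bounds
node = (idx * 4 + 2) % 8      # node tree built by operator overloads
print(node.render())          # renders to python source, e.g. "(((idx*4)+2)%8)"
```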
George Hotz 5ed3622965 add dump to kernel_search 2023-02-10 12:13:30 -06:00
Jacky Lee f08187526f Fix examples (#540)
* Fix examples

* Remove training in parameters

* Simplify a bit

* Remove extra import

* Fix linter errors

* factor out Device

* NumPy-like semantics for Tensor.__getitem__ (#506)

* Rewrote Tensor.__getitem__ to fix negative indices and add support for np.newaxis/None

* Fixed pad2d

* mypy doesn't know about mlops methods

* normal python behavior for out-of-bounds slicing

* type: ignore

* inlined idxfix

* added comment for __getitem__

* Better comments, better tests, and fixed bug in np.newaxis

* update cpu and torch to hold buffers (#542)

* update cpu and torch to hold buffers

* save lines, and probably faster

* Mypy fun (#541)

* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup

* dyn add of math ops

* refactor ops_cpu and ops_torch to not share code

* nn/optim.py compiles now

* Reorder imports

* call mkdir only if directory doesn't exist

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: Mitchell Goff <mitchellgoffpc@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-02-10 12:09:37 -06:00
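Among the commits squashed into #540, the user-visible one is #506's NumPy-like `Tensor.__getitem__`: negative indices, `None`/`np.newaxis`, and python-style out-of-bounds slicing. A few illustrative cases (not the PR's test suite):

```python
from tinygrad.tensor import Tensor

t = Tensor([[1, 2, 3], [4, 5, 6]])
print(t[-1].numpy())      # [4 5 6]: negative index selects the last row
print(t[None].shape)      # (1, 2, 3): None inserts a new axis, like np.newaxis
print(t[:, 1:100].shape)  # (2, 2): out-of-bounds slice clamps, normal python behavior
```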
Lucas Keller 56a06280c5 Testing/utils (#548)
* New unittest for utils.py

Unit test fetch in basic ways. Would have tested more fetches, but
downloading stuff for tests is annoying and mocking is more
dependencies.

* Remove unused imports
2023-02-10 12:08:20 -06:00
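The flavor of test described, sketched; the import path is assumed from "utils.py" and the URL is a placeholder rather than whatever the real test fetches:

```python
import unittest
from extra.utils import fetch  # location assumed from the commit

class TestFetch(unittest.TestCase):
  def test_fetch_returns_bytes(self):
    dat = fetch("https://example.com/")  # placeholder URL
    self.assertIsInstance(dat, bytes)
    self.assertGreater(len(dat), 0)

if __name__ == "__main__":
  unittest.main()
```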
George Hotz 1257b0433a should fix tests 2023-02-09 13:12:14 -06:00
George Hotz e6f19d4ce2 assume all generic exec ast have ProcessingOp 2023-02-09 13:03:48 -06:00
George Hotz 78795e3507 reduce line count by simplifying DeviceBuffer 2023-02-09 12:52:14 -06:00
George Hotz 5de850f6d5 assign buffer reuse (#547)
* assign buffer reuse works

* fix assign for torch and cpu

* allow assign from numpy

* fix llvm output_buffer

* add some assign tests

* fix assignment test

* test should fail without lazy

* env var to disable assign
2023-02-09 11:53:02 -06:00
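What #547 enables, in one sketch: with lazy evaluation, `x.assign(expr)` can write the computed result into `x`'s existing device buffer instead of allocating a fresh one (and per the last bullet, an env var can switch the behavior off):

```python
from tinygrad.tensor import Tensor

x = Tensor.ones(4, 4)
x.realize()      # x now owns a device buffer
x.assign(x + 1)  # the add may write straight into that buffer, reusing it
x.realize()
```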
George Hotz 473bbd3e35 fix graphs 2023-02-09 09:40:46 -06:00
George Hotz 16a7edc775 move base_fxn_for_op to ops_cpu 2023-02-08 18:23:48 -06:00
George Hotz c642f5e72b less lines for torch 2023-02-08 18:15:59 -06:00
George Hotz 58a03eb693 generic processing op 2023-02-08 18:09:17 -06:00
George Hotz 4c2faa4140 functools.partial keeps mypy compiler working 2023-02-08 18:04:32 -06:00
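The trick here, shown generically: binding an argument with `functools.partial` hands mypy (and the mypyc build the "Mypy fun" PR above set up) a concrete, typeable callable where a closure or lambda can trip it up. Toy example, not the tinygrad call site:

```python
import functools, operator

add2 = functools.partial(operator.add, 2)  # a plain object with a known type
print(add2(3))  # 5
```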
George Hotz cfd13c083b refactor GenericShape for a big line reduction 2023-02-08 18:01:08 -06:00
George Hotz c656513591 GPURunner class will replace CL cache eventually 2023-02-08 17:31:36 -06:00
George Hotz a5a55ac19e GlobalCounters cache + assign in optim 2023-02-08 17:10:55 -06:00
George Hotz d9555bc478 that turned out to be dumb 2023-02-08 16:52:29 -06:00
George Hotz 3d63934995 refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime

* fix thneed, rename cl to _cl

* bugfix + _cuda

* fix tests

* thneed more correct
2023-02-08 16:46:09 -06:00
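The structure #545 moves toward, sketched with illustrative names (not the actual ops_gpu.py): pyopencl objects sit behind underscore-private fields of the GPU runtime, so thneed and the rest of the codebase stop touching `cl` directly:

```python
import pyopencl as cl

class GPURuntime:
  def __init__(self):
    self._cl = cl.create_some_context()      # private: OpenCL objects never leak out
    self._queue = cl.CommandQueue(self._cl)

  def alloc(self, nbytes: int) -> cl.Buffer:
    return cl.Buffer(self._cl, cl.mem_flags.READ_WRITE, nbytes)
```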