Commit Graph

3070 Commits (deepcrayon)

Author SHA1 Message Date
George Hotz 065aff747e
make webgpu test reliable (#2502)
* remove retry that doesn't work

* fix cleanup

* process exit in cleanup

* add space
2023-11-29 10:02:24 -08:00
George Hotz 6707f2588e
use copyin (#2500)
* it's always copyin

* all RawBuffer are RawBufferCopyIn

* cleanups

* this fixes it

* requirements='C'

* more correct
2023-11-29 09:34:00 -08:00
George Hotz 947711a532
split metal and webgpu tests (#2501) 2023-11-29 09:32:09 -08:00
chenyu 3eb3c74675
metal ci tests everything (#2499)
* metal ci tests everything

* pretty good

* METAL
2023-11-29 12:04:37 -05:00
George Hotz 889acefe85
Support weird loads in Image (#2498)
* image support weird loads

* umm, that was always wrong

* openpilot compile fails with a weird error

* image test passes

* we have valids now

* clean that up

* no more required opts

* add fastvits test, fix bug

* minor cleanups
2023-11-29 08:30:46 -08:00
George Hotz e333672675
realize cleanup (#2496)
* move that logic

* revert that change

* clean up transfer and asserts

* what's that junk
2023-11-28 21:08:39 -08:00
George Hotz 5629fc368c
Use Buffer.STORE at the end of ASTs (#2494)
* work

* store broken

* interpreteds work

* this passes

* symbolic cpu

* fix tests

* fix opt tests

* images fail

* fix InterpretedFlopCounter

* stupid hack for images
2023-11-28 20:11:37 -08:00
Liam cf0c9096a9
Removing METAL Skips as CI works (#2488)
* Test metal CI

* remove metal and CI restrictions

* enable dtype tests for metal ci
2023-11-28 19:46:59 -08:00
Jake 5588922884
Update cuda_matmul.py (#2495) 2023-11-28 19:46:01 -08:00
George Hotz cdc3b95729 if you don't appreciate a 15 second timeout, you get a 10 second timeout 2023-11-28 17:44:09 -08:00
George Hotz d87a246439
move to new cached fetch (#2493)
* move to new cached fetch

* extra.utils is over

* loads

* bump download cache

* bump timeout
2023-11-28 17:36:55 -08:00
George Hotz ab5d14d4ba
MEM -> LOAD (#2492)
* MEM -> LOAD

* keep legacy working
2023-11-28 16:46:37 -08:00
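The "keep legacy working" note in the MEM -> LOAD commit points at a standard renaming pattern: keep the old name resolving to the new member so existing code doesn't break. A minimal sketch of that pattern using Python's built-in `Enum` aliasing (the `Ops` class and member values here are illustrative, not tinygrad's actual definitions):

```python
from enum import Enum

class Ops(Enum):
  LOAD = 1   # new canonical name
  STORE = 2
  MEM = 1    # legacy alias: duplicate value, so it resolves to Ops.LOAD

# Enum treats a member with a duplicate value as an alias, not a new member.
print(Ops.MEM is Ops.LOAD)  # True
print(list(Ops))            # only [Ops.LOAD, Ops.STORE]
```

Because `MEM` is an alias rather than a distinct member, iteration and exhaustive matches see only the new name while old call sites keep working.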
chenyu a739c6646e
fp16 in gpt2 attention (#2491)
* fp16 in gpt2 attention

* HALF
2023-11-28 19:27:03 -05:00
chenyu 847f0a02b1
non-simplifiable mod should result in ModNode (#2490)
* non-simplifiable mod should result in ModNode

* space
2023-11-28 16:52:19 -05:00
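The idea behind "non-simplifiable mod should result in ModNode" is that in a symbolic arithmetic engine, `x % n` either folds to something simpler or is kept as an explicit mod node for codegen. An illustrative sketch with hypothetical classes (not tinygrad's actual `Node` implementation):

```python
class Node:
  def __mod__(self, b: int): return create_mod(self, b)

class NumNode(Node):        # a known constant
  def __init__(self, b): self.b = b

class Variable(Node):       # a symbol with a known integer range
  def __init__(self, name, min_, max_): self.name, self.min, self.max = name, min_, max_

class ModNode(Node):        # the non-simplifiable fallback
  def __init__(self, a, b): self.a, self.b = a, b

def create_mod(a: Node, b: int) -> Node:
  if isinstance(a, NumNode): return NumNode(a.b % b)  # constant fold
  if isinstance(a, Variable) and a.min >= 0 and a.max < b:
    return a                                          # range proves a % b == a
  return ModNode(a, b)                                # otherwise stay symbolic

x = Variable("x", 0, 3)
print(type(x % 8).__name__)  # Variable: 0 <= x <= 3 < 8, so x % 8 == x
print(type(x % 2).__name__)  # ModNode: cannot be simplified from the range
```

The key point of the commit title is the last line: when no rule applies, the result must be a `ModNode` rather than an incorrect simplification.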
George Hotz 3f137b134a jax parallel matmul example 2023-11-28 13:48:11 -08:00
mmmkkaaayy ddb6a33ae5
improve test assertions for jit cache len with graph executor (#2476)
* improve test assertions for jit cache len with graph executor

* delete newline

* unused import

* another unused import
2023-11-27 23:02:45 -08:00
chenyu 28a67106ca
enable symbolic ops tests for hip (#2485) 2023-11-27 22:33:41 -08:00
Christopher Mauri Milan 7f01dd04f0
Apply ruff linting rules to tests (#2473)
* everything except F821

* enable F821 with noqa

* dumb fix

* fix remaining imports and (former) lambdas

* replace _ with noqa to avoid gc
2023-11-27 21:24:06 -08:00
Davi Silva 136dbd8b36
HIP CI that compiles (to RDNA3) but doesn't have to run (#2482)
* hip amd compilation

* gate the test properly

* cleanup unused import

* remove superfluous numpy conversion

* add SpeedyNet tests (f32 [passes] & f16 [fails])

* make CI verbose (error log from hip compiler)

* test the real ops_hip

* Merge branch 'tinygrad:master' into ci/hip-compilation

* fix CI

* cleanup

* really fix CI

* Fix CI Three: the refixening

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-11-27 21:17:06 -08:00
George Hotz 756b01f46f
why were these ever called buffer (#2483) 2023-11-27 21:02:07 -08:00
George Hotz acbe6d1b53
Revert "HIP compilation on CI targeting RDNA3 (#2459)" (#2481)
This reverts commit d275ff930a.
2023-11-27 20:41:21 -08:00
qtkite cb507a9389
Remove the toCPU copy (#2445)
* Remove the rawbuffer copy in runtime/lib.py on line 44

* remove buffer view

* added metadata back, oops

* delayed cpu testcase

* whitespace

* whitespace

* buffer behavior as is

* Update test_jit.py
2023-11-27 20:37:13 -08:00
Davi Silva d275ff930a
HIP compilation on CI targeting RDNA3 (#2459)
* hip amd compilation

* gate the test properly

* cleanup unused import

* remove superfluous numpy conversion

* add SpeedyNet tests (f32 [passes] & f16 [fails])

* make CI verbose (error log from hip compiler)

* test the real ops_hip

* Merge branch 'tinygrad:master' into ci/hip-compilation

* fix CI

* cleanup

* really fix CI
2023-11-27 20:33:11 -08:00
Yingbo Ma d43485ae9e
Fix `graph_uops` (#2457)
* Load networkx when we need to graph uops

* Document GRAPHUOPS

* import nx in `graph_uops`
2023-11-27 18:42:48 -08:00
Paul Gustafson 98cd9e8926
Add assertion to prevent nonsense mod values (#2474) 2023-11-27 18:37:44 -08:00
Davi Silva 186ac77ec3
Update hip_matmul.py (#2480) 2023-11-27 18:36:19 -08:00
chenyu 7f9a4c1285
fp16 and noshow flags for gpt2 (#2470) 2023-11-27 16:23:03 -05:00
qazal e267a93124
reset seed on every run (#2468) 2023-11-27 12:55:54 -08:00
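"reset seed on every run" reflects a common test-suite fix: seeding once at import time makes outcomes depend on test execution order, while re-seeding in a per-test setup hook makes each test deterministic in isolation. A minimal sketch with Python's stdlib `random` (the `setup`/`make_inputs` names are illustrative, not the PR's actual code):

```python
import random

def make_inputs(n):
  """Generate pseudo-random test inputs."""
  return [random.random() for _ in range(n)]

def setup():
  # the per-test hook (e.g. unittest setUp or a pytest fixture) re-seeds,
  # so each test sees the same stream regardless of what ran before it
  random.seed(1337)

setup(); a = make_inputs(4)
setup(); b = make_inputs(4)
print(a == b)  # True: re-seeding makes the runs identical
```

Without the second `setup()` call, `b` would depend on how much of the RNG stream earlier tests had consumed.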
George Hotz 9e07824542
move device to device.py (#2466)
* move device to device.py

* pylint test --disable R,C,W,E --enable E0611

* fix tests
2023-11-27 11:34:37 -08:00
qazal 262cd26d28
Simplify openpilot kernel (#2460)
* a conditional with the same results either way is a noop

* add unit test
2023-11-27 10:02:27 -08:00
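The simplification named in #2460, "a conditional with the same results either way is a noop", is the rewrite `where(cond, a, a) -> a`. A hypothetical sketch of that rule (tinygrad's real representation of conditionals differs):

```python
def simplify_where(cond, true_val, false_val):
  # if both branches produce the same value, the condition is irrelevant
  # and the whole conditional collapses to that value
  if true_val == false_val:
    return true_val
  return ("where", cond, true_val, false_val)  # stand-in for a real IR node

print(simplify_where("x<10", 3.0, 3.0))  # 3.0 (conditional eliminated)
print(simplify_where("x<10", 3.0, 0.0))  # ('where', 'x<10', 3.0, 0.0)
```

In a kernel, eliminating such a conditional removes a branch (or select) and can unlock further simplification of the condition's operands.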
chenyu 61a80a0675
asserts LtNodes of SumNode with MulNode of Nodes (#2465) 2023-11-27 12:56:59 -05:00
chenyu c4dfde761e
remove the commented import (#2463) 2023-11-27 11:50:41 -05:00
Akshay Kashyap a031afb2f6
Update display_name in resnet50 example (#2454) 2023-11-26 16:07:36 -08:00
Paul Gustafson 1d89c018fa
Add isinstance check before gcd call in SumNode.__lt__ (#2450)
* Add isinstance check before gcd call

* Delete blank lines

* Fix unit test typo

* Delete blank lines again

---------

Co-authored-by: Paul Gustafson <paul.gustafson@theambrusgroup.com>
2023-11-26 13:05:04 -08:00
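The isinstance-before-gcd fix in #2450 guards a classic symbolic-math hazard: gcd-based simplification of a comparison like `(4*x + 8*y) < 10` is only sound when every coefficient is a plain integer; a symbolic coefficient would crash or corrupt `math.gcd`. A hedged sketch of the guard pattern (simplified signature, not the actual `SumNode.__lt__`):

```python
import math

def simplify_lt(coeffs, bound):
  """Divide gcd out of `sum(c*term) < bound` when all coeffs are ints."""
  if not all(isinstance(c, int) for c in coeffs):
    return coeffs, bound                  # symbolic coefficient: leave as-is
  g = math.gcd(*coeffs)
  if g <= 1:
    return coeffs, bound
  # for integer terms: g*s < bound  <=>  s < ceil(bound / g)
  return [c // g for c in coeffs], -(-bound // g)

print(simplify_lt([4, 8], 10))    # ([1, 2], 3): 4x+8y < 10 becomes x+2y < 3
print(simplify_lt([4, "y"], 10))  # unchanged: "y" is not an int
```

Note the ceiling division on the bound: for integers, `4*s < 10` is equivalent to `s < 3`, not `s < 2`.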
Paul Gustafson 58b1dd463e
Add error code to type: ignore (#2451)
Co-authored-by: Paul Gustafson <paul.gustafson@theambrusgroup.com>
2023-11-26 13:04:10 -08:00
George Hotz 8e9cdef61f
clean up the buffers (#2447)
* clean up the buffers

* remove allocate_output

* functools.lru_cache is methodcache

* add TestShapeTrackerSize

* cache_clear

* no 0 sz buffer, add _ on functions that shouldn't be imported

* fix size

* if -> while
2023-11-26 11:02:29 -08:00
George Hotz f6f712e609
split out the three steps of exec_ast (#2446)
* split out the three steps of exec_ast

* clean up extra args

* cleanups, bugfix

* allocate is a more normal name

* get_optimized_linearizer is better
2023-11-26 09:07:37 -08:00
chenyu 511310737e
test_linearizer_failures to run on all backends (#2443)
* test_linearizer_failures to run on all backends

* test ubuntu and cuda

* failed only in CUDA CI

* move asserts
2023-11-26 01:17:29 -05:00
George Hotz c42d2c4731 strip whitespace 2023-11-25 14:09:06 -08:00
George Hotz 9eb2746d62
fix copy issue + add regression test (#2441) 2023-11-25 14:06:08 -08:00
andresgit 259a869fc1
Fix UnicodeDecodeError when debugging on Intel APU (#2421)
* test DEBUG=5

* print prg if NVIDIA, fixes error on Intel APU
2023-11-25 12:30:50 -08:00
George Hotz 7170a9a057
coder.py can write and run code (#2439)
* wip mistral

* coder

* touchups

* cleanups

* mistral cleanups

* clean up cache create

* download the weights, fix tests

* fix llama loading

* global fixup

* clean up all

* move llama model

* cleanups

* Revert "cleanups"

This reverts commit a71c5d59eb.

* fine, leave it
2023-11-25 12:27:54 -08:00
Davi Silva df41a57e09
Fix: missing n_kv_heads for smaller models from huggingface (#2438)
* fix: missing n_kv_heads for smaller models from huggingface

* a lil golfing
2023-11-25 10:29:04 -08:00
George Hotz 96c12fdeab
multibatch gpt2 (#2432)
* support multibatch gpt-2

* multi output

* no default JIT in CI
2023-11-24 18:10:10 -08:00
Tobias Fischer 5326bbc9a6
fix causal mask in Tensor class (#2425)
* fixed causal mask in Tensor class

* added tests for scaled attention against pytorch

* cleaned up test formatting

* removed duplicate test
2023-11-24 18:38:18 -05:00
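The causal mask fixed in #2425 is the standard autoregressive-attention constraint: query position i may attend only to key positions j <= i, i.e. a lower-triangular matrix of allowed positions. A plain-Python sketch of the shape (not tinygrad's Tensor API):

```python
def causal_mask(n):
  # 1 where attention is allowed (j <= i), 0 where it must be masked out;
  # in practice the 0 entries become -inf before the softmax
  return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

for row in causal_mask(4):
  print(row)
# [1, 0, 0, 0]
# [1, 1, 0, 0]
# [1, 1, 1, 0]
# [1, 1, 1, 1]
```

An off-by-one here (e.g. `j < i`) would forbid a token from attending to itself, which is the kind of subtle error tests against PyTorch's attention catch.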
chenyu 9a5d0e70de
Device.DEFAULT instead of getenv to exclude tests (#2429) 2023-11-24 17:10:24 -05:00
chenyu 6223f8894d
clean up ast_parse (#2428)
* clean up ast_parse

* separate loops
2023-11-24 16:43:32 -05:00
George Hotz 8ff2e13550
From teeny (#2426)
* changes from teenygrad work

* support not supporting ImageDType/PtrDType

* fixups from teeny
2023-11-24 12:50:56 -08:00
chenyu 9ae83fba04
flatten instead of reduce, improve type inference (#2423) 2023-11-24 13:19:22 -05:00
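"flatten instead of reduce" names a common Python cleanup: flattening a list of lists with `functools.reduce(operator.add, ...)` copies the accumulator on every step (quadratic) and its result type is hard for checkers to infer, while a comprehension is linear and plainly typed. A small sketch of the contrast (illustrative, not the PR's actual call site):

```python
import functools, operator

nested = [[1, 2], [3], [4, 5, 6]]

# reduce-based flatten: builds a fresh list per step, opaque to type inference
flat_reduce = functools.reduce(operator.add, nested, [])

# comprehension-based flatten: one pass, inferred as list[int]
flat = [x for row in nested for x in row]

print(flat == flat_reduce)  # True: same result, better complexity and typing
```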
Francis Lata 7169de57e2
Update VITS to use fetch helper (#2422)
* use fetch helper on vits

* remove duplicate weight loading
2023-11-24 08:50:03 -08:00