tinygrab

deepcrayon/tinygrab

Author	SHA1	Message	Date
Roelof van Dijk	8f2e2f5ee2	style: else-after-return (#1216) Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-07-12 10:26:38 -07:00
George Hotz	67e34b356a	good stuff from tensor cores branch (#1199)	2023-07-08 16:58:26 -07:00
George Hotz	793a670187	from tensor cores + lb touchup (#1127)	2023-07-04 15:45:20 -07:00
Anselm Coogan	a22aad7d32	Use generators instead of lists in `any`s and `all`s (#1111) * Use generators in any(..) instead of lists for better best-case * Use generators in all(...) instead of lists * enable R1729 in .pylintrc * revert import sorting --------- Co-authored-by: Anselm Coogan <anselm@scandit.com>	2023-07-03 16:06:06 -07:00
Roelof van Dijk	542b2d93a5	Perf/cache string ops (#1078) * perf: remove extra function, include in cached getitem * perf: only calculate hash once per node --------- Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-06-29 13:23:11 -07:00
George Hotz	d16c16ec28	new upcast works (#1066) * new upcast works * float4 try * fix unaligned float4 * disallow unaligned access * upcast dim * maybe good now * fix gpu half * vstore_half4 * fix deep image bugs * improve symbolic to fix issues * fix symbolic * cl test * this maybe * gcd of 1 is 1 * real fix for old python * improve fuzzer	2023-06-27 19:34:53 -07:00
George Hotz	c8d87eb8d4	strip whitespace	2023-06-27 10:11:43 -07:00
Roelof van Dijk	c604ef4beb	symbolic.py: faster Node.sum, faster SumNode.div (#1014) * refactor: replace isinstance with class check where possible * refactor: faster partition * fix; flake8 * feat: rework node.sum, correct list typing * fix: typo * feat: refactor sum * fix: pylint * refactor: simpler sum and factorize * feat; clean up sumnode div, all cpu tests pass * feat: simplify floordiv, cache factorization * don't factor numnodes at all * python 3.8 functools does not yet have @cache * fix: restore assert * refactor, fix failing tests * fix: address review comments * feat: rework, add specialization, remove cache * fix: remove specialization * feat: no tuple conversion, faster loop --------- Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-06-26 09:47:17 -07:00
George Hotz	ba4eadb04c	PTX assembly support (#977) * ptx assembly * all ops tests pass * fix tests	2023-06-13 12:31:42 -07:00
George Hotz	c62c64f0b7	remove GeNode (#965)	2023-06-09 21:48:56 -07:00
Rayan Hatout	8b2c2d6896	Optimizations in `symbolic.py` (#796) * optimizations in symbolic.py * fix infinite recursion when expanding sums * add test case to make sure NumNodes are hoisted up in cases where MulNodes cancel eachother out	2023-05-26 12:59:53 -07:00
George Hotz	8b7ecd63bb	Remove Zeroview (#748) * no zeroview start * closer * stride mask * st tests pass, delete ZeroView * byebye zv * close to working * not contiguous with mask * subtract, don't add * mask on view * ugh, that shouldn't have been in there * shape merge * bugfixes * fuzzer + 4 fuzzer failures * fuzzer for symbolic * more fuzzing and nothing * that fuzzer doesn't hit either * fixes padding...ugh * no more offsets * working * rewrite load and store * all checks * fix idxs * progress * bugfix * float4_axis * works * cleanups * complex valids_okay	2023-04-17 08:21:46 -07:00
George Hotz	5495c7d64e	linearizer! (#714) * linearizer outputs something * working ish * cstyle codegen * clang mostly works * fix load valid * fix numberless loop * fancy gen * working * fix enet compiler * cleanups * float4 upcasting * less lines * supports_float4 * constant folding * mulacc * internet tests flaky in CI * 90% image support * fix image generic * bugs exposed with shapetracker and single view * new llvm * use vload, remove OLD * that's really poorly done * ending up being more lines	2023-03-19 23:43:49 -07:00
George Hotz	f5467cfedc	Devicebufferless (#708) * runs one metal kernel * conv2d works * ops tests are passing * const folding * all ops work * pre commit always passes * torch works * working still * fix graph test * tests passing * image almost works * image conv works * most images * fix custom * fix assignment * fix compile enet * clean up comments * fix realize return value * include shapetracker in LB repr * copy should make a copy * reenable method cache * fix lna * dtypes in graph * forward only for IMAGE=2 * simple realize * getting close * fixup new api, it's good except the kernel count * back to 197 kernels * tests should pass * go to a real float * no type_on_cpu * fix the docs * put shapetracker back in it's proper place	2023-03-18 14:40:23 -07:00
George Hotz	c594a0a835	fix flip bug, add new unit tests	2023-03-12 23:55:31 -07:00
Cyril Roumégous	3f08613a2a	apply flake8 E203 rule (#684)	2023-03-11 11:35:16 -08:00
George Hotz	f3ac52aee8	Mypyc (#680) * building shapetracker * default ENABLE_METHOD_CACHE * symbolic compiles * improve types * tensor compiles * oops, that's a bug * best of both worlds * find legit typing bugs * pad2d can take list or tuple * sub 200ms when compiled	2023-03-11 07:33:30 -08:00
George Hotz	22905dd657	speedups from llama branch	2023-03-10 22:01:32 -08:00
George Hotz	fb5ee9260f	add pad tests to shapetracker	2023-03-09 12:51:18 -08:00
George Hotz	382f346523	clean up opt (#649) * clean up opt * don't let global kernels get too small * 8192 -> 1024 * disable local shape for clang * fix can_merge * unroll the 5x5 depthwise convs in op * load float4 check	2023-03-05 20:49:36 -08:00
Cyril Roumégous	c10131ddf5	reduce number of lines (#645)	2023-03-05 15:42:32 -08:00
George Hotz	b5b4edf59b	comments	2023-03-03 22:39:31 -08:00
George Hotz	cfb050e2d1	simple modrange, thanks Jacky	2023-03-03 22:37:04 -08:00
George Hotz	7a1d96fd76	No negative (#632) * behavior is correct without VALIDHACKS * simple div and mod * fix tests * no negative variables * alt form is correct * still correct * bug in mulnode * at least validhacks works now * cleanups * test validhacks, and to_image_idx * cache compare key * tests and __neg__	2023-03-03 16:48:14 -08:00
George Hotz	b9ce20c374	openpilot test wasn't running, factor out image idx	2023-03-03 07:41:53 -08:00
George Hotz	3915c89fb6	symbolic improvements (#629) * fixups * shorter diff * wow, okay removing that had side effects * more numeric tests * MIN MAX tests	2023-03-02 19:50:38 -08:00
George Hotz	28f52f7c24	improve symbolic	2023-02-28 16:21:58 -08:00
George Hotz	1702a5779f	remove hacks from can_merge	2023-02-28 15:30:20 -08:00
George Hotz	e21df1701b	distribute + refactor merge_views	2023-02-28 14:57:56 -08:00
George Hotz	8478a61cdb	simplify in shapetracker	2023-02-28 00:35:26 -08:00
George Hotz	f3386c7f09	improve symbolic, hlop conv output is simple now	2023-02-24 22:20:40 -08:00
George Hotz	446442dbb3	fix tests symbolic	2023-02-11 15:16:47 -08:00
George Hotz	7a7046f264	sum_combine_num	2023-02-11 14:48:31 -08:00
George Hotz	87a7717222	LLVM backend uses shapetracker	2023-02-10 13:53:33 -06:00
George Hotz	c3cf17c6d0	Symbolic render (#550) * render symbolic * valid * fix shapetracker tests * render_python is the default * expr is gone * remove legacy behavior	2023-02-10 13:22:26 -06:00
George Hotz	aebe75d9a2	remove val expansion (#539) * remove val expansion * types for all shapetracker functions: * more typing * add all the parens to the test * more types * fix tests * very minor speedup	2023-02-07 15:14:05 -06:00
George Hotz	c073271f20	more symbolic correctness	2023-02-07 00:03:14 -06:00
George Hotz	e961fd3a04	more symbolic test, ModNode is wrong	2023-02-06 23:43:21 -06:00
George Hotz	8cfeb118d6	symbolic new test	2023-02-06 23:27:26 -06:00
George Hotz	7c5a5ecdac	even simpler symbolic	2023-02-06 22:47:00 -06:00
George Hotz	8b05de1841	symbolic cleanups	2023-02-06 22:12:11 -06:00
Andrey	4977d6f225	using tuples in isinstance (#534)	2023-02-06 14:40:26 -06:00
George Hotz	b1dec64815	new types and fixup ShapeTracker type mismatches	2023-01-25 19:39:36 -08:00
George Hotz	708215d06b	Typing (#468) * we typing * types look good in theory * most tests pass * gpu tests pass * TEST_AST * delete comments * i must have written that bug so many times * bugfix * don't merge the small ones * add f to constants * commits from reduce * don't GCD the mod nodes * broken and a hack IMAGE=3 * group for reduce * fix linter + mypy * move out test ast * insource TENSOR_TYPE_TO_NP_TYPE * does this fix it? * move imports out	2023-01-21 09:09:22 -08:00
George Hotz	0881d504c1	move shapetracker (#466) * move shapetracker * shapetracker test * move ast * move a few things * fix print kernel * fix test * symbolic fixups	2023-01-19 09:56:31 -08:00

45 Commits (8f2e2f5ee2c9fe8dd6d7d3b64375484d415a5b0d)