tinygrab

Author	SHA1	Message	Date
George Hotz	2844482a60	Mypy fun (#541) * mypy fun * things are just faster * running fast * mypy is fast * compile.sh * no gpu hack * refactor ops_cpu and ops_torch to not subclass * make weak buffer work * tensor works * fix test failing * cpu/torch cleanups * no or operator on dict in python 3.8 * that was junk * fix warnings * comment and touchup	2023-02-08 09:56:51 -06:00
George Hotz	682dc64430	works at work	2022-09-06 08:06:11 -07:00
George Hotz	121d5a17ee	use tinynn for Conv2d	2021-10-30 19:40:44 -07:00
Skosh	78aa147b39	[WIP] YOLO working on tinygrad! (#245) * Some progress on yolov3 * Removed some debugging comments… Also, the forward pass eats all RAM for some reason * forward pass almost runs * forward pass runs almost * forward pass runs, now we gotta load the weights * loading weights works * fetches config and weights * everything kind of works, postprocessing of output still needs to be implemented, temp_process_results kind of works, but its kind of terrible, and not how things should be done * some changes * fixed some bugs in the forward pass and load_weights function, now outputs more correct values, however some values are still loaded incorrectly * Something is wrong with the forward pass, Conv2d tests added * forward pass almost outputs correct values, gotta fix one more thign * yolo works * some final changes * reverting changes * removed dataloader * fixed some indentation * comment out failing test, somehow it fails CI even though it passes on my computer… * fixed wrong probabilities * added webcam option to YOLO, now just need to add bounding boxes and speed it up * some progress towards adding bounding boxes * trying to speed up yolo layer on GPU, still faster on CPU but with 30GB ram usage * Faster inference times, bounding boxes added correctly, webcam works, but is slow, and there is a memory leak when running on CPU... Also added tinygrads output on the classic dog image * removed some debugging print statements * updated result image * something weird is going on, mean op on GPU tensor randomly faults, copying a tensor from GPU->CPU takes 10+ seconds…	2021-04-25 18:06:52 -07:00
George Hotz	1dcaecacc4	Support for Apple Neural Engine (#130) * ane query is success * cite and build instructions * low level access, need to disable AMFI * coreml_ane works * coreml fun * more work * compiled example * progress * compiler works * model flow * TODOs in the readme * put some real weights in * we are learning objc * much progress i think * signed model still doesn't work * working example * there are float16 * clean up: part 1 * h11ane header, more cleanup * cleanup DeviceController creation * remove the stupid sleep * notes * start a hwx parser * no tabs * compare stuff * hmm, why don't inputs work * cache doesn't seem to fix it * hmm, the issue was the compiler * fix the compiler, guess i didn't put in weights * logging for compiler * uselessness in plist * remove hwx before compile, weights are converted to float16 * better compare * better compare * last line in comparE * opcodes from compiler * notes	2020-12-03 10:32:26 -08:00
George Hotz	94d44c97bf	add pad2d on GPU	2020-11-07 10:46:36 -08:00
Rene Delgado	cd54697fd8	fix gpu sum forward (#61) * ignore venv * add sum test * fix sum forward	2020-11-05 21:59:16 -08:00
Göktuğ Karakaşlı	cc9bd45b44	add setup.py and change imports to relative	2020-10-26 18:19:50 +03:00
George Hotz	1bb2583500	start tinygrad	2020-10-17 22:57:01 -07:00

9 Commits (e6f19d4ce2a072d3dc3ba95c80c21c552f1753c8)