9 Commits (e6f19d4ce2a072d3dc3ba95c80c21c552f1753c8)

Author SHA1 Message Date
George Hotz 2844482a60
Mypy fun (#541)
* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup
2023-02-08 09:56:51 -06:00
George Hotz 682dc64430 works at work 2022-09-06 08:06:11 -07:00
George Hotz 121d5a17ee use tinynn for Conv2d 2021-10-30 19:40:44 -07:00
Skosh 78aa147b39
[WIP] YOLO working on tinygrad! (#245)
* Some progress on yolov3

* Removed some debugging comments… Also, the forward pass eats all RAM for some reason

* forward pass almost runs

* forward pass runs almost

* forward pass runs, now we gotta load the weights

* loading weights works

* fetches config and weights

* everything kind of works, postprocessing of output still needs to be implemented, temp_process_results kind of works, but it's kind of terrible, and not how things should be done

* some changes

* fixed some bugs in the forward pass and load_weights function, now outputs more correct values, however some values are still loaded incorrectly

* Something is wrong with the forward pass, Conv2d tests added

* forward pass almost outputs correct values, gotta fix one more thing

* yolo works

* some final changes

* reverting changes

* removed dataloader

* fixed some indentation

* comment out failing test, somehow it fails CI even though it passes on my computer…

* fixed wrong probabilities

* added webcam option to YOLO, now just need to add bounding boxes and speed it up

* some progress towards adding bounding boxes

* trying to speed up yolo layer on GPU, still faster on CPU but with 30GB ram usage

* Faster inference times, bounding boxes added correctly, webcam works, but is slow, and there is a memory leak when running on CPU... Also added tinygrad's output on the classic dog image

* removed some debugging print statements

* updated result image

* something weird is going on, mean op on GPU tensor randomly faults, copying a tensor from GPU->CPU takes 10+ seconds…
2021-04-25 18:06:52 -07:00
George Hotz 1dcaecacc4
Support for Apple Neural Engine (#130)
* ane query is success

* cite and build instructions

* low level access, need to disable AMFI

* coreml_ane works

* coreml fun

* more work

* compiled example

* progress

* compiler works

* model flow

* TODOs in the readme

* put some real weights in

* we are learning objc

* much progress i think

* signed model still doesn't work

* working example

* there are float16

* clean up: part 1

* h11ane header, more cleanup

* cleanup DeviceController creation

* remove the stupid sleep

* notes

* start a hwx parser

* no tabs

* compare stuff

* hmm, why don't inputs work

* cache doesn't seem to fix it

* hmm, the issue was the compiler

* fix the compiler, guess i didn't put in weights

* logging for compiler

* uselessness in plist

* remove hwx before compile, weights are converted to float16

* better compare

* better compare

* last line in compare

* opcodes from compiler

* notes
2020-12-03 10:32:26 -08:00
George Hotz 94d44c97bf add pad2d on GPU 2020-11-07 10:46:36 -08:00
Rene Delgado cd54697fd8
fix gpu sum forward (#61)
* ignore venv

* add sum test

* fix sum forward
2020-11-05 21:59:16 -08:00
Göktuğ Karakaşlı cc9bd45b44 add setup.py and change imports to relative 2020-10-26 18:19:50 +03:00
George Hotz 1bb2583500 start tinygrad 2020-10-17 22:57:01 -07:00