
feat: reword (#2549)

pull/2332/head
wozeparrot 2023-12-01 13:56:18 -05:00 committed by GitHub
parent 4c984bba7e
commit 28183c7438
1 changed file with 3 additions and 3 deletions


@@ -128,7 +128,7 @@ Training neural networks in tinygrad is super simple.
All we need to do is define our neural network, define our loss function, and then call `.backward()` on the loss function to compute the gradients.
They can then be used to update the parameters of our neural network using one of the many optimizers in [optim.py](/tinygrad/nn/optim.py).
-For our loss function we will be using sparse categorical cross entropy loss.
+For our loss function we will be using sparse categorical cross entropy loss. The implementation below is taken from [tensor.py](/tinygrad/tensor.py); it's copied here to highlight an important detail of tinygrad.
```python
def sparse_categorical_crossentropy(self, Y, ignore_index=-1) -> Tensor:
@@ -138,9 +138,9 @@ def sparse_categorical_crossentropy(self, Y, ignore_index=-1) -> Tensor:
  return self.log_softmax().mul(y).sum() / loss_mask.sum()
```
-As we can see in this implementation of cross entropy loss, there are certain operations that tinygrad does not support.
+As we can see in this implementation of cross entropy loss, there are certain operations that tinygrad does not support natively.
Namely, operations that are load/store or assigning a value to a tensor at a certain index.
-Load/store ops are not supported in tinygrad because they add complexity when trying to port to different backends and 90% of the models out there don't use/need them.
+Load/store ops are not natively supported in tinygrad because they add complexity when trying to port to different backends, 90% of the models out there don't use/need them, and they can be implemented, as done above, with an `arange` mask.
For our optimizer we will be using the traditional stochastic gradient descent optimizer with a learning rate of 3e-4.
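
For readers landing on this diff, a quick illustration of the `arange` mask trick the reworded sentence refers to: the snippet below is not the tensor.py implementation (whose middle is elided by the hunk above) but a minimal sketch, assuming only common tinygrad tensor ops (`Tensor.arange`, `reshape`, broadcasted `==`, `float`, `log_softmax`, `sum`); the helper name `sparse_xent_sketch` is hypothetical. It shows how a sparse categorical cross entropy can be written with no load/store or indexed-assignment ops.

```python
from tinygrad.tensor import Tensor

# Hypothetical sketch of the arange-mask idea, not the tensor.py implementation.
def sparse_xent_sketch(logits: Tensor, labels: Tensor) -> Tensor:
  # logits: (batch, num_classes) raw scores; labels: (batch,) integer class indices
  num_classes = logits.shape[-1]
  # counter over the class dimension, shape (1, num_classes)
  counter = Tensor.arange(num_classes).reshape(1, num_classes)
  # broadcasted comparison builds a one-hot mask of shape (batch, num_classes),
  # standing in for "assign 1.0 at index label", which would need a store op
  onehot = (counter == labels.reshape(-1, 1)).float()
  # cross entropy: mean over the batch of -log p(correct class)
  return -(logits.log_softmax() * onehot).sum() / labels.shape[0]
```

Because the mask comes out of pure elementwise and reduce ops, the same expression lowers to any tinygrad backend, which is the portability argument the reworded sentence makes; the quickstart then trains with plain stochastic gradient descent from [optim.py](/tinygrad/nn/optim.py) at a learning rate of 3e-4.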