Scripts for models for use with Parrot. https://parrot.codes/

Parrot Models

Models for Parrot Libre AI IDE.

https://parrot.codes

Libre Models

Parrot is a Libre AI, so it needs a good free content ("open source") data model.

There are endless AI models, only some suitable for this application. Of the available models, a smaller subset claims to be "open source". Of the "open source" subset, only a much smaller subset actually uses an "open source" license. Of the "open source" subset that actually uses "open source" licenses, only a smaller subset uses exclusively "open source" training data.

AFAICT, there are exactly ZERO (0) AI instruct models that are actually "open source" and trained on "open source" datasets.

Parrot Model Licensing

Preferably, there is some truly libre instruct model suitable for Parrot. Likely, one will have to be created from scratch.

The model may use data that is under a license that appears on one of these three lists as an acceptable free/open license:

Models to Evaluate

Models that are freely available to download but may not have a suitable license. Determine which, if any, are OK.

  • DeepSeek
  • EleutherAI
  • EverythingLM
  • GPT-NeoX
  • Llemma
  • Mistral
  • Orca-mini
  • Pythia
  • StarCoder

Suitable Models

The following models may be suitable, but need further evaluation.

Suitable Licenses

Models that appear to have libre licenses.

Apache 2.0

The following models use the Apache 2.0 license:

  • Falcon 7B
  • Falcon 40B

CC-BY-SA

The following models use the CC-BY-SA license:

  • Open-Assistant StableLM
  • SQLCoder
  • StableLM

Suitable Model, Unsuitable Dataset

The following have libre licenses, but are built from non-libre datasets.

  • Falcon 7B --- RefinedWeb (Common Crawl).
  • Falcon 40B --- RefinedWeb (Common Crawl).

Suitable Model, Suitable Dataset

None.
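The two-step check used in the sections above (license first, then training data) can be sketched in Python. This is only an illustration, not a script from this repository: the `Model` class, its field names, and the `classify` helper are hypothetical, and the dataset status of the CC-BY-SA models is left as unknown (`None`) because this README does not record it.

```python
from dataclasses import dataclass
from typing import Optional

# Licenses this README lists as suitable (the labels are illustrative).
SUITABLE_LICENSES = {"Apache-2.0", "CC-BY-SA"}


@dataclass(frozen=True)
class Model:
    name: str
    license: str
    # True = libre dataset, False = non-libre, None = not yet evaluated.
    dataset_libre: Optional[bool]


def classify(model: Model) -> str:
    """Return a category for a model (hypothetical helper)."""
    if model.license not in SUITABLE_LICENSES:
        return "unsuitable license"
    if model.dataset_libre is None:
        return "suitable license, dataset not evaluated"
    if not model.dataset_libre:
        return "suitable model, unsuitable dataset"
    return "suitable model, suitable dataset"


MODELS = [
    Model("Falcon 7B", "Apache-2.0", False),   # RefinedWeb (Common Crawl)
    Model("Falcon 40B", "Apache-2.0", False),  # RefinedWeb (Common Crawl)
    Model("StableLM", "CC-BY-SA", None),
    Model("SQLCoder", "CC-BY-SA", None),
    Model("Llama 2", "Llama 2 Community License Agreement", None),
]

# Names of models that pass both the license and the dataset check.
fully_libre = [
    m.name for m in MODELS if classify(m) == "suitable model, suitable dataset"
]

if __name__ == "__main__":
    for m in MODELS:
        print(f"{m.name}: {classify(m)}")
    print("Fully libre:", fully_libre or "None")
```

With the entries above, no model reaches the last category, which matches the "None." in this section.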

Unsuitable Licenses

Licenses that are not free, libre, or open, even if they claim to be "open source".

These are not, for example, "Wikimedia Commons compatible":

  • BigCode OpenRAIL-M.
  • Creative Commons Non-commercial (NC).
  • Llama 2 Community License Agreement.
  • OpenRAIL.
  • OpenRAIL++.
  • StableCode Research License.
  • Falcon 180B TII License.
  • Proprietary licenses.
  • Any "custom" license that hasn't been reviewed by the general community.
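As a rough sketch, the named licenses in this list could be checked with a simple lookup. The function name `is_known_unsuitable` and the string matching are assumptions made for illustration. Proprietary and unreviewed "custom" licenses cannot be detected by name, so a `False` result only means "not on the list", not "suitable".

```python
# Named licenses from the list above, lowercased for comparison.
UNSUITABLE_LICENSES = {
    "bigcode openrail-m",
    "llama 2 community license agreement",
    "openrail",
    "openrail++",
    "stablecode research license",
    "falcon 180b tii license",
}


def is_known_unsuitable(license_name: str) -> bool:
    """Return True if the license is on the unsuitable list (hypothetical helper)."""
    name = license_name.strip().lower()
    if name in UNSUITABLE_LICENSES:
        return True
    # Creative Commons Non-commercial variants, e.g. "CC-BY-NC-4.0".
    tokens = name.replace("-", " ").split()
    return "cc" in tokens and "nc" in tokens


if __name__ == "__main__":
    for lic in ["OpenRAIL++", "CC-BY-NC-4.0", "CC-BY-SA", "Apache 2.0"]:
        print(lic, "->", is_known_unsuitable(lic))
```

Anything that returns `False` here would still need a manual review before being treated as libre.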

Unsuitable Models

Models that are not free, libre, or open, even if they claim to be "open source".

Unsuitable Model License

The following models are unsuitable due to using an unsuitable license.

Llama 2

The following are unsuitable due to the Llama 2 license:

  • CodeLlama
  • Llama 2
  • NexusRaven
  • Phind-CodeLlama
  • WizardCoder
  • Vicuna
  • WizardLM
  • Wizard-Math

OpenRAIL

The following are unsuitable due to the OpenRAIL license:

  • CodeUp-Llama
  • StarCoder
  • WizardCoder

Falcon 180B TII License

The following are unsuitable due to the Falcon 180B TII license:

  • Falcon-180B

Proprietary

The following are unsuitable due to proprietary licenses:

  • Claude
  • GPT
  • PaLM
  • Qwen

Model Table

See models.ods LibreOffice spreadsheet.

License

Creative Commons Attribution-ShareAlike 4.0 International

Copyright © 2023, Jeff Moe.