LLM Quantization: GPTQ - AutoGPTQ
llama.cpp - ggml.c - GGUF - C++
Comparison to HF transformers with 4-bit quantization.
Download Web UI wrappers for your heavily quantized LLM to your local machine (PC, Linux, Apple).
LLMs on Apple hardware with an M1, M2, or M3 chip.
Run inference with your LLMs on your local PC, with heavy quantization applied.
Plus: 8 Web UIs for GPTQ, llama.cpp, AutoGPTQ, ExLlama, or GGUF.
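To see why 4-bit quantization matters for running LLMs locally, here is a rough back-of-the-envelope memory comparison between a standard fp16 load in HF transformers and a ~4-bit quantized load. The parameter count and the ~4.5 effective bits per weight (quantized values plus per-block scales) are illustrative assumptions, not measurements of any specific model:

```python
def model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight-storage size in GiB for a dense model."""
    return n_params * bits_per_weight / 8 / 1024**3

n_params = 7e9  # a Llama-7B-class model (assumed)

fp16_gb = model_size_gb(n_params, 16)   # standard HF transformers fp16 load
q4_gb   = model_size_gb(n_params, 4.5)  # ~4.5 bits/weight incl. scales (assumed)

print(f"fp16: {fp16_gb:.1f} GiB, 4-bit: {q4_gb:.1f} GiB")
```

Under these assumptions, the quantized weights fit in under 4 GiB instead of roughly 13 GiB, which is the difference between fitting and not fitting on a typical consumer GPU or an M-series Mac.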
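The "heavy quantization" these tools apply can be sketched with a toy symmetric 4-bit block quantizer, loosely in the spirit of llama.cpp's Q4_0 format. The block size of 32 matches Q4_0, but the rounding scheme and layout here are simplified assumptions, not the exact GGUF on-disk format:

```python
import numpy as np

BLOCK = 32  # weights per block (Q4_0 also groups 32 weights per scale)

def quantize_q4(x: np.ndarray):
    """Quantize a 1-D float array to signed 4-bit ints, one scale per block."""
    x = x.reshape(-1, BLOCK)
    scale = np.abs(x).max(axis=1, keepdims=True) / 7.0  # map max magnitude to +/-7
    scale[scale == 0] = 1.0                             # avoid division by zero
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_q4(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Reconstruct approximate float weights from 4-bit ints and scales."""
    return (q.astype(np.float32) * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(128).astype(np.float32)  # toy weight tensor (assumed)
q, s = quantize_q4(w)
w_hat = dequantize_q4(q, s)
err = np.abs(w - w_hat).max()
print(f"max reconstruction error: {err:.3f}")
```

Each weight is stored in 4 bits plus a shared per-block scale, which is where the roughly 4x size reduction over fp16 comes from; the reconstruction error stays small because the scale adapts to each block's magnitude.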
koboldcpp
oobabooga text-generation-webui
ctransformers
lmstudio.ai/
github.com/marella/ctransformers
github.com/ggerganov/ggml
github.com/rustformers/llm/bl...
huggingface.co/TheBloke/Llama...
github.com/PanQiWei/AutoGPTQ
cloud.google.com/model-garden
huggingface.co/autotrain
h2o.ai/platform/ai-cloud/make...
#quantization
#ai
#webui