Update README.md
README.md CHANGED

```diff
@@ -2,9 +2,9 @@
 language:
 - en
 license: apache-2.0
-library_name:
+library_name: exllamav2
 base_model:
-
+- huihui-ai/Homunculus-abliterated
 tags:
 - distillation
 - /think
@@ -14,25 +14,26 @@ tags:
 - chat
 - abliterated
 - uncensored
-extra_gated_prompt: >-
-  **Usage Warnings**
-
-
-  “**Risk of Sensitive or Controversial Outputs**”: This model’s safety filtering has been significantly reduced, potentially generating sensitive, controversial, or inappropriate content. Users should exercise caution and rigorously review generated outputs.
-
-  “**Not Suitable for All Audiences**”: Due to limited content filtering, the model’s outputs may be inappropriate for public settings, underage users, or applications requiring high security.
-
-  “**Legal and Ethical Responsibilities**”: Users must ensure their usage complies with local laws and ethical standards. Generated content may carry legal or ethical risks, and users are solely responsible for any consequences.
-
-  “**Research and Experimental Use**”: It is recommended to use this model for research, testing, or controlled environments, avoiding direct use in production or public-facing commercial applications.
-
-  “**Monitoring and Review Recommendations**”: Users are strongly advised to monitor model outputs in real time and conduct manual reviews when necessary to prevent the dissemination of inappropriate content.
-
-  “**No Default Safety Guarantees**”: Unlike standard models, this model has not undergone rigorous safety optimization. huihui.ai bears no responsibility for any consequences arising from its use.
-
-
 ---
-
+# Homunculus-abliterated-exl2
+Original model: [Homunculus-abliterated](https://huggingface.co/huihui-ai/Homunculus-abliterated) by [huihui.ai](https://huggingface.co/huihui-ai)
+Based on: [Homunculus](https://huggingface.co/arcee-ai/Homunculus) by [Arcee AI](https://huggingface.co/arcee-ai)
+Foundation model: [Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407) by [Mistral AI](https://huggingface.co/mistralai) with data and tokenizer from [Qwen3-235B-A22B](https://huggingface.co/Qwen/Qwen3-235B-A22B) by [Qwen](https://huggingface.co/Qwen)
+
+## Quants
+[4bpw h6 (main)](https://huggingface.co/cgus/Homunculus-abliterated-exl2/tree/main)
+[4.5bpw h6](https://huggingface.co/cgus/Homunculus-abliterated-exl2/tree/4.5bpw-h6)
+[5bpw h6](https://huggingface.co/cgus/Homunculus-abliterated-exl2/tree/5bpw-h6)
+[6bpw h6](https://huggingface.co/cgus/Homunculus-abliterated-exl2/tree/6bpw-h6)
+[8bpw h8](https://huggingface.co/cgus/Homunculus-abliterated-exl2/tree/8bpw-h8)
+
+## Quantization notes
+Made with Exllamav2 0.3.1 with the default dataset.
+These quants can be used with an RTX GPU on Windows, or an RTX/ROCm GPU on Linux, with TabbyAPI or Text-Generation-WebUI.
+Exllamav2 quants must fit entirely in your GPU's VRAM to be usable and to maintain maximum performance.
+For example, I run Mistral-Nemo-12B models as a 6bpw quant with 16k context (Q6 cache) on an RTX 3060 12 GB, or at 6bpw with 32k context (Q8 cache) on an RTX 4060 Ti 16 GB.
+
+# Original model card
 # huihui-ai/Homunculus-abliterated
 This is an uncensored version of [arcee-ai/Homunculus](https://huggingface.co/arcee-ai/Homunculus) created with abliteration (see [remove-refusals-with-transformers](https://github.com/Sumandora/remove-refusals-with-transformers) to learn more about it).
 This is a crude, proof-of-concept implementation of removing refusals from an LLM without using TransformerLens.
```
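The "must fit entirely in VRAM" note can be sanity-checked with a back-of-envelope estimate. The sketch below is my own illustration, not part of the card: it approximates the quantized weight footprint as parameter count × bits per weight / 8, and deliberately ignores the KV cache and runtime overhead, which is why headroom beyond the weight size is needed for the quoted context lengths.

```python
def exl2_weight_gb(n_params: float, bpw: float) -> float:
    """Approximate EXL2 quantized weight size in GB (decimal),
    ignoring KV cache, activations, and framework overhead."""
    return n_params * bpw / 8 / 1e9

# A ~12B-parameter model (e.g. Mistral-Nemo-12B) at 6 bits per weight:
print(round(exl2_weight_gb(12e9, 6.0), 1))  # 9.0
```

About 9 GB of weights on a 12 GB card leaves roughly 3 GB for the 16k Q6 cache and overhead, which is consistent with the RTX 3060 example above.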
|