overview for Blaed

42

CodeLlama-34B - the First Open-Source Model Beating GPT-4 on HumanEvals (lemmy.world)

submitted 1 year ago by Blaed@lemmy.world to c/worldnews@lemmy.ml

0 comments fedilink

cross-posted from: https://lemmy.world/post/3879861

Beating GPT-4 on HumanEval with a Fine-Tuned CodeLlama-34B

Hello everyone! This post marks an exciting moment for !fosai@lemmy.world and everyone in the open-source large language model and AI community.

We appear to have a new contender on the block, a model apparently capable of surpassing OpenAI's state-of-the-art GPT-4 in coding evals (evaluations).

This is huge. Not too long ago I made an offhand comment on us catching up to GPT-4 within a year. I did not expect that prediction to end up being reality in half the time. Let's hope this isn't a one-off scenario and that we see a new wave of open-source models that begin to challenge OpenAI.

Buckle up, it's going to get interesting!

Here are some notes from the blog post, which you should visit and read in its entirety:

https://www.phind.com/blog/code-llama-beats-gpt4

Blog Post

We have fine-tuned CodeLlama-34B and CodeLlama-34B-Python on an internal Phind dataset that achieved 67.6% and 69.5% pass@1 on HumanEval, respectively. GPT-4 achieved 67% according to their official technical report in March. To ensure result validity, we applied OpenAI's decontamination methodology to our dataset.

The CodeLlama models released yesterday demonstrate impressive performance on HumanEval.

CodeLlama-34B achieved 48.8% pass@1 on HumanEval

CodeLlama-34B-Python achieved 53.7% pass@1 on HumanEval
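For anyone wondering what pass@1 actually measures, here is a minimal sketch of the unbiased pass@k estimator from the original HumanEval paper. The blog post does not say how Phind ran their evaluation, so the function names and the (n, c) input format below are my own assumptions for illustration, not their harness.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem: 1 - C(n - c, k) / C(n, k).

    n = samples generated, c = samples that passed the unit tests,
    k = number of attempts the metric allows.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k draw contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-problem estimate over (n, c) pairs, one per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k = 1 the estimate reduces to c / n per problem, so a pass@1 score is simply the average fraction of generated solutions that pass the tests.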

We have fine-tuned both models on a proprietary dataset of ~80k high-quality programming problems and solutions. Instead of code completion examples, this dataset features instruction-answer pairs, setting it apart structurally from HumanEval. We trained the Phind models over two epochs, for a total of ~160k examples. LoRA was not used — both models underwent a native fine-tuning. We employed DeepSpeed ZeRO 3 and Flash Attention 2 to train these models in three hours using 32 A100-80GB GPUs, with a sequence length of 4096 tokens.

Furthermore, we applied OpenAI's decontamination methodology to our dataset to ensure valid results, and found no contaminated examples.

The methodology is:

For each evaluation example, we randomly sampled three substrings of 50 characters or used the entire example if it was fewer than 50 characters.

A match was identified if any sampled substring was a substring of the processed training example.

For further insights on the decontamination methodology, please refer to Appendix C of OpenAI's technical report. Presented below are the pass@1 scores we achieved with our fine-tuned models:

Phind-CodeLlama-34B-v1 achieved 67.6% pass@1 on HumanEval

Phind-CodeLlama-34B-Python-v1 achieved 69.5% pass@1 on HumanEval

Download

We are releasing both models on Huggingface for verifiability and to bolster the open-source community. We welcome independent verification of results.

https://huggingface.co/Phind/Phind-CodeLlama-34B-v1

https://huggingface.co/Phind/Phind-CodeLlama-34B-Python-v1

If you get a chance to try either of these models out, let us know how it goes in the comments below!

If you found anything about this post interesting, consider subscribing to !fosai@lemmy.world.

Cheers to the power of open-source! May we continue the fight for optimization, efficiency, and performance.

2

Free Open-Source AI LLM Guide (lemmy.world)

submitted 1 year ago by Blaed@lemmy.world to c/selfhosted@lemmy.world

0 comments fedilink

cross-posted from: https://lemmy.world/post/2219010

Hello everyone!

We have officially hit 1,000 subscribers! How exciting!! Thank you for being a member of !fosai@lemmy.world. Whether you're a casual passerby, a hobby technologist, or an up-and-coming AI developer - I sincerely appreciate your interest and support in a future that is free and open for all.

It can be hard to keep up with the rapid developments in AI, so I have decided to pin this at the top of our community to be a frequently updated LLM-specific resource hub and model index for all of your adventures in FOSAI.

The ultimate goal of this guide is to become a gateway resource for anyone looking to get into free open-source AI (particularly text-based large language models). I will be doing a similar guide for image-based diffusion models soon!

In the meantime, I hope you find what you're looking for! Let me know in the comments if there is something I missed so that I can add it to the guide for everyone else to see.

Getting Started With Free Open-Source AI

Have no idea where to begin with AI / LLMs? Try starting with our Lemmy Crash Course for Free Open-Source AI.

When you're ready to explore more resources see our FOSAI Nexus - a hub for all of the major FOSS & FOSAI on the cutting/bleeding edges of technology.

If you're looking to jump right in, I recommend downloading oobabooga's text-generation-webui and installing one of the LLMs from TheBloke below.

Try both GGML and GPTQ variants to see which model type you prefer. See the hardware tables below to get a better idea of which parameter size you might be able to run (3B, 7B, 13B, 30B, 70B).

8-bit System Requirements

Model	VRAM Used	Minimum Total VRAM	Card Examples	RAM/Swap to Load*
LLaMA-7B	9.2GB	10GB	3060 12GB, 3080 10GB	24 GB
LLaMA-13B	16.3GB	20GB	3090, 3090 Ti, 4090	32 GB
LLaMA-30B	36GB	40GB	A6000 48GB, A100 40GB	64 GB
LLaMA-65B	74GB	80GB	A100 80GB	128 GB

4-bit System Requirements

Model	Minimum Total VRAM	Card Examples	RAM/Swap to Load*
LLaMA-7B	6GB	GTX 1660, 2060, AMD 5700 XT, RTX 3050, 3060	6 GB
LLaMA-13B	10GB	AMD 6900 XT, RTX 2060 12GB, 3060 12GB, 3080, A2000	12 GB
LLaMA-30B	20GB	RTX 3080 20GB, A4500, A5000, 3090, 4090, 6000, Tesla V100	32 GB
LLaMA-65B	40GB	A100 40GB, 2x3090, 2x4090, A40, RTX A6000, 8000	64 GB

*System RAM (not VRAM) is used to initially load a model. You can use swap space if you do not have enough RAM to support your LLM.

When in doubt, try starting with 3B or 7B models and work your way up to 13B+.
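If you would rather not read the tables by eye, here is a rough sketch that looks up the largest LLaMA size whose "Minimum Total VRAM" fits your card. The numbers are copied from the 8-bit and 4-bit tables above; the dictionary and function names are made up for this example, and real-world usage will vary with context length and backend.

```python
from typing import Optional

# "Minimum Total VRAM" (GB) from the tables above, keyed by quantization bits.
MIN_VRAM_GB = {
    8: {"LLaMA-7B": 10, "LLaMA-13B": 20, "LLaMA-30B": 40, "LLaMA-65B": 80},
    4: {"LLaMA-7B": 6, "LLaMA-13B": 10, "LLaMA-30B": 20, "LLaMA-65B": 40},
}

def largest_model_that_fits(vram_gb: float, bits: int) -> Optional[str]:
    """Return the biggest model whose minimum VRAM is <= vram_gb, else None."""
    candidates = [(need, name) for name, need in MIN_VRAM_GB[bits].items()
                  if need <= vram_gb]
    return max(candidates)[1] if candidates else None

if __name__ == "__main__":
    for card_vram in (8, 12, 24):
        print(card_vram, "GB:",
              "8-bit ->", largest_model_that_fits(card_vram, 8),
              "| 4-bit ->", largest_model_that_fits(card_vram, 4))
```

The takeaway matches the tables: dropping from 8-bit to 4-bit roughly halves the VRAM you need, which usually lets you step up one model size on the same card.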

FOSAI Resources

Fediverse / FOSAI

The Internet is Healing

FOSAI Welcome Message

FOSAI Crash Course

FOSAI Nexus Resource Hub

LLM Leaderboards

HF Open LLM Leaderboard

LMSYS Chatbot Arena

LLM Search Tools

LLM Explorer

Open LLMs

Large Language Model Hub

Download Models

oobabooga

text-generation-webui - a community-favorite Gradio web UI by oobabooga for running almost any free open-source large language model downloaded from HuggingFace, including (but not limited to) LLaMA, llama.cpp, GPT-J, Pythia, OPT, and many others. Its goal is to become the AUTOMATIC1111/stable-diffusion-webui of text generation. It is highly compatible with many formats.

Exllama

A standalone Python/C++/CUDA implementation of Llama for use with 4-bit GPTQ weights, designed to be fast and memory-efficient on modern GPUs.

gpt4all

Open-source assistant-style large language models that run locally on your CPU. GPT4All is an ecosystem to train and deploy powerful and customized large language models that run locally on consumer-grade processors.

TavernAI

The original project that SillyTavern was forked from. This chat interface offers very similar functionality but is less compatible with other chat clients and API interfaces than SillyTavern.

SillyTavern

Developer-friendly, Multi-API (KoboldAI/CPP, Horde, NovelAI, Ooba, OpenAI+proxies, Poe, WindowAI(Claude!)), Horde SD, System TTS, WorldInfo (lorebooks), customizable UI, auto-translate, and more prompt options than you'd ever want or need. Optional Extras server for more SD/TTS options + ChromaDB/Summarize. Based on a fork of TavernAI 1.2.8

Koboldcpp

A self contained distributable from Concedo that exposes llama.cpp function bindings, allowing it to be used via a simulated Kobold API endpoint. What does it mean? You get llama.cpp with a fancy UI, persistent stories, editing tools, save formats, memory, world info, author's note, characters, scenarios and everything Kobold and Kobold Lite have to offer. In a tiny package around 20 MB in size, excluding model weights.

KoboldAI-Client

This is a browser-based front-end for AI-assisted writing with multiple local & remote AI models. It offers the standard array of tools, including Memory, Author's Note, World Info, Save & Load, adjustable AI settings, formatting options, and the ability to import existing AI Dungeon adventures. You can also turn on Adventure mode and play the game like AI Dungeon Unleashed.

h2oGPT

h2oGPT is a large language model (LLM) fine-tuning framework and chatbot UI with document question-answer capabilities. Documents help to ground LLMs against hallucinations by providing them with context relevant to the instruction. h2oGPT is a fully permissive Apache V2 open-source project for 100% private and secure use of LLMs and document embeddings for document question answering.

Models

The Bloke

The Bloke is a developer who frequently releases quantized (GPTQ) and optimized (GGML) open-source, user-friendly versions of AI Large Language Models (LLMs).

These conversions of popular models can be configured and installed on personal (or professional) hardware, bringing bleeding-edge AI to the comfort of your home.

Support TheBloke here.

https://ko-fi.com/TheBlokeAI

70B

Llama-2-70B-chat-GPTQ

Llama-2-70B-Chat-GGML

Llama-2-70B-GPTQ

Llama-2-70B-GGML

llama-2-70b-Guanaco-QLoRA-GPTQ

30B

30B-Epsilon-GPTQ

13B

Llama-2-13B-chat-GPTQ

Llama-2-13B-chat-GGML

Llama-2-13B-GPTQ

Llama-2-13B-GGML

llama-2-13B-German-Assistant-v2-GPTQ

llama-2-13B-German-Assistant-v2-GGML

13B-Ouroboros-GGML

13B-Ouroboros-GPTQ

13B-BlueMethod-GGML

13B-BlueMethod-GPTQ

llama-2-13B-Guanaco-QLoRA-GGML

llama-2-13B-Guanaco-QLoRA-GPTQ

Dolphin-Llama-13B-GGML

Dolphin-Llama-13B-GPTQ

MythoLogic-13B-GGML

MythoBoros-13B-GPTQ

WizardLM-13B-V1.2-GPTQ

WizardLM-13B-V1.2-GGML

OpenAssistant-Llama2-13B-Orca-8K-3319-GGML

7B

Llama-2-7B-GPTQ

Llama-2-7B-GGML

Llama-2-7b-Chat-GPTQ

LLongMA-2-7B-GPTQ

llama-2-7B-Guanaco-QLoRA-GPTQ

llama-2-7B-Guanaco-QLoRA-GGML

llama2_7b_chat_uncensored-GPTQ

llama2_7b_chat_uncensored-GGML

More Models

Any of KoboldAI's Models

Luna-AI-Llama2-Uncensored-GPTQ

Nous-Hermes-Llama2-GGML

Nous-Hermes-Llama2-GPTQ

FreeWilly2-GPTQ

GL, HF!

Are you an LLM Developer? Looking for a shoutout or project showcase? Send me a message and I'd be more than happy to share your work and support links with the community.

If you haven't already, consider subscribing to the free open-source AI community at !fosai@lemmy.world where I will do my best to make sure you have access to free open-source artificial intelligence on the bleeding edge.

Thank you for reading!

6

Introducing Llama 2 - Meta's Next-Generation Commercially Viable Open-Source AI & LLM (lemmy.world)

submitted 1 year ago by Blaed@lemmy.world to c/worldnews@lemmy.ml

4 comments fedilink

Introducing Llama 2 - Meta's Next Generation Free Open-Source Artificially Intelligent Large Language Model

Llama 2

It's incredible it's already here! This is great news for everyone in free open-source artificial intelligence.

Llama 2 unleashes Meta's (previously) closed model (Llama) to become free open-source AI, accelerating access and development for large language models (LLMs).

This marks a significant step in machine learning and deep learning technologies. With this move, a widely supported LLM can become a viable choice for businesses, developers, and entrepreneurs to innovate our future using a model that the community has been eagerly awaiting since its initial leak earlier this year.

Here are some highlights from the official Meta AI announcement:

Llama 2

In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases.

Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.

Llama 2 pretrained models are trained on 2 trillion tokens and have double the context length of Llama 1. Its fine-tuned models have been trained on over 1 million human annotations.

Inside the Model

Technical details

With each model download you'll receive:

Model code
Model Weights
README (User Guide)
Responsible Use Guide
License
Acceptable Use Policy
Model Card

Benchmarks

Llama 2 outperforms other open source language models on many external benchmarks, including reasoning, coding, proficiency, and knowledge tests. It was pretrained on publicly available online data sources. The fine-tuned model, Llama-2-chat, leverages publicly available instruction datasets and over 1 million human annotations.

RLHF & Training

Llama-2-chat uses reinforcement learning from human feedback to ensure safety and helpfulness. Training Llama-2-chat: Llama 2 is pretrained using publicly available online data. An initial version of Llama-2-chat is then created through the use of supervised fine-tuning. Next, Llama-2-chat is iteratively refined using Reinforcement Learning from Human Feedback (RLHF), which includes rejection sampling and proximal policy optimization (PPO).
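To make the "rejection sampling" part of RLHF a bit more concrete, here is a toy best-of-N sketch: draw several candidate replies, score each with a reward function, and keep the highest-scoring one. The announcement does not describe Meta's implementation, so everything here (the function names, the canned generator, the word-count reward) is a stand-in for illustration; a real pipeline would use the chat model to generate and a trained reward model to score.

```python
import itertools
from typing import Callable, Tuple

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              reward: Callable[[str, str], float],
              n: int = 4) -> Tuple[str, float]:
    """Sample n responses and return the one with the highest reward."""
    if n < 1:
        raise ValueError("n must be at least 1")
    scored = [(reward(prompt, resp), resp)
              for resp in (generate(prompt) for _ in range(n))]
    best_score, best_response = max(scored, key=lambda pair: pair[0])
    return best_response, best_score

def make_cycling_generator(replies: list[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a language model: cycles canned replies."""
    pool = itertools.cycle(replies)
    return lambda prompt: next(pool)

def toy_reward(prompt: str, response: str) -> float:
    """Hypothetical stand-in for a reward model: counts distinct words."""
    return float(len(set(response.split())))
```

In a training loop, the winning responses would then be used as fine-tuning targets, so the model gradually shifts toward outputs the reward function prefers.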

The License

Our model and weights are licensed for both researchers and commercial entities, upholding the principles of openness. Our mission is to empower individuals and industry through this opportunity, while fostering an environment of discovery and ethical AI advancements.

Partnerships

We have a broad range of supporters around the world who believe in our open approach to today’s AI — companies that have given early feedback and are excited to build with Llama 2, cloud providers that will include the model as part of their offering to customers, researchers committed to doing research with the model, and people across tech, academia, and policy who see the benefits of Llama and an open platform as we do.

The/CUT

With the release of Llama 2, Meta has opened up new possibilities for the development and application of large language models. This free open-source AI not only accelerates access but also allows for greater innovation in the field.

Take Three:

Video Game Analogy: Just like getting a powerful, rare (or previously banned) item drop in a game, Llama 2's release gives developers a powerful tool they can use and customize for their unique quests in the world of AI.
Cooking Analogy: Imagine if a world-class chef decided to share their secret recipe with everyone. That's Llama 2, a secret recipe now open for all to use, adapt, and improve upon in the kitchen of AI development.
Construction Analogy: Llama 2 is like a top-grade construction tool now available to all builders. It opens up new possibilities for constructing advanced AI structures that were previously hard to achieve.

Links

Here are the key resources discussed in this post:

Want to get started with free open-source artificial intelligence, but don't know where to begin?

Try starting here:

If you found anything else about this post interesting - consider subscribing to !fosai@lemmy.world where I do my best to keep you in the know about the most important updates in free open-source artificial intelligence.

This particular announcement is exciting to me because it may popularize open-source principles and practices for other enterprises and corporations to follow.

We should see some interesting models emerge out of Llama 2. I for one am looking forward to seeing where this will take us next. Get ready for another wave of innovation! This one is going to be big.

2

Introducing LongLLaMA: Focused Transformer (FoT) Training for Ultra Long LLM Context Scaling (lemmy.world)

submitted 1 year ago by Blaed@lemmy.world to c/worldnews@lemmy.ml

0 comments fedilink

cross-posted from: https://lemmy.world/post/1306474

CStanKonrad has Released an Early Version of long_llama: Focused Transformer (FoT) Training for Context Scaling

https://github.com/CStanKonrad/long_llama

This repository contains the research preview of LongLLaMA, a large language model capable of handling long contexts of 256k tokens or even more.

LongLLaMA is built upon the foundation of OpenLLaMA and fine-tuned using the Focused Transformer (FoT) method. We release a smaller 3B variant of the LongLLaMA model on a permissive license (Apache 2.0) and inference code supporting longer contexts on Hugging Face. Our model weights can serve as the drop-in replacement of LLaMA in existing implementations (for short context up to 2048 tokens). Additionally, we provide evaluation results and comparisons against the original OpenLLaMA models. Stay tuned for further updates.

This is an awesome resource to pair alongside the recent FoT breakthroughs covered in this paper/post here.

Focused Transformer: Contrastive Training for Context Scaling (FoT) presents a simple method for endowing language models with the ability to handle context consisting possibly of millions of tokens while training on significantly shorter input. FoT permits a subset of attention layers to access a memory cache of (key, value) pairs to extend the context length. The distinctive aspect of FoT is its training procedure, drawing from contrastive learning. Specifically, we deliberately expose the memory attention layers to both relevant and irrelevant keys (like negative samples from unrelated documents). This strategy incentivizes the model to differentiate keys connected with semantically diverse values, thereby enhancing their structure. This, in turn, makes it possible to extrapolate the effective context length much beyond what is seen in training.
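Here is a heavily simplified sketch of the memory idea described above: a query attends over its local (key, value) pairs plus extra pairs pulled from a memory cache, all inside one softmax. This is not the FoT implementation. It skips multi-head attention, scaling, any retrieval step, and the contrastive training itself, and every name in it is made up for illustration.

```python
import math

def softmax(scores: list[float]) -> list[float]:
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def memory_attention(query: list[float],
                     local_kv: list[tuple[list[float], list[float]]],
                     memory_kv: list[tuple[list[float], list[float]]]) -> list[float]:
    """Attend jointly over local and cached (key, value) pairs."""
    pairs = local_kv + memory_kv  # the memory simply extends the context
    weights = softmax([dot(query, key) for key, _ in pairs])
    dim = len(pairs[0][1])
    return [sum(w * value[i] for w, (_, value) in zip(weights, pairs))
            for i in range(dim)]

if __name__ == "__main__":
    query = [1.0, 0.0]
    local = [([0.0, 1.0], [0.0])]
    memory = [([10.0, 0.0], [1.0]),   # relevant key: aligned with the query
              ([-10.0, 0.0], [5.0])]  # irrelevant key: a distractor
    print(memory_attention(query, local, memory))
```

The distractor entry shows why key structure matters: if irrelevant keys score low against the query, they receive almost no attention weight even when the cache is large, which is the behaviour the contrastive training is meant to encourage.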

LongLLaMA is an OpenLLaMA model finetuned with the FoT method, with three layers used for context extension. Crucially, LongLLaMA is able to extrapolate much beyond the context length seen in training: 8k. E.g., in the passkey retrieval task, it can handle inputs of length 256k.

This is an incredible advancement in context lengths for LLMs. Less than a month ago we were excited to celebrate 6k context lengths. We are now blowing these metrics out of the water. It is only a matter of time before compute and efficiency gains follow and support these new possibilities.

If you found any of this interesting, please consider subscribing to /c/FOSAI where I do my best to keep you up to date with the most important updates and developments in the space.

Want to get started with FOSAI, but don't know how? Try starting with my Welcome Message and/or The FOSAI Nexus & Lemmy Crash Course to Free Open-Source AI.

News: OpenAI Introduces Superalignment in c/worldnews@lemmy.ml

[–] Blaed@lemmy.world 2 points 1 year ago* (last edited 1 year ago) (1 children)

OpenAI has launched a new initiative, Superalignment, aimed at guiding and controlling ultra-intelligent AI systems. Recognizing the imminent arrival of AI that surpasses human intellect, the project will dedicate significant resources to ensure these advanced systems act in accordance with human intent. It's a crucial step in managing the transformative and potentially dangerous impact of superintelligent AI.

I like to think this starts to explore interesting philosophical questions like human intent, consciousness, and the projection of will into systems that are far beyond our capabilities in raw processing power and input/output. What may happen from this intended alignment is yet to be seen, but I think we can all agree the last thing we want in these emerging intelligent machines is to do things we don't want them to do.

'Superalignment' is OpenAI's response in how to put up these safeguards. Whether or not this is the best method is to be determined.

26

News: OpenAI Introduces Superalignment (lemmy.world)

submitted 1 year ago by Blaed@lemmy.world to c/worldnews@lemmy.ml

12 comments fedilink

cross-posted from: https://lemmy.world/post/1102882

On 07/05/23, OpenAI Has Announced a New Initiative:

Superalignment

https://openai.com/blog/introducing-superalignment

Here are a few notes from their article, which you should read in its entirety.

Introducing Superalignment

We need scientific and technical breakthroughs to steer and control AI systems much smarter than us. To solve this problem within four years, we’re starting a new team, co-led by Ilya Sutskever and Jan Leike, and dedicating 20% of the compute we’ve secured to date to this effort. We’re looking for excellent ML researchers and engineers to join us.

Superintelligence will be the most impactful technology humanity has ever invented, and could help us solve many of the world’s most important problems. But the vast power of superintelligence could also be very dangerous, and could lead to the disempowerment of humanity or even human extinction.

While superintelligence seems far off now, we believe it could arrive this decade.

Here we focus on superintelligence rather than AGI to stress a much higher capability level. We have a lot of uncertainty over the speed of development of the technology over the next few years, so we choose to aim for the more difficult target to align a much more capable system.

Managing these risks will require, among other things, new institutions for governance and solving the problem of superintelligence alignment:

How do we ensure AI systems much smarter than humans follow human intent?

Currently, we don't have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue. Our current techniques for aligning AI, such as reinforcement learning from human feedback, rely on humans’ ability to supervise AI. But humans won’t be able to reliably supervise AI systems much smarter than us and so our current alignment techniques will not scale to superintelligence. We need new scientific and technical breakthroughs.

Other assumptions could also break down in the future, like favorable generalization properties during deployment or our models’ inability to successfully detect and undermine supervision during training.

Our approach

Our goal is to build a roughly human-level automated alignment researcher. We can then use vast amounts of compute to scale our efforts, and iteratively align superintelligence.

To align the first automated alignment researcher, we will need to 1) develop a scalable training method, 2) validate the resulting model, and 3) stress test our entire alignment pipeline:

1.) To provide a training signal on tasks that are difficult for humans to evaluate, we can leverage AI systems to assist evaluation of other AI systems (scalable oversight). In addition, we want to understand and control how our models generalize our oversight to tasks we can’t supervise (generalization).

2.) To validate the alignment of our systems, we automate search for problematic behavior (robustness) and problematic internals (automated interpretability).

3.) Finally, we can test our entire pipeline by deliberately training misaligned models, and confirming that our techniques detect the worst kinds of misalignments (adversarial testing).

We expect our research priorities will evolve substantially as we learn more about the problem and we’ll likely add entirely new research areas. We are planning to share more on our roadmap in the future.

The new team

We are assembling a team of top machine learning researchers and engineers to work on this problem.

We are dedicating 20% of the compute we’ve secured to date over the next four years to solving the problem of superintelligence alignment. Our chief basic research bet is our new Superalignment team, but getting this right is critical to achieve our mission and we expect many teams to contribute, from developing new methods to scaling them up to deployment.

Click here to read more.

I believe this is an important notch in the timeline to AGI and Synthetic Superintelligence. I find it very interesting that OpenAI is ready to admit how close we are, as a species, to these breakthroughs. I hope we can all benefit from this bright future together.

If you found any of this interesting, please consider subscribing to /c/FOSAI!

Thank you for reading!

Hugging Face and AMD partner on accelerating state-of-the-art models for CPU and GPU platforms in c/technology@beehaw.org

[–] Blaed@lemmy.world 3 points 1 year ago

For anyone unaware, this is probably one of the better short and sweet explanations in regards to what HuggingFace is.

It is a hub for many code repositories hosting AI specific files and configurations, which has become a core ecosystem of many artificial intelligence breakthroughs, platforms, and applications.

Hugging Face and AMD partner on accelerating state-of-the-art models for CPU and GPU platforms in c/technology@beehaw.org

[–] Blaed@lemmy.world 2 points 1 year ago

🤗

25

Hugging Face and AMD partner on accelerating state-of-the-art models for CPU and GPU platforms (huggingface.co)

submitted 1 year ago* (last edited 1 year ago) by Blaed@lemmy.world to c/technology@beehaw.org

9 comments fedilink

cross-posted from: https://lemmy.world/post/135600

For anyone following the AI space of technology - this is pretty cool - especially since AMD has fallen behind its NVIDIA CUDA competitors.

(full article for convenience)

Hugging Face and AMD partner on accelerating state-of-the-art models for CPU and GPU platforms

Whether language models, large language models, or foundation models, transformers require significant computation for pre-training, fine-tuning, and inference. To help developers and organizations get the most performance bang for their infrastructure bucks, Hugging Face has long been working with hardware companies to leverage acceleration features present on their respective chips.

Today, we're happy to announce that AMD has officially joined our Hardware Partner Program. Our CEO Clement Delangue gave a keynote at AMD's Data Center and AI Technology Premiere in San Francisco to launch this exciting new collaboration.

AMD and Hugging Face work together to deliver state-of-the-art transformer performance on AMD CPUs and GPUs. This partnership is excellent news for the Hugging Face community at large, which will soon benefit from the latest AMD platforms for training and inference.

The selection of deep learning hardware has been limited for years, and prices and supply are growing concerns. This new partnership will do more than match the competition and help alleviate market dynamics: it should also set new cost-performance standards.

Supported hardware platforms

On the GPU side, AMD and Hugging Face will first collaborate on the enterprise-grade Instinct MI2xx and MI3xx families, then on the customer-grade Radeon Navi3x family. In initial testing, AMD recently reported that the MI250 trains BERT-Large 1.2x faster and GPT2-Large 1.4x faster than its direct competitor.

On the CPU side, the two companies will work on optimizing inference for both the client Ryzen and server EPYC CPUs. As discussed in several previous posts, CPUs can be an excellent option for transformer inference, especially with model compression techniques like quantization.
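(A quick aside from the article: for anyone curious what "quantization" means in practice, here is a minimal sketch of symmetric 8-bit "absmax" quantization on a plain list of weights. It is a generic textbook scheme, not what AMD or Hugging Face ship, and the function names are made up for this example.)

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]

if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    q, scale = quantize_int8(weights)
    print(q, scale)
    print(dequantize(q, scale))
```

Each weight is stored in one byte instead of four (for float32), at the cost of a rounding error of at most about half the scale per weight. That trade is what makes CPU inference more practical.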

Lastly, the collaboration will include the Alveo V70 AI accelerator, which can deliver incredible performance with lower power requirements.

Supported model architectures and frameworks

We intend to support state-of-the-art transformer architectures for natural language processing, computer vision, and speech, such as BERT, DistilBERT, RoBERTa, Vision Transformer, CLIP, and Wav2Vec2. Of course, generative AI models will be available too (e.g., GPT2, GPT-NeoX, T5, OPT, LLaMA), including our own BLOOM and StarCoder models. Lastly, we will also support more traditional computer vision models, like ResNet and ResNext, and deep learning recommendation models, a first for us.

We'll do our best to test and validate these models for PyTorch, TensorFlow, and ONNX Runtime for the above platforms. Please remember that not all models may be available for training and inference for all frameworks or all hardware platforms.

The road ahead

Our initial focus will be ensuring the models most important to our community work great out of the box on AMD platforms. We will work closely with the AMD engineering team to optimize key models to deliver optimal performance thanks to the latest AMD hardware and software features. We will integrate the AMD ROCm SDK seamlessly in our open-source libraries, starting with the transformers library.

Along the way, we'll undoubtedly identify opportunities to optimize training and inference further, and we'll work closely with AMD to figure out where to best invest moving forward through this partnership. We expect this work to lead to a new Optimum library dedicated to AMD platforms to help Hugging Face users leverage them with minimal code changes, if any.

Conclusion

We're excited to work with a world-class hardware company like AMD. Open-source means the freedom to build from a wide range of software and hardware solutions. Thanks to this partnership, Hugging Face users will soon have new hardware platforms for training and inference with excellent cost-performance benefits. In the meantime, feel free to visit the AMD page on the Hugging Face hub. Stay tuned!

Your Lemmy Crash Course to Free Open-Source AI in c/technology@beehaw.org

[–] Blaed@lemmy.world 1 points 1 year ago

FWIW, it's a new term I am trying to coin in FOSS communities (Free, Open-Source Software communities). It's a spin off of 'FOSS', but for AI.

There's literally nothing wrong with FOSS as an acronym, I just wanted to use one more focused in regards to AI tech to set the right expectations for everything shared in /c/FOSAI

I felt it was a term worth coining given the varied requirements and dependencies AI/LLMs tend to have compared to typical FOSS stacks. Making this differentiation is important in some of the semantics these conversations carry.

Welcome to Free Open-Source Artificial Intelligence! in c/technology@beehaw.org

[–] Blaed@lemmy.world 2 points 1 year ago

Big brain moment.

Ironically, I think using this technology to do exactly that is one of its greatest strengths...

GL, HF!

Your Lemmy Crash Course to Free Open-Source AI in c/technology@beehaw.org

[–] Blaed@lemmy.world 3 points 1 year ago

Great suggestions! I've actually never interfaced with that first channel (SECourses). Looks like some solid tutorials. Definitely going to check that out. Thanks for sharing!

Your Lemmy Crash Course to Free Open-Source AI in c/technology@beehaw.org

[–] Blaed@lemmy.world 2 points 1 year ago

Lol, you had me in the first half not gonna lie. Well done, you almost fooled me!

Glad you had some fun! gpt4all is by far the easiest to get going with imo.

I suggest trying any of the GGML models if you haven't already! They outperform almost every other model format at the moment.

If you're looking for more models, TheBloke and KoboldAI are doing a ton for the community in this regard. Eric Hartford, too. Although TheBloke is typically the one who converts these into more accessible formats for the masses.

Your Lemmy Crash Course to Free Open-Source AI in c/technology@beehaw.org

[–] Blaed@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

Thank you! I appreciate the kind words. Please consider subscribing to /c/FOSAI if you want to stay in the loop with the latest and greatest news for AI.

This stuff is developing at breakneck speeds. Very excited to see what the landscape will look like by the end of this year.

Your Lemmy Crash Course to Free Open-Source AI in c/technology@beehaw.org

[–] Blaed@lemmy.world 3 points 1 year ago* (last edited 1 year ago)

Absolutely! I'm having a blast launching /c/FOSAI over at Lemmy.world. I'll do my best to consistently cross-post to everyone over here too!

48

Your Lemmy Crash Course to Free Open-Source AI (lemmy.world)

submitted 1 year ago* (last edited 1 year ago) by Blaed@lemmy.world to c/technology@beehaw.org

16 comments fedilink

cross-posted from: https://lemmy.world/post/76020

Greetings Reddit Refugees!

I hope your migration is going well! If you haven't been here before, Welcome to FOSAI! Your new Lemmy landing page for all things artificial intelligence.

This is a follow-up post to my first Welcome Message.

Here I will share insights and instructions on how to set up some of the tools and applications in the aforementioned AI Suite.

Please note that I did not develop any of these, but I do have each one of them working on my local PC, which I interface with regularly. I will plan to do posts exploring each software in detail - but first - let's get a better idea what we're working with.

As always, please don't hesitate to comment or share your thoughts if I missed something (or you want to understand or see a concept in more detail).

Getting Started with FOSAI

What is oobabooga?

How-To-Install-oobabooga

In short, oobabooga is a free and open-source web client someone (oobabooga) made to interface with HuggingFace LLMs (large language models). As far as I understand, this is the current standard for many AI tinkerers and those who wish to run models locally. This client allows you to easily download, configure, and chat with text-based models that behave like Chat-GPT. However, not all models on HuggingFace are at the same level as Chat-GPT out-of-the-box. Many require 'fine-tuning' or 'training' to produce consistent, coherent results. The benefit of using HuggingFace (instead of Chat-GPT) is that you have many more options to choose from regarding your AI model, including the option to choose a censored or uncensored version of a model, untrained or pre-trained, etc. Oobabooga is an interface that lets you do all this (theoretically), but it can have a bit of a learning curve if you don't know anything about AI/LLMs.

What is gpt4all?

How-To-Install-gpt4all

gpt4all is the closest thing you can currently download to a Chat-GPT style interface that is compatible with some of the latest open-source LLMs available to the community. Some models can be downloaded in quantized formats, unquantized formats, and base formats (which typically run GPU only), but there are new model formats emerging (GGML) that enable GPU + CPU compute. This GGML format seems to be the growing standard for consumer-grade hardware. Some prefer the user experience of gpt4all over oobabooga, and some feel the exact opposite. For me - I prefer the options oobabooga provides - so I use that as my 'daily driver' while gpt4all is a backup client I run for other tests.
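To get a feel for why quantized formats matter on consumer-grade hardware, here is a rough back-of-the-envelope sketch. It only estimates the size of the weights from a parameter count and a bit width. The function name and the 7B example are my own illustration, and real quantized files come out somewhat larger because they also store scaling metadata, so treat the numbers as ballpark figures only.

```python
# Rough weight-only memory estimate.
# Assumption: ignores the context cache, activations, and file-format overhead.
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 1e9


if __name__ == "__main__":
    # Hypothetical 7-billion-parameter model at three common bit widths.
    for bits in (16, 8, 4):
        size = weight_memory_gb(7e9, bits)
        print(f"7B parameters at {bits}-bit: ~{size:.1f} GB")
```

The takeaway is simply that halving the bits per weight roughly halves the memory you need, which is what lets larger models fit on a single consumer GPU or in system RAM.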

What is Koboldcpp?

How-To-Install-Koboldcpp

Koboldcpp, like oobabooga and gpt4all, is another web-based interface you can run to chat with LLMs locally. It enables GGML inference, which can be hard to get running on oobabooga depending on the version of your client and updates from the developer. Koboldcpp, however, is part of a totally different platform and team of developers who typically focus on the roleplaying aspect of generative AI and LLMs. Koboldcpp feels more like NovelAI than anything I've run locally, and has similar functionality and vibes to AI Dungeon. In fact, you can download some of the same models and settings that they use to emulate something very similar (but 100% local, assuming you have capable hardware).

What is TavernAI?

How-To-Install-TavernAI

TavernAI is a customized web-client that seems as functional as gpt4all in most regards. You can use TavernAI to connect with Kobold's API - as well as insert your own Chat-GPT API key to talk with OpenAI's GPT-3 (and GPT-4 if you have API access).
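For the curious, this is roughly what a front end does when it talks to a Kobold-style API. It is a minimal sketch, not TavernAI's actual code: the URL, port, endpoint path, and the `prompt` / `max_length` fields reflect my understanding of the Kobold API and should be treated as assumptions to check against the docs for your version. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: a Kobold-style server listening locally; adjust host, port, and path.
KOBOLD_URL = "http://localhost:5001/api/v1/generate"


def build_generate_request(prompt: str, max_length: int = 80,
                           url: str = KOBOLD_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for text generation."""
    payload = {"prompt": prompt, "max_length": max_length}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt)) as response:
        body = json.load(response)
    # Assumed response shape: {"results": [{"text": "..."}]}
    return body["results"][0]["text"]
```

With a local server running, `generate("Once upon a time")` would return the model's continuation. Without one, `build_generate_request` still lets you inspect exactly what would be sent.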

What is Stable Diffusion?

How-To-Install-StableDiffusion (Automatic1111)

Stable Diffusion is a groundbreaking and popular AI model that enables text-to-image generation. When people think of "Stable Diffusion", they tend to picture Automatic1111's UI/UX, which is the same interface oobabooga is inspired by. This UI/UX has become the de facto standard for almost all Stable Diffusion workflows. Fun factoid - it is widely believed MidJourney is a highly tuned version of a Stable Diffusion model, but one whose weights, LoRAs, and configurations were made closed-source after training and alignment.

What is ControlNet?

How-To-Install-ControlNet

ControlNet is a way you can manually control models of Stable Diffusion, allowing you to have complete freedom over your generative AI workflow. The best example of what this is (and what it can do) can be seen in this video. Notice how it combines an array of tools you can use as pre-processors for your prompts, enhancing the composition of your image by giving you options to bring out any detail you wish to manifest.

What is TemporalKit?

How-To-Install-TemporalKit

This is another Stable Diffusion extension that allows you to create custom videos using generative AI. In short, it takes an input video and chops it into dozens (or hundreds) of frames that can then be batch edited with Stable Diffusion, amassing new key frames and sequences which are stitched back together with EbSynth using your new images, resulting in a stylized video that was generated and edited based on your Stable Diffusion prompt/workflow.
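The key frame idea can be sketched in a few lines. This is only an illustration of the concept (edit a subset of frames, then let a tool like EbSynth fill in the rest), not TemporalKit's actual selection logic. The function name and the fixed-interval rule are my own assumptions.

```python
from typing import List


def keyframe_indices(total_frames: int, every_n: int) -> List[int]:
    """Pick every n-th frame as a key frame, always including the last frame."""
    if total_frames <= 0 or every_n <= 0:
        raise ValueError("total_frames and every_n must be positive")
    indices = list(range(0, total_frames, every_n))
    # Make sure the final frame is covered so the clip has a styled endpoint.
    if indices[-1] != total_frames - 1:
        indices.append(total_frames - 1)
    return indices


if __name__ == "__main__":
    # Hypothetical 10-frame clip with a key frame every 4 frames.
    print(keyframe_indices(10, 4))  # [0, 4, 8, 9]
```

Only the frames at these indices would go through Stable Diffusion; the frames in between get their look propagated from the nearest key frames, which is why the approach is much cheaper than restyling every frame.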

Where to Start?

Unsure where to begin? Do you have no idea what you're doing? Or have paralysis by analysis? That's okay, we've all been there.

Start small, don't install everything at once, and instead, ask yourself what sounds like the most fun? Pick one of the tools I've mentioned above and spend as much time as you need to get it working. This work takes patience, cultivation, and motion. The first two parts of that (patience, cultivation) typically take the longest to get over.

If you end up at your wit's end installing or troubleshooting these tools - remind yourself this is bleeding edge artificial intelligence technology. It shouldn't be easy in these early phases. The good news is I have a strong feeling it will become easier than any of us could imagine over time. If you cannot get something working, consider posting your issue here with information regarding your problem.

To My Esteemed Lurkers...

If you're a lurker (like I used to be), please consider taking a popcorn break and stepping out of your shell, making a post, and asking questions. This is a safe space to share your results and interests with AI - or make a post about your epic project or goal. All progress is welcome here, all conversations about this tech are fair and waiting to be discussed.

Over the course of this next week I will continue releasing general information to catch this community up to some of its more-established counterparts.

Consider subscribing if you liked the content of this post or want to stay in the loop with Free, Open-Source Artificial Intelligence.

10

Welcome to Free Open-Source Artificial Intelligence! (lemmy.world)

submitted 1 year ago by Blaed@lemmy.world to c/technology@beehaw.org

4 comments fedilink

cross-posted from: https://lemmy.world/post/67758

Hello World!

Welcome to your new Lemmy landing page for AI news, software, applications, research, and development updates regarding all FOSS (free open-source software) designed for AI (FOSAI). The goal of this community is to be a mixture of r/LocalLLaMA, r/oobabooga, and r/StableDiffusion with the intention of spreading awareness and accessibility for all free open-source artificial intelligence applications, tools, platforms, and strategies from the bleeding edges of AI.

Members of this community should feel welcome to treat this place as another casual discussion hub for concepts and topics similar to what r/Singularity and r/Futurology used to chat about, but the sentiment I want to encourage here is one of optimism and creative empowerment, discovery and exploration.

Whether you like it or not - AI is here (and it's here to stay). What that means, what you do with it, (and why) is completely up to you to figure out within your own life and aspirations. If you want a second opinion about something, let's have a discussion about it with intellectual empathy and open-minded kindness.

Know that I don't have the time and energy for moderating 'me vs you' narratives - or pushing the doom and gloom sentiment that this emerging technology typically gets a bad rap for. I understand the risks we're facing; I leave it up to you to come to your own conclusions and live your life accordingly. At the end of the day, I'm here to help facilitate honest and open discussion, and to share the joy of technology in whatever shape or form it presents itself to me. At the end of this information transaction, I hope you retain a small piece that you find interesting and use it to brighten a life, whether yours or someone else's.

Those of our community are focused on the silver linings of life, the excitement of our future, the hope of a brighter tomorrow. We are not afraid of anything but the failure of compassion and humanity in its smallest, most defining moments. To innovate is to think, to think is to live. Be mindful of what you say. Be mindful of what you do. Stay patient with others and cultivate a trust within yourself that never forgets to put thoughts into motion.

I want everyone to know that I am far from being an expert in Machine Learning (and Deep Learning in general). I don't know everything, but I teach myself what I can. That being said, if you know something about AI, LLMs, python, json, data-processing, data-engineering, neural networks, machine learning, deep learning, full-stack development, front-end development, or any DevOps / IaC practices - please consider subscribing and contributing to the discussion.

Together perhaps we can unite with other FOSS communities to usher in an era of FOSAI that evolves with the rapidly emerging ideas filling this sector. Or perhaps we remain a niche corner of AI development and discussion. Regardless - all technologists are welcome here, from engineers to data scientists, to hobbyists, and complete newbies.

I will do my best to keep everyone here in the loop with the latest and greatest news to hit the space. At the same time, you should voice your opinion if I am doing something wrong, or if you want to be heard.

The more like-minded individuals we gather here the more chances we will have to shape what AI will look like for generations to come. If you know anything about AI, LLMs, or Machine/Deep Learning in general, please consider subbing and contributing to the discussion by sharing your project and/or subject approach!

The Bleeding Edge

The applications I am about to recommend are limited to the suite of AI tools I've collected over the last few months and the knowledge I've accumulated up to this point (June 2023). If you notice something missing that is worth sharing, please don't hesitate to add a link or share your thoughts in the comments. I will be expanding on these tools in a subsequent post: Getting Started with FOSAI / Your Lemmy Crash Course to Artificial Intelligence.

My AI Suite / Workshop / Tech Stack

Generative Text

oobabooga

gpt4all

koboldcpp

TavernAI

Generative Image & Video

Stable Diffusion (Automatic1111)

ControlNet

TemporalKit

EbSynth

Runway (Gen 2)

Hosting

Runpod (Rent-a-Server / GPU)

Self-Hosting [Windows/Linux/MacOS] (GPU, CPU, RAM)