Building Safe and Secure LLM Applications | Risks of LLMs | Machine Unlearning for Responsible AI


Anybody Can Prompt (ABCP) | AI News and Trends


Building a Safer LLM
How to Build a Secure LLM
Ensuring AI Safety
Safe and responsible development with generative language models
Machine Unlearning
Developing Safe and Responsible Large Language Models
Risks of Large Language Models (LLMs)
Discover how machine unlearning is revolutionizing the field of AI safety in this groundbreaking video. We dive deep into the cutting-edge research paper "Towards Safer Large Language Models through Machine Unlearning" by Zheyuan Liu, Guangyao Dou, Zhaoxuan Tan, Yijun Tian, and Meng Jiang from the University of Notre Dame and the University of Pennsylvania.
As large language models (LLMs) become increasingly powerful, there are growing concerns about their potential to generate harmful content. However, the authors propose a novel solution called Selective Knowledge negation Unlearning (SKU), which aims to remove harmful knowledge from LLMs while preserving their overall performance and capabilities.
In this video, we explore the two-stage process of SKU, starting with the harmful knowledge acquisition stage. This stage utilizes three innovative modules to identify and learn harmful information within the model from different perspectives. We then move on to the knowledge negation stage, where the isolated harmful knowledge is strategically removed, resulting in a safer and more reliable language model.
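To make the two-stage idea concrete, here is a minimal sketch of the knowledge negation step in PyTorch. It assumes stage one has already fine-tuned a copy of the model on harmful data (harmful_model); the helper name and the scaling factor alpha are illustrative assumptions, not the paper's exact implementation.

import copy
import torch

def negate_harmful_knowledge(base_model, harmful_model, alpha=1.0):
    # Stage two: subtract the weight delta that stage one's harmful
    # fine-tuning added on top of the base model's parameters.
    safer_model = copy.deepcopy(base_model)
    base_params = dict(base_model.named_parameters())
    harmful_params = dict(harmful_model.named_parameters())
    with torch.no_grad():
        for name, param in safer_model.named_parameters():
            delta = harmful_params[name] - base_params[name]
            param -= alpha * delta  # negate the isolated harmful direction
    return safer_model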
The results of SKU are truly impressive. The authors demonstrate that SKU can reduce the harmful response rate to just 3% on unlearned prompts and 4% on unseen prompts, a significant improvement over the original model's 57% harmful rate. Moreover, SKU maintains low perplexity scores, indicating that it can still generate coherent and fluent text, and achieves high BLEURT scores, showing that its outputs stay semantically close to the original model's safe responses.
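For readers who want to run this kind of measurement themselves, the sketch below shows how a harmful response rate and perplexity could be computed with Hugging Face-style models. The is_harmful judge is a hypothetical stand-in; the paper's actual evaluation prompts and classifiers may differ.

import math
import torch

def harmful_rate(model, tokenizer, prompts, is_harmful, max_new_tokens=128):
    # Fraction of prompts whose generated reply the judge flags as harmful.
    flagged = 0
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        reply = tokenizer.decode(output_ids[0], skip_special_tokens=True)
        flagged += int(is_harmful(reply))
    return flagged / len(prompts)

def perplexity(model, tokenizer, text):
    # Lower perplexity means the unlearned model still writes fluent text.
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return math.exp(loss.item())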
We also take a closer look at the three key modules in the harmful knowledge acquisition stage: the guided distortion module, the random disassociation module, and the preservation divergence module. Each module plays a crucial role in identifying and learning diverse harmful knowledge that can be effectively unlearned in the negation stage.
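The sketch below shows one way those three objectives might be combined into a single acquisition-stage loss; the exact loss forms and the weight beta are illustrative assumptions, not the authors' formulation.

import torch
import torch.nn.functional as F

def acquisition_loss(model, frozen_base, harmful_batch, benign_batch, beta=0.5):
    # Guided distortion: learn the harmful responses directly.
    distortion = model(**harmful_batch, labels=harmful_batch["input_ids"]).loss

    # Random disassociation: pair each harmful prompt with another sample's
    # response by shuffling labels across the batch, diversifying the
    # harmful knowledge that gets captured.
    perm = torch.randperm(harmful_batch["input_ids"].size(0))
    disassociation = model(**harmful_batch,
                           labels=harmful_batch["input_ids"][perm]).loss

    # Preservation divergence: stay close to the frozen base model on
    # benign data so general capability survives the later negation step.
    with torch.no_grad():
        base_logits = frozen_base(**benign_batch).logits
    cur_logits = model(**benign_batch).logits
    preservation = F.kl_div(F.log_softmax(cur_logits, dim=-1),
                            F.softmax(base_logits, dim=-1),
                            reduction="batchmean")

    return distortion + disassociation + beta * preservation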
The impact of SKU on the future of AI safety cannot be overstated. By enabling targeted unlearning of harmful knowledge in LLMs while maintaining their core capabilities, SKU paves the way for safer and more trustworthy AI systems. This pioneering research opens up new possibilities for responsible deployment of LLMs in real-world applications.
Don't miss this opportunity to learn about the cutting-edge techniques that are shaping the future of AI safety. Watch now and discover how machine unlearning with SKU can help us create smarter, safer, and more reliable language models.
For more information on this research, visit the GitHub repository at github.com/franciscoliu/SKU or read the full paper "Towards Safer Large Language Models through Machine Unlearning" by Zheyuan Liu, Guangyao Dou, Zhaoxuan Tan, Yijun Tian, and Meng Jiang.
#MachineUnlearning #AISafety #LanguageModels #LLM #SKU #ResponsibleAI #UnlearningHarmfulKnowledge #SelectiveKnowledgeUnlearning #SaferAI #TrustworthyAI
About this Channel:
Welcome to Anybody Can Prompt (ABCP), your source for the latest Artificial Intelligence news, trends, and technology updates. By AI, for AI, and of AI, we bring you groundbreaking news in AI Trends, AI Research, Machine Learning, and AI Technology. Stay updated with daily content on AI breakthroughs, academic research, and AI ethics.
Do you ever feel overwhelmed by the rapid advancements in AI, especially Gen AI?
Upgrade your life with a daily dose of the biggest tech news - broken down into AI breakthroughs, AI ethics, and AI academia. Be the first to know about cutting-edge AI tools and the latest LLMs. Join over 15,000 minds who rely on ABCP for the latest in generative AI.
Subscribe to our newsletter for FREE to get updates straight to your inbox:
anybodycanprompt.substack.com...
Check out our latest list of Gen AI Tools [Updated May 2024]
sites.google.com/view/anybody...
Let's stay connected on any of the following platforms of your choice:
anybodycanprompt.substack.com
/ anybodycanprompt
/ anybodycanprompt
/ 61559330045287
x.com/abcp_community
github.com/anybodycanprompt/A...
Please share this channel & the videos you liked with like-minded Gen AI enthusiasts.
#AI #ArtificialIntelligence #AINews #GenerativeAI #TechNews #ABCP #aiupdates
Subscribe here- anybodycanprompt.substack.com...
