
How Does Generative AI Actually Work?
(a quick semi-technical introduction to Large Language Models)
William H. Press
Department of Computer Science
The University of Texas at Austin
May 12, 2023
What word comes after “Happy…”? Is it “aardvark”? Or “birthday”?
Simple probabilities (but trillions of them) are at the root of all large language models (LLMs).
Me: ChatGPT, I want to explain sequential word guessing to someone. If a
sentence starts with "The boy" can you tell me what your next 10
sequential guesses of words would be, one word at a time?
Sure! Here are 10 sequential word guesses, one at a time, based on the sentence
starting with "The boy":
1. went, 2. to, 3. the, 4. park, 5. to, 6. play, 7. with, 8. his, 9. friends, 10. and
Keep in mind that as a language model, my guesses are based on patterns and
probabilities derived from the text corpus I was trained on. The actual guesses may
vary depending on the specific context and the model version being used.
What if the sentence instead begins with "The secret behind GPT-4 is"?
Here are 10 sequential word guesses:
1. "its", 2. "ability", 3. "to", 4. "understand", 5. "and", 6. "generate", 7. "human-like", 8. "text", 9. "based", 10. "on"
Again, these are just guesses based on common patterns and do not represent an
actual sentence.
The distant ancestor of all chatbots is sequential word-guessing in natural languages. Hence “generative”.
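
To make the loop concrete, here is a minimal toy sketch (not OpenAI's code): a tiny bigram count table stands in for the trillions of learned probabilities, and the most probable next word is guessed, appended, and guessed again. The corpus and words below are made up purely for illustration.

# Toy sequential word guessing with a bigram table (illustrative only).
from collections import Counter, defaultdict

corpus = (
    "the boy went to the park to play with his friends and "
    "the boy went to the store and the girl went to the park"
).split()

# Count how often each word follows each other word.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def guess_next(word):
    """Return the most probable next word, or None if the word was never seen."""
    followers = bigrams.get(word)
    return followers.most_common(1)[0][0] if followers else None

# Generate 10 sequential guesses starting from "the boy", one word at a time.
sentence = ["the", "boy"]
for _ in range(10):
    nxt = guess_next(sentence[-1])
    if nxt is None:
        break
    sentence.append(nxt)

print(" ".join(sentence))  # prints something like "the boy went to the boy went to ..."

Note that a greedy bigram guesser quickly falls into a loop; real LLMs condition on the entire preceding context (and sample from the distribution rather than always taking the top word), which is why their guesses stay coherent.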
LLMs as “lossy text compression”
• Represent training corpus more compactly by finding and
encoding its structures and relationships
• eliminate redundancy at all levels: syntactic, semantic, multiple
sources, etc.
• the result encoded in ~10^9–10^12 matrix “weights”
• “Lossy” because information is irretrievably lost
• prompts are answered by (sort-of) decompressing into highly
probable responses that could have been in the training data, but,
in general, weren’t exactly so verbatim
• The decompressed data maintains accuracy when…
• it is “common sense” or “conventional wisdom”
• because there is then huge redundancy in the training data
• But can be wildly inaccurate (like “digital artifacts” in a
defective video) if the query is not well represented in the
compressed training corpus
• e.g., most probable answer comes from one (wrong) document
• or variants of a widely circulated conspiracy theory
• if uncompressing from no germane data at all, it just makes things
up (“hallucinates”) to get the most probable response
https://www.newyorker.com/tech/annals-of-technology/chatgpt-is-a-blurry-jpeg-of-the-web
GPT-4: OpenAI’s latest released Large Language Model
• OpenAI isn’t actually open! Can for many purposes be
thought of as practically a Microsoft subsidiary.
• Microsoft is said to provide the unique hardware infrastructure for
OpenAI algorithm development.
• GPT = Generative Pre-trained Transformer
• Thought to have >5×10^11 trainable parameters.
• GPT-3 had 1.75×10^11
• Trained on > several terabytes of language data
• Training cost claimed to be $100 million
• but this might be including amortized R&D
• once trained, cost per query is millicents per token
• I will highlight three key elements of “secret sauce”:
• 1. transformer architecture
• 2. huge scale of parameter space and training corpus
• 3. “RLHF” Reinforcement Learning from Human Feedback
• mostly not reported on
Key #1: Transformer architecture. It is a distant
descendant of document query concepts
Document retrieval:
• Input text projected onto matrix of
possible queries
• Matrix multiply to cross queries
with keys (e.g., keywords)
• Matrix multiply to map result from
keys to values (e.g., documents)
• The brilliant idea of Vaswani et al. (2017, “Attention Is All You Need”) is to map all of Q, K, V from the same input.
• This is “Self-Attention”
• And to have all of Q, K, V learned.
• Many layers allow attention to many different levels of structure simultaneously
• This is “Multi-headed”
(a minimal numerical sketch of one attention head follows below)
https://dugas.ch/artificial_curiosity/GPT_architecture.html
[Figure: input-processing stack (encoder) → output-processing stack (decoder); ~10^3 layers?]
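
To make the Q/K/V idea concrete, here is a minimal single-head self-attention sketch (dimensions and weights are made up; this illustrates the mechanism from Vaswani et al., not GPT-4's actual implementation):

# Minimal single-head self-attention; toy dimensions, not GPT-4's code.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_tokens, d_model, d_head = 6, 16, 8            # 6 input tokens, toy embedding sizes

X = rng.normal(size=(n_tokens, d_model))        # token embeddings (the shared input)

# Learned projection matrices -- in a real model these are trained weights.
W_Q = rng.normal(size=(d_model, d_head))
W_K = rng.normal(size=(d_model, d_head))
W_V = rng.normal(size=(d_model, d_head))

Q, K, V = X @ W_Q, X @ W_K, X @ W_V             # all of Q, K, V come from the same X

scores = Q @ K.T / np.sqrt(d_head)              # cross queries with keys
attention = softmax(scores, axis=-1)            # how much each token attends to each other token
output = attention @ V                          # map the result onto values

print(attention.shape, output.shape)            # (6, 6) (6, 8)

A “multi-headed” layer simply runs several such heads in parallel and concatenates their outputs; stacking many layers is what lets the model attend to many levels of structure at once.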
Key #2: Sheer scale: Only a few dare to call it
emergence, but the gain-of-function is striking
• Transformer parameters:
• a trillion parameters =(?) 1000 parallel instances of a billion each
• a billion parameters per instance =(?) 10^4 each for query space, key space, and value space (multiplied two at a time) + “glue”
• could think of this as looking at every token list 10^7 ways in formulating the next response (see the back-of-envelope sketch after this list)
• “stateless”: looks at the whole previous dialogue as a new token list, maximum length 32,768 tokens
• Training corpus parameters:
• many terabytes?
• ~1000x number of words a human hears or reads in a lifetime
• Many proprietary tricks:
• how to propagate gradients through the huge scale?
• how to maximize parallelism in training?
• special hardware?
• rumored that GPT-5 training is on hold because not enough GPUs are
obtainable in the world.
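
The slide's “=(?)” guesses can be turned into a back-of-envelope calculation; the sketch below just follows those guesses and is not OpenAI's actual configuration.

# Back-of-envelope parameter arithmetic, following the slide's "=(?)" guesses.
# Illustrative orders of magnitude only, not OpenAI's actual configuration.

d = 10**4                       # assumed size of each of the query, key, value spaces
per_matrix = d * d              # one d-by-d matrix, "multiplying two spaces at a time"
per_instance = 10 * per_matrix  # a few such matrices plus "glue" ~ a billion parameters
instances = 1000                # ~1000 parallel instances

total = instances * per_instance
print(f"per instance ~ {per_instance:.0e}, total ~ {total:.0e}")  # ~1e9 and ~1e12

# Context is bounded: each query re-reads the whole dialogue as a fresh token list.
max_context_tokens = 32_768
print(f"maximum context: {max_context_tokens} tokens")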
Key #3: Reinforcement Learning from
Human Feedback (RLHF)
• Said to involve 40 contractors with (each?) hundreds of workers.
• 1. Humans generate an initial dataset of “typical” queries and “good” responses.
• 2. Humans rank a separate dataset of responses (some good, some bad).
• 3. The reward model is trained on this data.
• 4. The language model trains itself to satisfy the reward model. (How often?)
• 5. The reward model is retrained with (free!) input from hundreds of millions of
users. (How often?)
[Figure: user prompt → many output guesses → “most-human” output, with human feedback]
https://aisupremacy.substack.com/p/what-is-reinforcement-learning-with
https://www.bradknox.net/human-reward/tamer/
Me: ChatGPT, When GPT-3 or GPT-4 uses RLHF, is it a separate model, or does RLHF train the GPT weights directly?
In typical usage, RLHF doesn't directly train the base GPT model. The GPT model is first pretrained on a large corpus of text data, learning to predict the next word in a sentence. This pretraining process is unsupervised and uses a variant of the Transformer architecture to learn patterns and structures in the data.
After this pretraining, the model can be fine-tuned using reinforcement learning from human feedback. In this process, human evaluators rank different responses generated by the model, and these rankings are used to create a reward model. The model is then fine-tuned to optimize this reward model, typically using a method like Proximal Policy Optimization.
In this way, RLHF is a separate step that fine-tunes the pretrained GPT model, rather than directly training the initial GPT weights. However, please note that this information is based on the methodology up to September 2021, and OpenAI's approach may have evolved since then.
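
The ranking-to-reward-model step described above can be sketched in a few lines. The toy below fits a linear reward model to made-up “preferred vs. rejected” feature vectors with a pairwise (Bradley–Terry style) loss; real RLHF uses a large neural reward model followed by PPO fine-tuning of the language model, so this only shows the shape of the idea.

# Toy reward model trained from pairwise human rankings (Bradley-Terry style loss).
# Real RLHF uses a large neural reward model plus PPO fine-tuning; this is only the idea.
import numpy as np

rng = np.random.default_rng(1)
dim = 8                                   # made-up feature size for a response

# Hypothetical dataset: (features of the human-preferred response, features of the rejected one).
pairs = [(rng.normal(size=dim) + 0.5, rng.normal(size=dim) - 0.5) for _ in range(200)]

w = np.zeros(dim)                         # linear reward model: r(x) = w . x
lr = 0.1
for epoch in range(100):
    for preferred, rejected in pairs:
        margin = w @ preferred - w @ rejected
        p = 1.0 / (1.0 + np.exp(-margin))          # model's P(human prefers "preferred")
        grad = (p - 1.0) * (preferred - rejected)  # gradient of -log p
        w -= lr * grad

# The trained reward model now scores new responses; the LLM is then fine-tuned
# (e.g., with PPO) to produce responses that this reward model scores highly.
good, bad = pairs[0]
print(w @ good > w @ bad)                 # True for most training pairs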
“There is no moat?” (Some insiders think that the technology
can be miniaturized and will leak via open source availability to
individuals and thence startups.)
• Meta/Facebook’s entire LLM was leaked online in March, 2023.
• and later released publicly
• A May, 2023, leaked Google document (one engineer’s private
view):
• People are now running foundation models on a Pixel 6 at 5 tokens/s
• You can fine-tune a personalized AI on your laptop in an evening.
• Open-source models are [he says] faster, more customizable, more
private, and pound-for-pound more capable.
• “They are doing things with $100 and 13B params that we struggle with
at $10M and 540B.”
• LoRA (low-rank factorization of large models) saves a factor of ~10^3
• and is composable to train new models, no need to go back to full model
• Big companies might be held liable for copyright infringement on
training data, or have to buy it
• Open-source leaked models might (de-facto) not be
• Scenario: A bunch of (LoRA?) matrices appear on a Discord server
Me: ChatGPT, To what extent can LLMs be miniaturized, for example by low
rank factorization?
As of my last update in September 2021, there's been some promising
research showing that low-rank factorization can be used to compress neural
networks, including Transformer-based models. Weight matrices in the
Transformer's self-attention layers can be approximated well with low-rank
matrices, which can significantly reduce the model's size with minimal loss in
performance.
Low-rank factorization is just one of many techniques for model
compression. Others include knowledge distillation, where a smaller model is
trained to mimic the behavior of a larger model, and pruning, where less
important parts of the model are removed.
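
A minimal sketch of the low-rank idea: approximate a large weight matrix by the product of two thin matrices via truncated SVD and count the savings. Dimensions and rank here are made up, and note that LoRA in practice learns low-rank update matrices on top of frozen pretrained weights rather than factorizing them directly.

# Low-rank factorization of a weight matrix via truncated SVD (illustrative dimensions).
import numpy as np

rng = np.random.default_rng(2)
d, rank = 2048, 4                          # a d-by-d weight matrix, approximated at low rank

# Construct a weight matrix that really is (approximately) low rank plus a little noise.
W = rng.normal(size=(d, rank)) @ rng.normal(size=(rank, d)) + 0.01 * rng.normal(size=(d, d))

U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :rank] * S[:rank]                 # d x rank
B = Vt[:rank, :]                           # rank x d
W_approx = A @ B

full_params = W.size                       # d*d
low_rank_params = A.size + B.size          # 2*d*rank
print(f"compression factor ~ {full_params / low_rank_params:.0f}")   # d/(2*rank) = 256 here
rel_error = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
print(f"relative error ~ {rel_error:.3f}")                           # small for near-low-rank W

Larger matrices and smaller ranks push the compression factor toward the ~10^3 figure mentioned on the previous slide.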
Me: ChatGPT, Is LLM training data subject to copyright and thus needs to be
licensed?
The use of copyrighted material in training LLMs may be considered a
copyright infringement. The models learn by consuming vast amounts of text
data which may include copyrighted works. By using this information to
generate new text, the models are indirectly reproducing and repurposing
parts of the copyrighted works they were trained on, albeit in a non-
verbatim manner. Some argue that this constitutes a derivative work, which
is a right reserved for the copyright holder under U.S. Copyright Law and
international copyright agreements. The copyright holder has exclusive rights
to reproduce and create derivative works from their material.
The counter-argument hinges on the doctrine of fair use. Advocates can
argue that LLMs, which generate new, original text, use copyrighted material
in a transformative manner, which is a crucial consideration in fair use
determinations. The vast amount of training data also means any specific
copyrighted work constitutes an insubstantial portion of the input. Lastly, this
use doesn't harm the market for the original works but may encourage
further knowledge creation. However, as of September 2021, these complex
legal questions lack clear legal precedent and interpretations may vary across
jurisdictions.
The LLM Value Chain:
figure courtesy W. Dally
It may turn out that most of the value lies with the end-service provider,
who specializes the model and knows the end customer's needs.
Thank you.
