Large Language Models: From Prototype to Production (EuroPython keynote)

Large Language Models (LLMs) have shown some impressive capabilities and their impact is the topic of the moment. What will the future look like? Are we going to only talk to bots? Will prompting replace programming? Or are we just hyping up unreliable parrots and burning money? In this talk, I'll present visions for NLP in the age of LLMs and a pragmatic, practical approach for how to use Large Language Models to ship more successful NLP projects from prototype to production today.

Twitter: https://twitter.com/_inesmontani/status/1681700743693172738
LinkedIn: https://www.linkedin.com/posts/inesmontani_nlp-llm-llms-activity-7087478372418625536-3VDo

Ines Montani

July 19, 2023

Transcript

  1. Ines Montani
    Explosion
    LARGE LANGUAGE MODELS FROM PROTOTYPE TO PRODUCTION
    LARGE LANGUAGE MODELS, CHATGPT, ARTIFICIAL INTELLIGENCE, MACHINE LEARNING, LLAMA, NATURAL LANGUAGE PROCESSING, OPEN SOURCE, PYTHON, PROMPT ENGINEERING, ZERO-SHOT LEARNING, GPT-4, EVALUATION, COPILOT, GENERATIVE AI

  2. SPACY
    SPACY.IO · @SPACY_IO · SPACY.TV · GITHUB.COM/EXPLOSION/SPACY
    Open-source library for industrial-strength Natural Language Processing
    150m+ downloads

  3. SPACY
    SPACY.IO · @SPACY_IO · SPACY.TV · GITHUB.COM/EXPLOSION/SPACY
    Open-source library for industrial-strength Natural Language Processing
    150m+ downloads
    ChatGPT can write spaCy code!

  4. PRODIGY
    PRODIGY.AI · GITHUB.COM/EXPLOSION/PRODIGY-RECIPES
    Modern scriptable annotation tool for machine learning developers
    8k+ users
    700+ companies

  6. UNDERSTANDING NLP TASKS
    Generative
      single/multi-doc summarization
      problem solving
      paraphrasing
      reasoning
      style transfer
      question answering
    Predictive
      text classification
      entity recognition
      relation extraction
      grammar & morphology
      semantic parsing
      coreference resolution
      discourse structure

  7. UNDERSTANDING NLP TASKS
    Generative (human-readable)
      single/multi-doc summarization
      problem solving
      paraphrasing
      reasoning
      style transfer
      question answering
    Predictive (machine-readable)
      text classification
      entity recognition
      relation extraction
      grammar & morphology
      semantic parsing
      coreference resolution
      discourse structure

  8. THE HISTORY OF FUTURE TECHNOLOGY

  9. THE HISTORY OF FUTURE TECHNOLOGY
    How people in 1900
    imagined the year 2000

  12. THE HISTORY OF FUTURE TECHNOLOGY
    manual calculation vs. calculator

  14. THE HISTORY OF FUTURE TECHNOLOGY
    “knocker-uppers” vs. alarm clock
    manual calculation vs. calculator

  16. THE HISTORY OF FUTURE TECHNOLOGY
    human assistant
    vs. calendar apps
    Calendly
    Fantastical

  17. THE HISTORY OF FUTURE TECHNOLOGY
    human assistant
    vs. calendar apps
    Calendly
    Fantastical
    WHAT’S NEXT?

  18. NLP IN THE AGE OF LLMS

  19. NLP IN THE AGE OF LLMS
    SQL is all you need
    dialogue is all you need

  20. NLP IN THE AGE OF LLMS
    SQL is all you need
    dialogue is all you need
    lots of humans is all you need
    prompting is all you need

  21. COMPANY
    COMPANY
    MONEY
    INVESTOR
    “Hooli raises $5m to
    revolutionize search,
    led by ACME Ventures”
    5923214
    1681056
    CLASSIC NLP PROBLEM: EXTRACT STRUCTURED DATA
    Database

  22. COMPANY
    COMPANY
    MONEY
    INVESTOR
    “Hooli raises $5m to
    revolutionize search,
    led by ACME Ventures”
    5923214
    1681056
    CLASSIC NLP PROBLEM: EXTRACT STRUCTURED DATA
    Database
    named entity recognition

  23. COMPANY
    COMPANY
    MONEY
    INVESTOR
    “Hooli raises $5m to
    revolutionize search,
    led by ACME Ventures”
    5923214
    1681056
    CLASSIC NLP PROBLEM: EXTRACT STRUCTURED DATA
    Database
    named entity recognition
    entity disambiguation

  24. COMPANY
    COMPANY
    MONEY
    INVESTOR
    “Hooli raises $5m to
    revolutionize search,
    led by ACME Ventures”
    5923214
    1681056
    CLASSIC NLP PROBLEM: EXTRACT STRUCTURED DATA
    Database
    named entity recognition
    entity disambiguation
    custom database lookup

  25. COMPANY
    COMPANY
    MONEY
    INVESTOR
    “Hooli raises $5m to
    revolutionize search,
    led by ACME Ventures”
    5923214
    1681056
    CLASSIC NLP PROBLEM: EXTRACT STRUCTURED DATA
    Database
    named entity recognition
    entity disambiguation
    custom database lookup
    currency normalization

  26. COMPANY
    COMPANY
    MONEY
    INVESTOR
    “Hooli raises $5m to
    revolutionize search,
    led by ACME Ventures”
    5923214
    1681056
    CLASSIC NLP PROBLEM: EXTRACT STRUCTURED DATA
    Database
    named entity recognition
    entity disambiguation
    custom database lookup
    currency normalization
    entity relation extraction
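
The steps named on this slide can be sketched end to end on the example headline. This is a toy, rule-based illustration and not the approach used in the talk: the gazetteer stands in for a trained named entity recognizer, and the assignment of the two database IDs to Hooli and ACME Ventures is an assumption made for the example (the slide shows the IDs but not which company each belongs to).

```python
import re
from dataclasses import dataclass

# Hypothetical knowledge base. The two IDs appear on the slide, but which
# company each one belongs to is an assumption made for this example.
COMPANY_DB = {"Hooli": 5923214, "ACME Ventures": 1681056}

MULTIPLIERS = {"k": 1_000, "m": 1_000_000, "bn": 1_000_000_000}
MONEY_RE = re.compile(r"\$(\d+(?:\.\d+)?)(k|m|bn)?\b", re.IGNORECASE)


@dataclass
class FundingEvent:
    company_id: int
    investor_id: int
    amount_usd: int


def find_companies(text):
    """Entity recognition stand-in: gazetteer match, ordered by position."""
    found = [(text.find(name), name) for name in COMPANY_DB if name in text]
    return [name for _, name in sorted(found)]


def normalize_money(text):
    """Currency normalization: '$5m' becomes 5000000."""
    match = MONEY_RE.search(text)
    if match is None:
        return None
    value, suffix = match.groups()
    return int(float(value) * MULTIPLIERS.get((suffix or "").lower(), 1))


def extract_funding(text):
    """Relation extraction stand-in: '<company> raises ... led by <investor>'."""
    companies = find_companies(text)
    amount = normalize_money(text)
    if len(companies) != 2 or amount is None or "led by" not in text:
        return None
    company, investor = companies
    # Disambiguation and database lookup collapse into one dict access here.
    return FundingEvent(COMPANY_DB[company], COMPANY_DB[investor], amount)


headline = "Hooli raises $5m to revolutionize search, led by ACME Ventures"
event = extract_funding(headline)
print(event)
```

Each function maps to one label on the slide, which is the point of the decomposition: any single step can later be swapped for a stronger component without touching the others.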

  27. VISION #1: dialogue is all you need
    LLM
    user
    actions or information
    natural language input

  28. VISION #1: dialogue is all you need
    LLM
    user
    actions or information
    natural language input
    LLM is the system and needs to manage the whole interaction

  29. VISION #2: prompting is all you need
    LLM
    text
    prompt
    system
    user
    structured data

  30. VISION #2: prompting is all you need
    LLM
    text
    prompt
    system
    user
    structured data
    LLM replaces the specific ML model

  31. VISION #3: modern practical NLP
    developer
    code
    LLM
    training data
    system
    user
    structured data
    ML system

  32. VISION #3: modern practical NLP
    developer
    code
    LLM
    training data
    system
    user
    structured data
    ML system
    LLM helps with building the pipeline

  35. UNDERSTANDING NLP TASKS
    Generative
      single/multi-doc summarization
      problem solving
      paraphrasing
      reasoning
      style transfer
      question answering
    Predictive
      text classification
      entity recognition
      relation extraction
      grammar & morphology
      semantic parsing
      coreference resolution
      discourse structure

  36. LLMS VS. TASK-SPECIFIC MODELS
    Text Classification
    [Chart: accuracy (y-axis, 65 to 100) on % of examples (x-axis: 1%, 5%, 10%, 20%, 50%, 100%) for SST2, AG News and Banking77, with a GPT-3 baseline]
    Explosion (2023), to be released

  40. LLMS VS. TASK-SPECIFIC MODELS
    CoNLL 2003 NER
                       F-Score   Speed (words/s)
    GPT-3.5 [1]        78.6      < 100
    GPT-4 [1]          83.5      < 100
    spaCy              91.6      4,000
    Flair              93.1      1,000
    SOTA 2023 [2]      94.6      1,000
    SOTA 2003 [3]      88.8      > 20,000
    1. Ashok and Lipton (2023), 2. Wang et al. (2021), 3. Florian et al. (2003)
    Callouts: SOTA on few-shot prompting; RoBERTa-base
    Text Classification
    [Chart: accuracy (y-axis, 65 to 100) on % of examples (x-axis: 1%, 5%, 10%, 20%, 50%, 100%) for SST2, AG News and Banking77, with a GPT-3 baseline]
    Explosion (2023), to be released
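
For readers unfamiliar with the F-Score column, the sketch below shows how an entity-level F-score is commonly computed. It assumes the usual exact-match definition (a predicted span counts only if start, end and label all match a gold span); the slide does not state the scoring details, and the spans here are made up for illustration.

```python
def precision_recall_f1(gold_spans, predicted_spans):
    """Exact-match scores over (start, end, label) spans."""
    gold, pred = set(gold_spans), set(predicted_spans)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up annotations: one span is correct, one has the wrong label,
# and one gold span is missed entirely.
gold = [(0, 6, "LOC"), (21, 27, "LOC"), (30, 35, "ORG")]
pred = [(0, 6, "LOC"), (21, 27, "ORG")]

p, r, f = precision_recall_f1(gold, pred)
print(f"P={p:.2f} R={r:.2f} F={f:.2f}")
```

Because a span with the right boundaries but the wrong label counts as both a false positive and a false negative, small labeling inconsistencies are penalized twice under this metric.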

  41. Large Language Model
    in-context learning
    knows a lot about what the text means
    doesn’t really know what you want it to do

  42. ⚙ Task-Specific Model
    fine-tuning BERT etc.
    knows less about what the text means
    can encode exactly what you want it to do
    Large Language Model
    in-context learning
    knows a lot about what the text means
    doesn’t really know what you want it to do

  43. ⚙ Task-Specific Model
    fine-tuning BERT etc.
    knows less about what the text means
    can encode exactly what you want it to do
    Large Language Model
    in-context learning
    knows a lot about what the text means
    doesn’t really know what you want it to do
    developer

  44. ⚙ Task-Specific Model
    fine-tuning BERT etc.
    knows less about what the text means
    can encode exactly what you want it to do
    Large Language Model
    in-context learning
    knows a lot about what the text means
    doesn’t really know what you want it to do
    developer
    prompt engineering

  45. ⚙ Task-Specific Model
    fine-tuning BERT etc.
    knows less about what the text means
    can encode exactly what you want it to do
    Large Language Model
    in-context learning
    knows a lot about what the text means
    doesn’t really know what you want it to do
    developer
    prompt engineering
    problem definition

  46. ⚙ Task-Specific Model
    fine-tuning BERT etc.
    knows less about what the text means
    can encode exactly what you want it to do
    Large Language Model
    in-context learning
    knows a lot about what the text means
    doesn’t really know what you want it to do
    developer
    prompt engineering
    data annotation
    problem definition

  47. ⚙ Task-Specific Model
    fine-tuning BERT etc.
    knows less about what the text means
    can encode exactly what you want it to do
    Large Language Model
    in-context learning
    knows a lot about what the text means
    doesn’t really know what you want it to do
    developer
    prompt engineering
    data annotation
    model training
    problem definition

  48. ⚙ Task-Specific Model
    fine-tuning BERT etc.
    knows less about what the text means
    can encode exactly what you want it to do
    Large Language Model
    in-context learning
    knows a lot about what the text means
    doesn’t really know what you want it to do
    developer
    prompt engineering
    data annotation
    evaluation
    model training
    problem definition

  49. ⚙ Task-Specific Model
    fine-tuning BERT etc.
    knows less about what the text means
    can encode exactly what you want it to do
    Large Language Model
    in-context learning
    knows a lot about what the text means
    doesn’t really know what you want it to do
    developer
    prompt engineering
    data annotation
    evaluation
    model training
    production
    problem definition

  50. NLP IN THE AGE OF LLMS
    SQL is all you need
    dialogue is all you need
    lots of humans is all you need
    prompting is all you need
    modern practical NLP

  51. NLP IN THE AGE OF LLMS
    SQL is all you need
    dialogue is all you need
    lots of humans is all you need
    prompting is all you need
    modern practical NLP
    structured data

  52. NLP IN THE AGE OF LLMS
    SQL is all you need
    dialogue is all you need
    lots of humans is all you need
    prompting is all you need
    modern practical NLP
    structured data
    humans in the loop

  53. NLP IN THE AGE OF LLMS
    SQL is all you need
    dialogue is all you need
    lots of humans is all you need
    prompting is all you need
    modern practical NLP
    structured data
    fast prototyping
    humans in the loop

  54. NLP IN THE AGE OF LLMS
    SQL is all you need
    dialogue is all you need
    lots of humans is all you need
    prompting is all you need
    modern practical NLP
    structured data
    fast prototyping
    humans in the loop
    powered by open source

  55. NLP IN THE AGE OF LLMS
    SQL is all you need
    dialogue is all you need
    lots of humans is all you need
    prompting is all you need
    modern practical NLP
    structured data
    fast prototyping
    humans in the loop
    powered by open source
    conversational and graphical interfaces

  56. LLM-POWERED NLP IN PRACTICE
    LLM-powered collaborative data development environment

  57. LLM-POWERED NLP IN PRACTICE
    LLM-powered collaborative data development environment
    Assign labeling tasks to LLMs

  58. LLM-POWERED NLP IN PRACTICE
    LLM-powered collaborative data development environment
    Assign labeling tasks to LLMs
    Review label decisions, correct errors

  59. LLM-POWERED NLP IN PRACTICE
    LLM-powered collaborative data development environment
    Assign labeling tasks to LLMs
    Review label decisions, correct errors
    Tune prompts and compare LLMs empirically

  60. LLM-POWERED NLP IN PRACTICE
    LLM-powered collaborative data development environment
    Assign labeling tasks to LLMs
    Review label decisions, correct errors
    Tune prompts and compare LLMs empirically
    Build data sets to train and evaluate efficient, production-ready pipelines

  61. PRODIGY.AI/FEATURES/LARGE-LANGUAGE-MODELS

  62. PRODIGY.AI/FEATURES/LARGE-LANGUAGE-MODELS
    correct mistakes

  63. PRODIGY.AI/FEATURES/LARGE-LANGUAGE-MODELS
    correct mistakes

  64. PRODIGY.AI/FEATURES/LARGE-LANGUAGE-MODELS
    correct mistakes
    add correct answer to prompt to tune it
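
One way to picture "add correct answer to prompt to tune it" is a prompt that grows a list of few-shot examples as a reviewer corrects the model. The sketch below is a generic illustration under that assumption; `PromptBuilder`, its methods and the prompt wording are hypothetical names for this example and are not Prodigy's API.

```python
from dataclasses import dataclass, field


@dataclass
class PromptBuilder:
    """Hypothetical helper: instructions plus reviewer-corrected examples."""

    instructions: str
    examples: list = field(default_factory=list)

    def add_correction(self, text, correct_answer):
        # Called after a human fixes a wrong model answer in review.
        self.examples.append((text, correct_answer))

    def render(self, text):
        parts = [self.instructions]
        for example_text, example_answer in self.examples:
            parts.append(f"Text: {example_text}\nAnswer: {example_answer}")
        # The new input goes last, with the answer left open for the model.
        parts.append(f"Text: {text}\nAnswer:")
        return "\n\n".join(parts)


builder = PromptBuilder("Label the text as POSITIVE or NEGATIVE.")
zero_shot = builder.render("Not bad at all")

# Suppose the model answered NEGATIVE and the reviewer corrected it.
builder.add_correction("Not bad at all", "POSITIVE")
few_shot = builder.render("Can't complain")
print(few_shot)
```

The corrected examples serve two purposes in this setup: they steer the next prompt, and they accumulate into a reviewed data set that can later train or evaluate a smaller model.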

  65. GITHUB.COM/EXPLOSION/SPACY-LLM
    TOWARDS STRUCTURED DATA
    Prompt Template
    LLM
    London is bigger than Berlin
    LOCATION: London, Berlin
    LOCATION
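
The flow on this slide (prompt template, LLM, free-text answer, labeled spans) can be sketched without any model at all. The template wording, `fake_llm` and `parse_response` below are assumptions for illustration and are not taken from spacy-llm; the stub simply returns the response shown on the slide.

```python
import re

# Hypothetical template; a real one would be tuned for the model in use.
PROMPT_TEMPLATE = (
    "Extract the entities from the text below.\n"
    "Answer with one line per label in the form LABEL: entity 1, entity 2\n"
    "Labels: {labels}\n"
    "Text: {text}\n"
)


def fake_llm(prompt):
    """Stand-in for a real model call; returns the answer from the slide."""
    return "LOCATION: London, Berlin"


def parse_response(text, response, labels):
    """Turn 'LABEL: a, b' lines into (start, end, label) character spans."""
    spans = []
    for line in response.splitlines():
        label, separator, values = line.partition(":")
        label = label.strip().upper()
        if not separator or label not in labels:
            continue  # ignore anything that does not follow the format
        for value in (v.strip() for v in values.split(",")):
            if not value:
                continue
            # Only keep answers that literally occur in the input text.
            for match in re.finditer(re.escape(value), text):
                spans.append((match.start(), match.end(), label))
    return sorted(spans)


text = "London is bigger than Berlin"
labels = ["LOCATION"]
prompt = PROMPT_TEMPLATE.format(labels=", ".join(labels), text=text)
spans = parse_response(text, fake_llm(prompt), labels)
print(spans)  # [(0, 6, 'LOCATION'), (22, 28, 'LOCATION')]
```

Aligning the answer back to character offsets is what makes the output structured: entities the model invents, or labels outside the schema, are dropped instead of silently passed downstream.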

  67. GITHUB.COM/EXPLOSION/SPACY-LLM
    unstructured text input
    structured Doc object

  68. GITHUB.COM/EXPLOSION/SPACY-LLM
    Named Entity Recognition
    Text Classification
    Relation Extraction
    Lemmatization
    unstructured text input
    structured Doc object

  69. GITHUB.COM/EXPLOSION/SPACY-LLM
    Named Entity Recognition
    Text Classification
    Relation Extraction
    Lemmatization
    unstructured text input
    structured Doc object
    LLM, ⚙ Supervised Model, ✍ Rules
    mix, match and replace techniques
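
"Mix, match and replace" works when every component, whatever technique is behind it, reads and writes the same structured object. The sketch below is a toy illustration of that idea: `Doc`, `Pipeline` and the three components are simplified stand-ins defined here, not spaCy or spacy-llm classes, and the "LLM" component is a stub with a canned score.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Doc:
    """Toy structured container shared by all components."""

    text: str
    ents: List[Tuple[int, int, str]] = field(default_factory=list)
    cats: Dict[str, float] = field(default_factory=dict)


Component = Callable[[Doc], Doc]


def rule_money_ner(doc: Doc) -> Doc:
    """Rules: regex-based MONEY entities."""
    for match in re.finditer(r"\$\d+(?:\.\d+)?[kmb]?\b", doc.text):
        doc.ents.append((match.start(), match.end(), "MONEY"))
    return doc


def llm_textcat_stub(doc: Doc) -> Doc:
    """Stand-in for an LLM-backed classifier (canned score, no model call)."""
    doc.cats["FUNDING"] = 0.9
    return doc


def keyword_textcat(doc: Doc) -> Doc:
    """Stand-in for a trained classifier (keyword heuristic here)."""
    doc.cats["FUNDING"] = 1.0 if "raises" in doc.text.lower() else 0.0
    return doc


class Pipeline:
    def __init__(self):
        self.components: List[Tuple[str, Component]] = []

    def add(self, name: str, component: Component) -> None:
        self.components.append((name, component))

    def replace(self, name: str, component: Component) -> None:
        for index, (existing, _) in enumerate(self.components):
            if existing == name:
                self.components[index] = (name, component)
                return
        raise KeyError(name)

    def __call__(self, text: str) -> Doc:
        doc = Doc(text)
        for _, component in self.components:
            doc = component(doc)
        return doc


nlp = Pipeline()
nlp.add("ner", rule_money_ner)
nlp.add("textcat", llm_textcat_stub)  # prototype with the LLM-style stub
prototype_doc = nlp("Hooli raises $5m to revolutionize search")

nlp.replace("textcat", keyword_textcat)  # swap in the task-specific stand-in
production_doc = nlp("Hooli raises $5m to revolutionize search")
print(production_doc.ents, production_doc.cats)
```

Code that consumes `doc.ents` and `doc.cats` does not change when a component is replaced, which is what lets a prototype built on prompts migrate piece by piece to smaller supervised models or rules.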

  70. EASIER ISN'T
    AMBITIOUS ENOUGH.
    Let’s not settle for systems that are
    worse than what we’ve been building.

  71. SPECIFIC IS BETTER.
    Task-specific models powered by LLMs

  72. SMALLER & FASTER IS BETTER.
    Task-specific models powered by LLMs

  73. PRIVATE IS BETTER.
    Task-specific models powered by LLMs

  74. BETTER IS BETTER.
    Task-specific models powered by LLMs

  75. THANK YOU!
    Explosion – explosion.ai
    spaCy – spacy.io
    Prodigy – prodigy.ai
    Twitter – @_inesmontani
    Mastodon – @[email protected]
    LinkedIn
