
Customising LLMs#

See also: AI

There appear to be two broad approaches: fine-tuning and embedding, with embedding seemingly the more accessible of the two. LangChain is a framework designed to help implement embeddings using different LLMs, including (apparently) local LLMs. LangChain and the like help with creating the underlying orchestration engine for LLM interactions.

Resources

Frameworks#

| Framework | Code requirements | Description |
|-----------|-------------------|-------------|
| LangChain | Pro-code | |
| Dust | ?? | Low-code GUI for configuring and chaining blocks |
| Steamship | ?? | Combine prompts, prompt chains, and Python code into a managed API |
| Retune | | Simpler; focuses on prompt and chatbot session management and on creating fine-tuned models |

Other tools#

  • tiktoken - tokenise and quantify prompts, from OpenAI
  • llamabot - class hierarchy for creating bots (GitHub)

    I think llamabot can help facilitate experimentation and prototyping by making some repetitive things invisible.

Examples#

LLM Embedding and Fine Tuning#

Both fine-tuning and embeddings have challenges.

Fine-tuning concentrates on teaching the model new tasks via transfer learning. Semantic embeddings instead convert a text's meaning into a numerical representation, which can then be used in tasks such as semantic search and information retrieval.

Summary#

Fine-tuning GPT-3/3.5/4

  1. Teaches new tasks or patterns
  2. Originally created for image models, now applies to NLP tasks
  3. Used for classification, sentiment analysis, and named entity recognition
  4. Does not teach new information, only new tasks
  5. Prone to confabulation and hallucination
  6. Expensive, slow, and difficult to implement
  7. Not scalable for large datasets
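
Because fine-tuning teaches a task through example pairs, most of the practical work is preparing those pairs. The following is a minimal sketch of writing training data for a sentiment classifier as JSONL. The `prompt`/`completion` field names follow the layout used by OpenAI's legacy fine-tuning endpoint; chat models and other providers use different schemas, so treat the field names as an assumption. The example sentences and the `to_jsonl` helper are invented for illustration.

```python
import json

# Hypothetical labelled examples for a sentiment classification task.
examples = [
    ("The checkout flow was quick and painless.", "positive"),
    ("Support never answered my ticket.", "negative"),
    ("The parcel arrived on Tuesday.", "neutral"),
]


def to_jsonl(pairs):
    """Turn (text, label) pairs into one JSON object per line."""
    lines = []
    for text, label in pairs:
        record = {
            # A fixed suffix marks where the prompt ends and the label begins.
            "prompt": f"{text}\n\nSentiment:",
            # Assumed convention: the completion starts with a space.
            "completion": f" {label}",
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)


training_file = to_jsonl(examples)
print(training_file)
```

Note that every record demonstrates the same pattern (text in, label out). Nothing here gives the model new facts, which matches point 4 above.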

Embedding & Semantic Search

  1. Also known as neural search or vector search
  2. Adds to the LLM's knowledge base
  3. Uses semantic embeddings to represent text meaning
  4. Scales well, fast, and cost-effective
  5. Searches based on context and topic, not just keywords
  6. Easily updates with new information
  7. Solves half of the QA problem by retrieving relevant information

Comparing Fine-tuning and Semantic Search

Fine-tuning

  1. Slow, difficult, and expensive
  2. Prone to confabulation
  3. Teaches new tasks, not new information
  4. Requires constant retraining
  5. Not ideal for QA tasks

Semantic Search

  1. Fast, easy, and cheap
  2. Recalls exact information
  3. Easy to add new information
  4. Scalable and efficient
  5. Solves half of QA tasks by retrieving relevant documents

Document embeddings#

The source article provides an overview of a specific process, breaking it down and pointing to related software that can help. LangChain (a Python project that also has a JavaScript version) appears to be an attempt to provide a higher-order abstraction than rolling your own entirely.

This echoes the idea behind first-llm-api-experiments.

Say you have a website that has thousands of pages with rich content on financial topics and you want to create a chatbot based on the ChatGPT API that can help users navigate this content. You need a systematic approach to match users’ prompts with the right pages and use the LLM to provide context-aware responses. This is where document embeddings can help.

It presents the following high-level model, planning to use document embeddings to provide the context-aware information.

Which translates into this specific process:

  1. The user enters a prompt
  2. Create the embedding for the user prompt
  3. Search the embedding database for the document that is nearest to the prompt embedding
  4. Retrieve the actual text of the document
  5. Create a new prompt that includes the user’s question as well as the context from the document
  6. Give the newly crafted prompt to the language model
  7. Return the answer to the user
  8. Bonus: provide a link to the document where the user can further obtain information
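
The steps above can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: `embed` is a toy bag-of-words counter standing in for a real embedding model, `fake_llm` is a placeholder for the call to a language model API, and the page URLs and texts are invented. The comments map each line back to the step numbers.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical pages; a real site would have thousands.
DOCUMENTS = {
    "/pages/index-funds": "Index funds track a market index and usually charge low fees.",
    "/pages/mortgages": "A fixed rate mortgage keeps the same interest rate for the whole loan.",
    "/pages/pensions": "Pension contributions are often matched by your employer.",
}

# The "embedding database": computed once, ahead of time, keyed by URL.
INDEX = {url: embed(text) for url, text in DOCUMENTS.items()}


def fake_llm(prompt: str) -> str:
    """Placeholder for a real language model call."""
    return f"(model answer based on a prompt of {len(prompt)} characters)"


def answer(user_prompt: str) -> dict:
    # Step 2: embed the user prompt.
    query_vec = embed(user_prompt)
    # Step 3: find the document whose embedding is nearest to the prompt.
    best_url = max(INDEX, key=lambda url: cosine(query_vec, INDEX[url]))
    # Step 4: retrieve the actual text of that document.
    context = DOCUMENTS[best_url]
    # Step 5: build a new prompt from the question plus the context.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context: {context}\n\nQuestion: {user_prompt}"
    )
    # Step 6: give the crafted prompt to the language model.
    reply = fake_llm(prompt)
    # Steps 7 and 8: return the answer along with a link to the source.
    return {"answer": reply, "source": best_url, "prompt": prompt}


# Step 1: the user enters a prompt.
result = answer("What interest rate does a fixed mortgage have?")
print(result["source"])
```

Swapping `embed` for a real embedding model and `INDEX` for a vector database changes the quality and scale of the search, but not the shape of the flow.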

Document embeddings#

A list of numbers (numerical vector) representing the features of some information.

Process

  1. Generate embeddings from your documents

    Options for generating an embedding include

  2. Store the embeddings.

    1. Faiss (a library with Python bindings)
    2. Pinecone (online)
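
To make the storage step concrete, here is a minimal sketch of what a vector store does, written in plain Python rather than against Faiss or Pinecone. `TinyVectorStore` and its methods are hypothetical names for this illustration, not part of either product. It normalises vectors on insert so that a dot product equals cosine similarity, searches by brute force, and persists to a JSON file. Dedicated stores exist to do the same job efficiently at much larger scale.

```python
import json
import math
import os
import tempfile


def _unit(vector):
    """Scale a vector to unit length; reject zero vectors."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("zero vectors cannot be normalised")
    return [x / norm for x in vector]


class TinyVectorStore:
    """Brute-force in-memory store; an illustrative stand-in only."""

    def __init__(self):
        self.ids = []
        self.vectors = []

    def add(self, doc_id, vector):
        # Store unit vectors so a dot product equals cosine similarity.
        self.ids.append(doc_id)
        self.vectors.append(_unit(vector))

    def search(self, query, k=1):
        """Return the k most similar (doc_id, score) pairs."""
        q = _unit(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), doc_id)
            for doc_id, vec in zip(self.ids, self.vectors)
        ]
        scored.sort(reverse=True)
        return [(doc_id, score) for score, doc_id in scored[:k]]

    def save(self, path):
        with open(path, "w", encoding="utf-8") as fh:
            json.dump({"ids": self.ids, "vectors": self.vectors}, fh)

    @classmethod
    def load(cls, path):
        store = cls()
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
        store.ids, store.vectors = data["ids"], data["vectors"]
        return store


# Hand-written 3-dimensional vectors; real embeddings have far more dimensions.
store = TinyVectorStore()
store.add("doc-a", [1.0, 0.0, 0.0])
store.add("doc-b", [0.0, 1.0, 0.0])
store.add("doc-c", [0.7, 0.7, 0.0])

path = os.path.join(tempfile.mkdtemp(), "embeddings.json")
store.save(path)

reloaded = TinyVectorStore.load(path)
top = reloaded.search([0.9, 0.1, 0.0], k=2)
print([doc_id for doc_id, _ in top])
```

Saving and reloading matters because embeddings are usually generated once and then queried many times.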