Large Language Models

I started this document - section 1 - all me. I used AI to validate and make improvements and correct my public school education. I then fed the document to ChatGPT and asked it to finish section 2 for me. I had to ask it to change up the examples, but effectively I started the document and am / will use ChatGPT to speed the completion of it.

How They Work
1. Pattern Recognition
2. Tokenization
3. Attention
4. Context
5. Messages
Prompts
1. Markdown Formatting
2. Single Shot
3. Multi Shot
4. Resource Augmentation
Examples
1. Changing Tone
  - Problem: The following sounds offensive for a risk statement on a status deck.
    ""
2. Validation & Fixing
3. Code Generation
Programming with AI
1. REST APIs
2. Programming Libraries
Embeddings
1. Creation
2. Use
3. Resource Augmented Generation (RAG)
Training

1. How They Work

Artificial intelligence, specifically large language models, function as pattern recognition and prediction engines. They learn through training (recognizing patterns) and generate responses based on probabilities (prediction). The following is a bad example because it's biased toward U.S. citizens and those born and raised in the USA. If I ask you to complete "We the People..." you'll say "of the United States of America...". What just happened?

Bias Beware

Let's not lose focus on the "bias" the example just stated. When we say that AI picks up on bias, this is a perfect case. AI trained on American English will pick up on American bias. Many national constitutions express “We the People” in different ways, emphasizing the sovereignty of the people. India, South Africa, and Japan explicitly use “We the People,” while others (France, Germany) state the same idea in different words. This "bias" toward America can raise its head in many ways.

1.1. Pattern Recognition & Completion

Through repetition, you've identified a pattern that "We the People" is most likely followed by "of the United States of America" because of its repeated use in history. You were trained by repetition just like we train artificial intelligence models. But notice that we also asked you to complete the sentence—make a prediction about what comes next. This works through pattern recognition:

Example 1:
- User: Please complete "A, B, C, D..."
- Assistant: The most common completion of “A, B, C, D…” is the English alphabet sequence: E, F, G, H, I, J, K, ...
Example 2:
- User: Please complete "1, 2, 3, 4..."
- Assistant: The most common completion of “1, 2, 3, 4…” is the natural number sequence: 5, 6, 7, 8, 9, 10, ...
Example 3:
- User: Please complete "We the people..."
- Assistant: The full opening of the Preamble to the United States Constitution is: "We the People of the United States, in Order to form a more perfect Union, establish Justice, insure domestic Tranquility, provide for the common defence, promote the general Welfare, and secure the Blessings of Liberty to ourselves and our Posterity, do ordain and establish this Constitution for the United States of America."
Example 4:
- User: Why is the sky blue?
- Assistant: The sky appears blue because of Rayleigh scattering. When sunlight enters Earth’s atmosphere, it is made up of different colors (wavelengths). The shorter wavelengths (blue and violet) scatter more than the longer wavelengths (red, orange, yellow).

While the first two are clear patterns, and we showed that through repetition the association and pattern of the U.S. Constitution is established. The same happens with number 4—through reading and training on science material, it identified the patterns to formulate a response to the prompt. Completion refers to how an AI predicts and generates text based on prior input.

1.2. Tokenization

Attention considers the relationships between items. When a machine "reads" text, it breaks it into parts known as tokens. Early renditions broke text into tokens based on word boundaries, or whitespace boundaries. Consider the following:

"We the People of the United States, in Order to form a more perfect Union, establish Justice, insure domestic Tranquility, provide for the common defence, promote the general Welfare, and secure the Blessings of Liberty to ourselves and our Posterity, do ordain and establish this Constitution for the United States of America."

The words we see are: we, the, people, of, United, States, in, Order, to, form, a, more, perfect, Union, establish, justice... But we also need to account for the impact of punctuation—so those must be tokens as well because a comma (,) conveys different meaning than a period (.) in text. To make it more flexible and account for things it might not have encountered before, we need to further decompose the words.

Work
Working

This should be two tokens: work, and ing.

Simplicity
Simple
Simplification

We would again seek to tokenize in a way that gives better understanding. For example:

"simpl"
"icity"
"e"
"ification"

Where "simpl" implies the universal meaning of all the permutations of the word "simple." While "icity" is a state of being, "ification" is a process. Note that these new tokens have meaning in themselves, and the pattern can be applied to more than just "simpl."

In reality, once the logic has been applied, tokenization results in tokens averaging ~3-4 characters per token. Let's consider: "the car is next to the house in the garage."

The tokens would be: ["the", " ", "car", "is", "next", "to", "house", "in", "garage"]. The tokenization of the original string would be represented as the following token array: [0, 1, 2, 3, 4, 5, 6, 7, 8] assuming that arrays are 0-indexed. This array of tokens is what artificial intelligence (AI) is trained on—tokenized text. So an AI model is directly tied to the tokenization model. Do not use a tokenization engine or model that your AI model was not trained on—you'll get garbage.

1.3. Attention & Weights

When we consider language, the order and proximity of words to one another have an impact on each other. This relationship and impact on meaning is discussed as attention. For example:

"Red the dog sleeps."
"The red dog sleeps."

In one sentence, we have a dog named Red that's sleeping. In the other, we have a dog of red color that is sleeping. Note the proximity difference—an adjective precedes the noun. "Pronoun the proper noun" assigns an identifier.

This is how attention works—one token's relationship with another token directs meaning. Attention is augmented by the concept of "weight," where weight is how strong the relationship or attention is.

1.4. Context

Attention can be considered the impact of a token on more immediate tokens. But in the larger corpus of language and conversations, the overall meaning is related. For example:

"The bat flew out of the cave. Bats are amazing animals."

In these sentences, the context is clear—we are talking about a living thing—not an inanimate object made of wood. Now consider:

"The bat flew 100 feet. It broke into pieces!"

In isolation from the first example, it's clear by the context of the first sentence that "it" is a bat, and not a living animal. In AI, the context is the conversation history. This is important because the "context" is also the conversation memory or history. One important factor is that the more context we can provide, the more information we can provide, and the better response we can get—especially with complicated topics.

One thing to avoid is confusion:

"The bat flew out of the cave. It flew 100 feet before breaking into pieces."

To most people, they would be confused by that sentence.

Also, consider the context to be a storage limit. If the context is limited to 100 tokens, and a token is about 3 characters, then we can only send 300 characters of information. It would be like talking to someone with immediate amnesia.

Joe: What are you cooking, Sue?
Sue: Chicken noodle soup. Want to try some?
Joe: Sure! It needs more salt.
Sue: What needs more salt?
Joe: The soup does.
Sue: The soup does what?

By not having a large context, we are limited in the amount we can communicate. The more tokens our context can handle means we can feed it more information to complete our conversation. The following is the history of context sizes:

Year	Model	Context Window Size
2018	BERT, GPT-1	512 tokens
2019	GPT-2	1,024 tokens
2020	GPT-3	2,048 tokens
2023	GPT-3.5	4,096 tokens
2023	GPT-4	8,192 to 32,768 tokens
2023	Claude 2	100,000 tokens
2024	Claude 2.1	200,000 tokens
2024	GPT-4 Turbo	128,000 tokens
2024	Gemini 1.5	2,000,000 tokens
2024	Llama 3	128,000 tokens
2024	DeepSeek-V2	128,000 tokens
2024	Magic.dev’s LTM-2-Mini	100,000,000 tokens

When we get into resource-augmented generation, you'll see why context lengths are important. But note the growth in context size over the years—this could be driven by the fact that prompts and context will get you the results you want faster, easier, and with less money and energy than training will.

1.5. Messages

When you get into working with large language models, the model literally works to "complete" the text. But its training history used specific formats: this is the message formats the model was trained on. At a high level, a message is two parts: the prompt and the completion. The prompt was a message sent to the engine, and the completion is its response.

Message 1
- Role: User
- Content: Why is the sky blue?
Message 2
- Role: Assistant
- Content: The sky appears blue because of Rayleigh scattering. When sunlight enters Earth’s atmosphere, it is made up of different colors (wavelengths). The shorter wavelengths (blue and violet) scatter more than the longer wavelengths (red, orange, yellow).

The roles are defined by your engine while it was trained. Generally, there are three roles:

User - this is you
Assistant - this is the AI model
System - this provides the AI model directions on how to act and respond

Some models support system and some do not. The following user/assistant prompt can achieve the same result as a system message:

User: You are a pirate. Please state your responses like a sarcastic pirate.
Assistant: Arrr, ye scallywag! Ye be wantin’ me to yap like a sarcastic pirate, eh? Fine then, I’ll humor ye—though I doubt ye landlubbin’ brain can handle the depths o’ me wit. Now, what be yer next foolish request, ye barnacle-covered codfish? Speak up before I throw ye to the sharks!
User: Why is the sky blue?
Assistant: Arrr, ye really askin’ that, ye scurvy dog? Fine, I’ll humor yer landlubbin’ ignorance. The sky be blue ‘cause the sun’s light be scatterin’ off all them tiny air particles, aye. The short, puny blue waves get tossed about like a drunken sailor in a storm, makin’ the whole sky look blue to yer feeble eyes. But sure, keep thinkin’ it’s ‘cause the ocean reflects it—like some clueless deckhand who’s ne’er read a proper scroll in his life. Har har!

2. Prompts

Your model was trained on a large dataset. This training builds generalizations. Memorized facts are harder to establish and best provided in reference resources at the time they're needed—unless it's a well-known number like "pi." Training is expensive: it requires lots of memory and compute time. It should not be your first choice at getting the specific results you need—most of the time, using a prompt designed to give you the response you want is more effective and costs less.

2.1. Markdown Formatting

Markdown is often used to provide structured input to language models. It helps maintain consistency in the formatting of prompts, making it easier for the model to parse and generate coherent responses. For example:

This document was originally composed in markdown. When ChatGPT, and most large language models, respond, they themselves do so in markdown. The user interface will render that in a pretty format for human consumption.

When you are providing more complicated directions, you can make numbered steps as a numbered list in markdown. When you are giving code examples, use a fenced code block. By doing this, you increase the quality of the response by communicating more accurately.

2.2. Single Shot

In single-shot learning, the model is given a single example or instance to generate a response. This is useful when you want the model to understand a task from a limited example rather than a full dataset.

User: What is the capital of France?
Assistant: The capital of France is Paris.

2.3. Multi Shot

In multi-shot learning, the model is provided with multiple examples to learn how to perform a task. This helps in giving the model more context and understanding of what is expected.

User: Create a "Hello, World!" program in Python.
Assistant:
```
print("Hello, World!")
```
User: Update the program to accept "World" as a variable input from the user.

Assistant:

user_input = input("Enter a name: ")
print(f"Hello, {user_input}!")

The power of a multi shot prompt is that they can build on each other, or build up a context that can be later referenced.

2.4. Resource Augmentation

Resource augmentation refers to the process of providing additional resources or context to the model to improve its performance. This can be done by enriching the input with supplementary data, references, or even external databases to help the model generate more accurate and context-aware responses.

Example:

User: Provide a summary of the Tokyo population from this article:
```
Article content in markdown format
```
Assistant: After retrieving and analyzing the article, the assistant provides a detailed summary of the population trends, key data points, and any related information provided in the article.

3. Examples

I'll give some real world examples.

3.1. Reviewing and Editing

While creating this document I made the table of contents and then wrote section one. This laid the foundation for diction and style. Once I was done with section 1, I fed the document to ChatGPT as follows:

User: The following is a document I am working on:
```
The markdown contents goes here
```
Update this document correcting spelling, grammar, and markdown formatting. Provide the output as a download.
Assistant: Download link
User: Please make a numbered list of recommendations.
Assistant: The following changes are recommended:
1. Description of change and reason.
2. Description of change and reason.
3. Description of change and reason.
User: Apply changes 1 and 3. Provide the output as a download.
Assistant: Download link
User: Please provide a draft for section 2. For each of the subsections, please provide examples of user and assistant messages.
Assistant: Download link

By working you prompts in this way, we are using a multi shot resource augmented generation approaches.

3.2. Changing Tone

I was working on a project risk register adding the following:

Business units maybe unable/unwilling to change their policies or procedures to accommodate our objectives.

The problem is dissemination documents stating that business units, co workers, may be unwilling or unable. While the concern is valid, the wording and tone needs softened as we need to build bridges and not walls. To this effect, we used the following prompts to get to an acceptable result:

User: I need to change "Business units maybe unable/unwilling to change their policies or procedures to accommodate our objectives." to something softer and less offensive.
Assistant: You could say: "Business units may face challenges in adapting their policies or procedures to align with our objectives."

3.3. Code Reviews

User: Please provide your code review of the following. Number all items so I can pick and choose by number.
Assistant: The following are my thoughts:
1. Description and explanation of first option
2. Description and explanation of second option
3. Description and explanation of third option
User: Apply changes 1 and 3. Provide the output as a download.
Assistant: Download link

3.4. Code Generation

This starts to get much more complicated. When I run my code generator, it uses something around 60K tokens. That's about 240K of code. Here's the prompt chain used in Aici to generate it's self. One thing to note is that this is in JSON format matching the message arrays used by LLM REST APIs.

[
  {
    "role": "user",
    "content": "
        <!! PROJECT EXAMPLE_SQL ~/database/0.0.1/tables/passwords.sql !!/>
        <!! PROJECT EXAMPLE_DTO ~/common/src/tre/models/PasswordDto.ts !!/>
        
        The `%EXAMPLE_DTO%` was created as a data transfer object for `%EXAMPLE_SQL%`.
        The language is TypeScript.
        The data transfer object will serve as an interface for a TypeORM entity.
        
        <!! PROJECT MY_SQL %MY_SQL% !!/>
        
        Create a data transfer object for `%MY_SQL%` like we did for `%EXAMPLE_SQL%` and `%EXAMPLE_DTO%`.
        
        The file you need to create will be saved to `~/common/src/app/models/`.
        Use double quotes (\\\") instead of single quotes ('') for string literals.
        
        %SAVE_PROMPT%
    "
  },
  {
    "role": "assistant",
    "content": "Save: MY_DTO"
  },
  {
    "role": "user",
    "content": "
        <!! PROJECT EXAMPLE_ENTITY ~/backend/src/tre/data/PasswordEntity.ts !!/>
        
        The `%EXAMPLE_ENTITY%` entity was created for `%EXAMPLE_DTO%`.
        
        Create an entity for `%MY_DTO%` like we did for `%EXAMPLE_ENTITY%`.
        
        The file you need to create will be saved to `~/backend/src/app/data/`.
        
        Use double quotes (\\\") instead of single quotes ('') for string literals.
        
        %SAVE_PROMPT%
    "
  },
  {
    "role": "assistant",
    "content": "Save: MY_ENTITY"
  },
  {
    "role": "user",
    "content": "
        <!! PROJECT EXAMPLE_REPO ~/backend/src/tre/data/PasswordRepository.ts !!/>
        
        The repository in `%EXAMPLE_REPO%` was created for `%EXAMPLE_ENTITY%`.
        
        The file you need to create will be saved to `~/backend/src/app/data/`.
        
        Use double quotes (\\\") instead of single quotes ('') for string literals.
        
        %SAVE_PROMPT%
    "
  },
  {
    "role": "assistant",
    "content": "Save: MY_REPO"
  },
  {
    "role": "user",
    "content": "
        <!! PROJECT EXAMPLE_REPO ~/backend/src/tre/services/PasswordService.ts !!/>
        
        The backend service `%EXAMPLE_BE_SERVICE%` was created for `%EXAMPLE_ENTITY%`.
        
        Create a backend service for `%MY_ENTITY%`.
        
        The file you need to create will be saved to `~/backend/src/app/services/`.
        
        Import `BaseService` from \"../../tre/services/BaseService\".
        Import `Logger` from \"../../tre/Logger\".
        
        Use double quotes (\\\") instead of single quotes ('') for string literals.
        
        %SAVE_PROMPT%
    "
  },
  {
    "role": "assistant",
    "content": "Save: MY_BE_SERVICE"
  },
  {
    "role": "user",
    "content": "
        <!! PROJECT EXAMPLE_BE_TESTS ~/backend/src/tre/tests/services/PasswordService.test.ts !!/>
        
        The jest end to end tests `%EXAMPLE_BE_TESTS%` were created for `%EXAMPLE_BE_SERVICE%`.
        
        Create jest end to end tests for `%MY_BE_SERVICE%`.
        
        The file you need to create will be saved to `~/backend/src/app/tests/services/`.
        
        Use double quotes (\\\") instead of single quotes ('') for string literals.
        
        %SAVE_PROMPT%
    "
  },
  {
    "role": "assistant",
    "content": "Save: MY_BE_TESTS"
  },
  {
    "role": "user",
    "content": "
        <!! PROJECT EXAMPLE_FE_SERVICE ~/frontend/src/tre/services/PasswordService.ts !!/>
        
        The frontend service `%%` was created to mirror the backend service `%%`.
        
        Create a front end service for `%%`.
        
        The file you need to create will be saved to `~/frontend/src/app/services/`.
        
        Import `FetchWrapper` from \"../../tre/services/FetchWrapper\".
        
        Use double quotes (\\\") instead of single quotes ('') for string literals.
        
        %SAVE_PROMPT%
    "
  },
  {
    "role": "assistant",
    "content": "Save: MY_FE_SERVICE"
  },
  {
    "role": "user",
    "content": "
        <!! PROJECT MY_WEBPACK ~/frontend/webpack.config.js !!/>
        
        The webpack config `%MY_WEBPACK%` contains reference to the pages that use `%EXAMPLE_FE_SERVICE%` service.
        
        Update `%MY_WEBPACK%` to include pages for `%MY_FE_SERVICE%`
        
        When adding to the webpack config:
        - if the page is a list then use the plural form of the service name
        - if the page is edit then use the singular form of the service name
        - the list page is provided for all services through the generic 'lists' entry
        - the list edit page is 'list'
        
        Only add the edit entry, not the list entry.
        
        The app-specific pages and template will be stored in directory `./src/app/`.
        
        Do not add additional comments.
        Do not reformat the code.
        Do not remove any tsx pages from entries.
        
        Use double quotes (\\\") instead of single quotes ('') for string literals.
        
        %SAVE_PROMPT%
    "
  },
  {
    "role": "assistant",
    "content": "Save: MY_WEBPACK"
  },
  {
    "role": "user",
    "content": "
        <!! PROJECT MY_BE_INDEX ~/backend/src/index.ts !!/>
        
        The backend application starting point `%MY_BE_INDEX%` includes references to `%EXAMPLE_BE_SERVICE%` backend service.
        
        Update `%MY_BE_INDEX%` to include the `%MY_BE_SERVICE%` service.
        
        Do not add additional comments.
        Do not reformat the code.
        You may remove blocks of multiple blank lines, but leave at least one.
        
        Use double quotes (\\\") instead of single quotes ('') for string literals.
        
        %SAVE_PROMPT%
    "
  },
  {
    "role": "assistant",
    "content": "Save: MY_BE_INDEX"
  },
  {
    "role": "user",
    "content": "
        <!! PROJECT MY_EDS ~/backend/src/app/data/EntitiesDataSource.ts !!/>
        
        The entities data source `%MY_BE_INDEX%` includes references to `%EXAMPLE_ENTITY%` entity.
        
        Update `%MY_EDS%` to include the `%MY_ENTITY%` entity.
        
        Do not add additional comments.
        Do not reformat the code.
        You may remove blocks of multiple blank lines, but leave at least one.
        
        Use double quotes (\\\") instead of single quotes ('') for string literals.
        
        %SAVE_PROMPT%
    "
  },
  {
    "role": "assistant",
    "content": "Save: MY_EDS"
  },
  {
    "role": "user",
    "content": "
        <!! PROJECT MY_SECURABLES ~/database/0.1.0/data/securables.sql !!/>
        
        Update the securables SQL file to include the securables referenced in `%MY_BE_SERVICE%`.
        
        Do not add additional comments.
        Do not reformat the code.
        You may remove blocks of multiple blank lines, but leave at least one.
        
        Use double quotes (\\\") instead of single quotes ('') for string literals.
        
        %SAVE_PROMPT%
    "
  },
  {
    "role": "assistant",
    "content": "Save: MY_SECURABLES"
  }
]

Let pause to explain this. Aici runs this prompt chain and read the project files from your hard drive. It will replace all <!! PROJECT NAME FILE !!/> tags with the code for that file in the following format:

File `FILE`:

` ` ` 
the files contents as escaped markdown
` ` `

And every instance of %NAME% in the messages will be replaced with FILE so we reference or code by file names. The `%SAVE_PROMPT% variables will be replaced with:


Do not provide an explanation.
Provide the output in the following format:

File name `~/folder/file.ext`:

` ` `
// code goes here
` ` `

This will create an output that when the assistance responds, we'll parse the filename and contents to save to disk. We can also reference the saved file contents by the variable name given so that the filename is injected for the generated content. This allows us to build up a context building files on top of files based upon reference implementations.