What are the two types of large language models (LLMs)
- Base LLM (predicts the next word)
- Instruction Tuned LLM (tries to follow instructions)
- RLHF: Reinforcement Learning with Human Feedback
Key Principles
- write clear and specific instructions
- use delimiters (backticks, quotation marks, slashes)
- ask for structured output (HTML, JSON)
- check whether conditions are satisfied
- few-shot prompting (Give successful examples of completing tasks)
- give the model time to think
- specify the steps to complete a task
- instruct the mode to work out its own solution before rushing to a conclusion
Main capabilities
- summarizing
- inferring
- transforming (translating, formatting)
- expanding
Example of a complex prompt (with delimiters)
Hallucinations
🤔 What are hallucinations? Makes statements that sound plausible but are not true.
🤔 How do you reduce hallucinations?
- first find relevant information and then answer the question based on the relevant information
Temperature
- the degree of exploration or randomness of the model
- at a higher temperature (a value between 0 and 1) it might choose one of the less likely following words
- with temperature 0, everytime
The Chat Format
user
message is the inputassistant
message is the outputsystem
role is a high level instructions for the conversation