Prompt Engineering
www.kaggle.com
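On April 11, 2025, Google's Lee Boonstra published a 69-page white paper on prompt engineering.
It consists of a brief introduction to prompt engineering, LLM output controls, prompting techniques, and best practices.
Having skimmed it, I think the level is right for people with no prior knowledge of prompt engineering.
What caught my attention, though, is that the white paper describes lessons learned from actually doing prompt engineering with Gemini at Google, so it shows which techniques and know-how are used in practice.
In particular, it discloses the LLM settings used for each technique and goal, which should be genuinely useful in real work.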
LLM output configuration
Output length
Reducing the LLM's output length does not make the model write more concisely; it only makes the model stop predicting tokens once the limit is reached.
So if you want short output, you also need to engineer the prompt to ask for it.
Sampling controls (based on Gemini)
An LLM does not formally predict a single token. It predicts probabilities for what the next token could be, so every token in its vocabulary gets a probability.
Temperature
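Temperature controls the degree of randomness in token selection.
- A lower temperature leads to more deterministic responses.
- A higher temperature leads to more diverse and unpredictable results.
- A temperature of 0 (greedy decoding) always selects the single most probable token. (If two or more tokens tie for the highest probability, the output can still vary depending on how the tie is broken.)
The Gemini temperature control can be understood in a similar way to the softmax function.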
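To make the softmax analogy concrete, here is a minimal sketch of temperature-scaled softmax. The logits are made-up numbers, and treating temperature 0 as a pure argmax is an assumption for illustration, not a description of Gemini's internals.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        # Assumption: treat 0 as greedy decoding, all mass on the top-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three tokens
for t in (0.1, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

At a low temperature almost all probability goes to the first token, while a higher temperature spreads it more evenly across the three.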
Top-K and Top-P
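Top-K and Top-P are sampling settings that restrict the next token to those with the highest predicted probabilities.
Top-K: samples from the K most probable tokens in the model's predicted distribution.
- A higher value gives more creative and varied output.
- A lower value gives more factual and restrained output.
- A value of 1 is equivalent to greedy decoding.
Top-P: samples from the most probable tokens whose cumulative probability does not exceed P.
- Values range from 0 (greedy decoding) to 1 (every token in the vocabulary).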
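The two filters can be sketched as follows. This is an illustrative implementation of the definitions above, not Gemini's actual code: the probabilities are invented, the top token is always kept so that P = 0 degenerates to greedy decoding, and some real implementations also keep the token that crosses the P threshold.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Keep tokens in descending order while the cumulative probability stays within p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        # The small epsilon guards against floating-point drift at p = 1.
        if keep and cumulative + probs[i] > p + 1e-12:
            break
        keep.append(i)
        cumulative += probs[i]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

probs = [0.5, 0.3, 0.15, 0.05]  # hypothetical next-token distribution
print(top_k_filter(probs, 2))
print(top_p_filter(probs, 0.9))
```

With these numbers both calls keep tokens 0 and 1, because adding the third token would push the cumulative probability past 0.9.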
Putting it all together
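The LLM settings affect one another.
Here is how they apply in combination:
- temperature, top-K and top-P: only tokens that satisfy both top-K and top-P become candidates, and temperature is applied when sampling from those candidates.
- temperature with only one of top-K or top-P: the same, except that only the one available setting filters the candidates.
- top-K and/or top-P without temperature: a token is picked at random from the candidates that meet the top-K and/or top-P criteria.
Setting one of them to an extreme value overrides the others or makes them irrelevant.
- temperature 0: top-K and top-P are ignored and the single most probable token is chosen.
- top-K 1: temperature and top-P are ignored, because only the one token that passes the top-K criterion can be chosen.
- top-P 0 or very small: temperature and top-K are ignored, because typically only the most probable token passes the top-P criterion.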
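The interaction described above can be sketched as a single sampling function. It follows the order given in the text (filter by top-K and top-P, then apply temperature to the survivors); real inference stacks may apply these steps in a different order, and the function name and logits are assumptions made for this example.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Filter candidates by top-K and top-P, then sample among them with temperature."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if temperature == 0:
        return order[0]  # greedy: top-K and top-P become irrelevant
    m = logits[order[0]]
    # Base distribution (temperature 1) used only for the top-P cutoff.
    base = {i: math.exp(logits[i] - m) for i in order}
    total = sum(base.values())
    probs = {i: b / total for i, b in base.items()}
    candidates = order[:top_k] if top_k else order
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in candidates:
            if kept and cumulative + probs[i] > top_p + 1e-12:
                break
            kept.append(i)
            cumulative += probs[i]
        candidates = kept
    # Temperature reshapes the weights of the surviving candidates.
    weights = [math.exp((logits[i] - m) / temperature) for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.1]  # hypothetical scores
print(sample_next_token(logits, temperature=0))             # always token 0
print(sample_next_token(logits, temperature=0.9, top_k=1))  # always token 0
print(sample_next_token(logits, temperature=0.9, top_p=0))  # always token 0
```

All three calls return the same token, which mirrors the point about extreme values: once one setting leaves a single candidate, the others have nothing left to influence.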
The white paper suggests the following starting points.
- Moderately creative: temperature 0.2, top-P 0.95, top-K 30
- Highly creative: temperature 0.9, top-P 0.99, top-K 40
- Less creative: temperature 0.1, top-P 0.90, top-K 20
- A single correct answer: temperature 0
Prompting Techniques
The white paper presents the following prompting techniques.
Most of them are well known, but step-back prompting deserves particular attention here.
Zero-Shot prompting
A technique that provides only an instruction, with no examples.
Name | 1_1_movie_classification | ||
Goal | Classify movie reviews as positive, neutral or negative. | ||
Model | gemini-pro | ||
Temperature | 0.1 | Token Limit | 5 |
Top-K | N/A | Top-P | 1 |
Prompt | Classify movie reviews as POSITIVE, NEUTRAL or NEGATIVE. Review: "Her" is a disturbing study revealing the direction humanity is headed if AI is allowed to keep evolving, unchecked. I wish there were more movies like this masterpiece. Sentiment: |
||
Output | POSITIVE |
One-Shot prompting
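A technique that gives the model a single example to refer to.
Few-Shot prompting
A technique that gives the model several examples to refer to.
The number of examples needed for few-shot prompting depends on the complexity of the task, the quality of the examples, and the capability of the model.
As a rule of thumb, use at least three to five examples.
One small mistake can confuse the model and produce unexpected output, so the examples should be diverse and of high quality.
To get output that is robust to a variety of inputs, it also helps to include edge cases in the examples.
Edge cases are inputs that are unusual or unexpected but that the model should still be able to handle.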
Goal | Parse pizza orders to JSON | ||
Model | gemini-pro | ||
Temperature | 0.1 | Token Limit | 250 |
Top-K | N/A | Top-P | 1 |
Prompt | Parse a customer's pizza order into valid JSON: EXAMPLE: I want a small pizza with cheese, tomato sauce, and pepperoni. JSON Response: ``` { "size": "small", "type": "normal", "ingredients": [["cheese", "tomato sauce", "peperoni"]] } ``` EXAMPLE: Can I get a large pizza with tomato sauce, basil and mozzarella { "size": "large", "type": "normal", "ingredients": [["tomato sauce", "bazel", "mozzarella"]] } Now, I would like a large pizza, with the first half cheese and mozzarella. And the other tomato sauce, ham and pineapple. JSON Response: |
||
Output | ``` { "size": "large", "type": "half-half", "ingredients": [["cheese", "mozzarella"], ["tomato sauce", "ham", "pineapple"]] } ``` |
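A few-shot prompt like the one above is usually assembled from a list of examples rather than written by hand. The sketch below shows one way to do that; `build_few_shot_prompt` is a hypothetical helper, and serializing the examples with `json.dumps` guarantees that every example in the prompt is valid JSON.

```python
import json

def build_few_shot_prompt(instruction, examples, query):
    """Join the instruction, the worked examples, and the new input into one prompt."""
    parts = [instruction, ""]
    for text, parsed in examples:
        parts += ["EXAMPLE:", text, "JSON Response:", json.dumps(parsed, indent=2), ""]
    parts += [query, "JSON Response:"]
    return "\n".join(parts)

examples = [
    ("I want a small pizza with cheese, tomato sauce, and pepperoni.",
     {"size": "small", "type": "normal",
      "ingredients": [["cheese", "tomato sauce", "pepperoni"]]}),
    ("Can I get a large pizza with tomato sauce, basil and mozzarella",
     {"size": "large", "type": "normal",
      "ingredients": [["tomato sauce", "basil", "mozzarella"]]}),
]
prompt = build_few_shot_prompt(
    "Parse a customer's pizza order into valid JSON:",
    examples,
    "Now, I would like a large pizza, with the first half cheese and mozzarella. "
    "And the other tomato sauce, ham and pineapple.",
)
print(prompt)
```

Keeping the examples in a data structure also makes it easy to add edge cases or swap examples while testing.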
System prompting
Sets overall instructions for the model, such as the required output format.
Goal | Classify movie reviews as positive, neutral or negative. | ||
Model | gemini-pro | ||
Temperature | 1 | Token Limit | 5 |
Top-K | 40 | Top-P | 0.8 |
Prompt | Classify movie reviews as positive, neutral or negative. Only return the label in uppercase. Review: "Her" is a disturbing study revealing the direction humanity is headed if AI is allowed to keep evolving, unchecked. It's so disturbing I couldn't watch it. Sentiment: |
||
Output | NEGATIVE |
Goal | Classify movie reviews as positive, neutral or negative, return JSON. | ||
Model | gemini-pro | ||
Temperature | 1 | Token Limit | 1024 |
Top-K | 40 | Top-P | 0.8 |
Prompt | Classify movie reviews as positive, neutral or negative. Return valid JSON: Review: "Her" is a disturbing study revealing the direction humanity is headed if AI is allowed to keep evolving, unchecked. It's so disturbing I couldn't watch it. Schema: ``` MOVIE: { "sentiment": String "POSITIVE" | "NEGATIVE" | "NEUTRAL", "name": String } MOVIE REVIEWS: { "movie_reviews": [MOVIE] } ``` JSON Response: |
||
Output | ``` { "movie_reviews": [ { "sentiment": "NEGATIVE", "name": "Her" } ] } ``` |
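One practical benefit of asking for JSON is that the response can be checked in code. The following sketch parses a response shaped like the output above and validates the sentiment labels against the schema's allowed values; `parse_movie_reviews` is a hypothetical helper, and the fence stripping assumes the model wraps its JSON in a Markdown code fence, as it does in this example.

```python
import json

FENCE = "`" * 3  # Markdown code fence that models often wrap JSON in
ALLOWED = {"POSITIVE", "NEGATIVE", "NEUTRAL"}

def parse_movie_reviews(raw):
    """Strip an optional code fence, parse the JSON, and validate each sentiment."""
    text = raw.strip()
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text.startswith("json"):
            text = text[len("json"):]  # drop a language tag after the fence
    reviews = json.loads(text)["movie_reviews"]
    for review in reviews:
        if review["sentiment"] not in ALLOWED:
            raise ValueError(f"unexpected sentiment: {review['sentiment']}")
    return reviews

raw = FENCE + '\n{ "movie_reviews": [ { "sentiment": "NEGATIVE", "name": "Her" } ] }\n' + FENCE
print(parse_movie_reviews(raw))
```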
Role prompting
Assigns a role to the model.
Examples of styles that have worked well:
Confrontational, Descriptive, Direct, Formal, Humorous, Influential, Informal,
Inspirational, Persuasive
Goal | Act as travel guide and provide 3 travel suggestions | ||
Model | gemini-pro | ||
Temperature | 1 | Token Limit | 1024 |
Top-K | 40 | Top-P | 0.8 |
Prompt | I want you to act as a travel guide. I will write to you about my location and you will suggest 3 places to visit near me. In some cases, I will also give you the type of places I will visit. My suggestion: "I am in Amsterdam and I want to visit only museums." Travel Suggestions: |
||
Output | 1. Rijksmuseum: World-renowned museum, known for Dutch masterpieces like Rembrandt’s “The Night Watch.” Don’t miss the incredible collection of paintings and artifacts showcasing Dutch art and history. 2. Van Gogh Museum: Immerse yourself in the captivating works of Vincent van Gogh. The museum houses the largest collection of his paintings and drawings, including “The Starry Night” and “Sunflowers.” 3. Stedelijk Museum Amsterdam: Discover modern and contemporary art from around the world. The museum’s collection includes works by Picasso, Kandinsky, and Rietveld, housed in a striking modern building. |
Contextual prompting
Provides the model with context.
Goal | Suggest articles for a blog about retro games | ||
Model | gemini-pro | ||
Temperature | 1 | Token Limit | 1024 |
Top-K | 40 | Top-P | 0.8 |
Prompt | Context: You are writing for a blog about retro 80's arcade video games. Suggest 3 topics to write an article about with a few lines of description of what this article should contain. |
||
Output | 1. **The Evolution of Arcade Cabinet Design:** This article would explore the evolution of arcade cabinet designs, from the early wood and metal cabinets of the 1970s to the sleek, neon-lit designs of the 1980s and beyond. 2. **Blast From The Past: Iconic Arcade Games of The 80’s:** - This article would feature a glimpse into the history, impact, and legacy of some of the most iconic and influential arcade games of the 1980s, highlighting their innovations, popular mechanics, and enduring charm. 3. **The Rise and Retro Revival of Pixel Art:** - This article would delve into the evolution of pixel art as a defining visual style of early arcade games, tracing its roots, exploring its techniques, and discussing the resurgence of pixel art in modern games and digital art. |
Step-back prompting
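The LLM is first asked a more general question related to the task, and its answer is then fed into the prompt for the final task to derive the answer.
This lets the LLM activate relevant background knowledge and reasoning before it attempts the final task.
A traditional prompt, for comparison: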
Goal | Write a storyline for a level of a first-person shooter video game. | ||
Model | gemini-pro | ||
Temperature | 1 | Token Limit | 1024 |
Top-K | 40 | Top-P | 0.8 |
Prompt | Write a one paragraph storyline for a new level of a first-person shooter video game that is challenging and engaging. |
||
Output | The level begins with the player’s squad being ambushed in a dense urban area by a relentless enemy faction. Players must fight their way through a maze of back alleys and dilapidated buildings, utilizing stealth and precision to eliminate enemy forces and avoid detection. The player must use their environment to their advantage, setting traps, utilizing cover, and making calculated moves to survive. Along the way, they uncover intel suggesting the enemy is planning a devastating attack, and the player must race against time to gather evidence and relay it back to their headquarters before the enemy can carry out their plans. |
Step-back: 1) First, pose a more general question
Prompt | Based on popular first-person shooter action games, what are 5 fictional key settings that contribute to a challenging and engaging level storyline in a first-person shooter video game? |
||
Output | 1. **Abandoned Military Base**: A sprawling, post-apocalyptic military complex crawling with mutated soldiers and rogue robots, ideal for challenging firearm combat. 2. **Cyberpunk City**: A neon-lit, futuristic urban environment with towering skyscrapers and dense alleyways, featuring cybernetically enhanced enemies and hacking mechanics. 3. **Alien Spaceship**: A vast alien vessel stranded on Earth, with eerie corridors, zero-gravity sections, and extraterrestrial creatures to encounter. 4. **Zombie-Infested Town**: A desolate town overrun by hordes of aggressive zombies, featuring intense close-quarters combat and puzzle-solving to find safe passage. 5. **Underwater Research Facility**: A deep-sea laboratory flooded with water, filled with mutated aquatic creatures, and requiring stealth and underwater exploration skills to survive. |
Step-back: 2) Then use the answer as context in the prompt for the final task
Prompt | Context: 5 engaging themes for a first person shooter video game: 1. **Abandoned Military Base**: A sprawling, post-apocalyptic military complex crawling with mutated soldiers and rogue robots, ideal for challenging firearm combat. 2. **Cyberpunk City**: A neon-lit, futuristic urban environment with towering skyscrapers and dense alleyways, featuring cybernetically enhanced enemies and hacking mechanics. 3. **Alien Spaceship**: A vast alien vessel stranded on Earth, with eerie corridors, zero-gravity sections, and extraterrestrial creatures to encounter. 4. **Zombie-Infested Town**: A desolate town overrun by hordes of aggressive zombies, featuring intense close-quarters combat and puzzle-solving to find safe passage. 5. **Underwater Research Facility**: A deep-sea laboratory flooded with water, filled with mutated aquatic creatures, and requiring stealth and underwater exploration skills to survive. Take one of the themes and write a one paragraph storyline for a new level of a first-person shooter video game that is challenging and engaging. |
||
Output | In the heart of a murky abyss, lies a dilapidated underwater research facility, standing as a testament to human ambition and its disastrous consequences. Shrouded in darkness, pulsating with the hum of malfunctioning machinery, and haunted by the echoes of drowned souls, the facility is now a breeding ground for a nightmare borne from the deep sea. The player, an elite marine equipped with advanced diving gear and experimental weaponry, is tasked with venturing into this aquatic realm of terror. Armed with resilience and resourcefulness, they must navigate treacherous corridors, solve cryptic puzzles, and confront gruesome sea monstrosities that lurk in the depths. From grappling with the crushing pressure of the deep sea to outsmarting cunning aquatic predators, every moment in this uncharted underworld tests the limits of human endurance and courage. |
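In code, step-back prompting is simply two model calls chained together. The sketch below assumes a `generate(prompt)` callable that wraps whatever LLM client you use; `fake_generate` is a stand-in so the flow can be run without a model.

```python
def step_back(generate, task, general_question):
    """Ask the general question first, then pass its answer as context for the task."""
    background = generate(general_question)            # step 1: general knowledge
    final_prompt = f"Context: {background}\n\n{task}"  # step 2: task with context
    return generate(final_prompt)

calls = []

def fake_generate(prompt):
    # Stand-in for a real LLM call; it records the prompt and returns a placeholder.
    calls.append(prompt)
    return f"<answer {len(calls)}>"

result = step_back(
    fake_generate,
    task="Take one of the themes and write a one paragraph storyline for a new "
         "level of a first-person shooter video game that is challenging and engaging.",
    general_question="Based on popular first-person shooter action games, what are "
                     "5 fictional key settings that contribute to a challenging and "
                     "engaging level storyline?",
)
print(calls[1])
```

The printed second prompt begins with the first call's answer, which is the whole trick: the model's own general answer becomes the context for the specific task.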
Chain of Thought (CoT)
A technique that makes the model reason step by step, using a phrase such as 'Let's think step by step.'
It can be combined with zero-shot, one-shot, and few-shot prompting.
Zero-shot:
Prompt | When I was 3 years old, my partner was 3 times my age. Now, I am 20 years old. How old is my partner? Let's think step by step. |
||
Output | 1. Let’s assume my current age is “x” years. 2. When I was 3 years old, my age was 3 years. 3. My partner’s age at that time was 3 times my age, which means my partner’s age was 3 * 3 = 9 years. 4. Now, I am 20 years old, which means my age has increased by 20 - 3 = 17 years since I was 3 years old. 5. Since my partner’s age was 9 years when I was 3 years old, and my age has increased by 17 years, my partner’s age must have also increased by 17 years. 6. Therefore, my partner’s current age is 9 + 17 = 26 years. So, my partner is 26 years old.. |
One-shot:
Prompt | Q: When my brother was 2 years old, I was double his age. Now I am 40 years old. How old is my brother? Let's think step by step. A: When my brother was 2 years, I was 2 * 2 = 4 years old. That's an age difference of 2 years and I am older. Now I am 40 years old, so my brother is 40 - 2 = 38 years old. The answer is 38. Q: When I was 3 years old, my partner was 3 times my age. Now, I am 20 years old. How old is my partner? Let's think step by step. A: |
||
Output | When I was 3 years old, my partner was 3 * 3 = 9 years old. That’s an age difference of 6 years and my partner is older. Now I am 20 years old, so my partner is 20 + 6 = 26 years old. The answer is 26. |
Self-consistency
A technique that runs the same prompt several times to produce multiple results, then takes a majority vote and uses the most common result as the final answer.
Prompt | EMAIL: ``` Hi, I have seen you use Wordpress for your website. A great open source content management system. I have used it in the past too. It comes with lots of great user plugins. And it's pretty easy to set up. I did notice a bug in the contact form, which happens when you select the name field. See the attached screenshot of me entering text in the name field. Notice the JavaScript alert box that I inv0k3d. But for the rest it's a great website. I enjoy reading it. Feel free to leave the bug in the website, because it gives me more interesting things to read. Cheers, Harry the Hacker. ``` Classify the above email as IMPORTANT or NOT IMPORTANT. Let's think step by step and explain why. |
||
Output Attempt 1 | **Step 1: Identify the purpose of the email.** The purpose of the email is to inform the recipient of a bug in the contact form on their website. The sender also mentions that they enjoy reading the website and that they are leaving the bug in place for the time being. **Step 2: Assess the potential impact of the bug.** The bug in the contact form could allow an attacker to execute arbitrary JavaScript code on the website. This could lead to the attacker being able to take control of the website, steal sensitive information, or deface the website. **Step 3: Consider the credibility of the sender.** The sender of the email claims to be a hacker named Harry. However, there is no way to verify the sender’s identity. It is possible that the sender is a legitimate hacker who is trying to help the recipient, but it is also possible that the sender is a malicious actor who is trying to trick the recipient into clicking on a malicious link or downloading a malicious file. **Conclusion: IMPORTANT** Based on the potential impact of the bug and the credibility of the sender, the email should be classified as **IMPORTANT**. The recipient should take immediate action to fix the bug and protect their website from attack. |
||
Output Attempt 2 | **Explanation:** 1. **Lack of urgency:** The email does not convey any sense of urgency or time-sensitivity. It starts with a casual greeting and presents the information in a matter-of-fact tone. 2. **Non-critical bug report:** The bug reported in the contact form is described as interesting but not critical. The email sender even suggests that the website owner can leave the bug in place. 3. **Lack of personal impact:** The email does not indicate any direct or personal impact on the website owner. It simply mentions that the sender enjoys reading the website. 4. **Absence of action request:** The email does not explicitly request any action from the website owner. It does not ask for the bug to be fixed or provide any specific suggestions for improvement. 5. **Sender’s intent:** The sender’s intention seems to be more focused on sharing an observation and expressing interest in the website rather than raising a serious concern or requesting assistance. **Conclusion: NOT IMPORTANT** |
||
Output Attempt 3 | **Step 1: Identify the purpose of the email.** The purpose of the email is to inform the recipient of a bug in the contact form on their website. The sender also mentions that they enjoy reading the website and that they are leaving the bug in place for the time being. **Step 2: Assess the potential impact of the bug.** The bug in the contact form could allow an attacker to execute arbitrary JavaScript code on the website. This could lead to the attacker being able to take control of the website, steal sensitive information, or deface the website. **Step 3: Consider the credibility of the sender.** The sender of the email claims to be a hacker named Harry. However, there is no way to verify the sender’s identity. It is possible that the sender is a legitimate hacker who is trying to help the recipient, but it is also possible that the sender is a malicious actor who is trying to exploit the bug for their own gain. **Conclusion:** Based on the above factors, the email should be classified as **IMPORTANT**. The bug in the contact form is a serious security risk, and the sender’s credibility is unknown. The recipient should take immediate action to fix the bug and |
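The voting step can be sketched in a few lines. The label extraction below is a simple assumption (take the last IMPORTANT or NOT IMPORTANT mentioned in each response), and the three response strings are shortened stand-ins for the attempts above.

```python
import re
from collections import Counter

def extract_label(response):
    """Return the last IMPORTANT / NOT IMPORTANT label in a response, or None."""
    matches = re.findall(r"NOT IMPORTANT|IMPORTANT", response)
    return matches[-1] if matches else None

def majority_vote(responses):
    """Pick the label that appears most often across the sampled responses."""
    labels = [label for label in map(extract_label, responses) if label]
    return Counter(labels).most_common(1)[0][0] if labels else None

responses = [
    "... **Conclusion: IMPORTANT** Based on the potential impact of the bug ...",
    "... **Conclusion: NOT IMPORTANT**",
    "... the email should be classified as **IMPORTANT**. The bug is a serious risk ...",
]
print(majority_vote(responses))  # IMPORTANT (2 votes to 1)
```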
Tree of Thoughts (ToT)
Similar in concept to CoT, but it explores multiple reasoning paths instead of a single one.
ReAct (reason & act)
Combines reasoning with external tools and agents, working through a thought-action loop.
> Entering new AgentExecutor chain...
Metallica has 4 members.
Action: Search
Action Input: How many kids does James Hetfield have?
Observation: three children
Thought: 1/4 Metallica band members have 3 children
Action: Search
Action Input: How many kids does Lars Ulrich have?
Observation: 3
Thought: 2/4 Metallica band members have 6 children
Action: Search
Action Input: How many kids does Kirk Hammett have?
Observation: Hammett has been married to his second wife Lani since 1998. They have two sons, Angel (b. September 29, 2006) and Vincenzo (b. June 28, 2008).
Thought: 3/4 Metallica band members have 8 children
Action: Search
Action Input: How many kids does Robert Trujillo have?
Observation: 2
Thought: 4/4 Metallica band members have 10 children
Final Answer: 10
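The loop behind a trace like this can be sketched without any agent framework. Everything here is a stand-in: `scripted_decide` plays the role of the LLM, `fake_search` plays the role of the search tool, and the numbers are the observations from the trace above. A real agent would call a model and a search API at those two points.

```python
def react_loop(decide, tools, question, max_steps=10):
    """Alternate model decisions and tool calls until a final answer appears."""
    history = []  # (action, action_input, observation) triples
    for _ in range(max_steps):
        step = decide(question, history)  # in practice, an LLM call
        if "final" in step:
            return step["final"], history
        observation = tools[step["action"]](step["input"])
        history.append((step["action"], step["input"], observation))
    raise RuntimeError("no final answer within max_steps")

# Observations taken from the trace above.
KIDS = {"James Hetfield": 3, "Lars Ulrich": 3, "Kirk Hammett": 2, "Robert Trujillo": 2}

def fake_search(name):
    return KIDS[name]

def scripted_decide(question, history):
    searched = {entry[1] for entry in history}
    remaining = [name for name in KIDS if name not in searched]
    if remaining:
        return {"action": "Search", "input": remaining[0]}
    return {"final": sum(entry[2] for entry in history)}

answer, trace = react_loop(
    scripted_decide, {"Search": fake_search},
    "How many kids do the band members of Metallica have?",
)
print(answer)  # 10
```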
Automatic Prompt Engineering
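A technique that automates prompt writing by having an LLM write the prompts.
The process is as follows:
- Write a prompt that generates a variety of candidate outputs.
- Evaluate the generated prompt candidates with a metric such as BLEU or ROUGE.
- Select the candidate with the highest score as the final prompt.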
Prompt | We have a band merchandise t-shirt webshop, and to train a chatbot we need various ways to order: "One Metallica t-shirt size S". Generate 10 variants, with the same semantics but keep the same meaning. |
||
Output | 1. I’d like to purchase a Metallica t-shirt in size small. 2. Can I order a small-sized Metallica t-shirt? 3. I’d like to buy one Metallica t-shirt in small. 4. One Metallica shirt, size small, please. 5. I want one Metallica t-shirt in size small. 6. I’d like to place an order for one Metallica t-shirt in small. 7. Small Metallica t-shirt, one please. 8. I’m looking for one Metallica t-shirt in small. 9. Can you get me one Metallica t-shirt in small? 10.One Metallica small t-shirt, please. |
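The evaluate-and-select steps can be sketched as below. The scoring function is a plain unigram-overlap F1, used here only as a rough stand-in for ROUGE-1; it is not the official metric, and a real pipeline would use a proper BLEU or ROUGE implementation and likely a task-specific reference.

```python
from collections import Counter

def unigram_f1(candidate, reference):
    """Token-overlap F1, a rough stand-in for ROUGE-1 (not the official metric)."""
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

def pick_best(candidates, reference):
    """Return the candidate with the highest score against the reference."""
    return max(candidates, key=lambda cand: unigram_f1(cand, reference))

reference = "One Metallica t-shirt size S"
candidates = [
    "One Metallica shirt, size small, please.",
    "I want one Metallica t-shirt in size small.",
    "Small Metallica t-shirt, one please.",
]
print(pick_best(candidates, reference))
```

Because this toy metric splits on whitespace only, punctuation attached to a word counts as a mismatch, which is one reason to prefer an established metric library in practice.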
Code prompting (writing/explaining/translating/debugging and reviewing)
Using prompts to write, explain, translate, debug, and review code.
Multimodal prompting
Giving the LLM not only text but also other inputs such as images, audio, and code.
Best Practices
1. Provide examples
2. Design with simplicity
Avoid difficult vocabulary that causes confusion. Leave out unnecessary information.
Examples
BEFORE:
I am visiting New York right now, and I'd like to hear more about great locations. I am with two 3 year old kids. Where should we go during our vacation?
AFTER REWRITE:
Act as a travel guide for tourists. Describe great places to visit in New York Manhattan with a 3 year old.
Examples of verbs that describe the action, for use in instructions to the model:
Act, Analyze, Categorize, Classify, Contrast, Compare, Create, Describe, Define, Evaluate, Extract, Find, Generate, Identify, List, Measure, Organize, Parse, Pick, Predict, Provide, Rank, Recommend, Return, Retrieve, Rewrite, Select, Show, Sort, Summarize, Translate, Write.
3. Be specific about the output
Examples:
DO: Generate a 3 paragraph blog post about the top 5 video game consoles. The blog post should be informative and engaging, and it should be written in a conversational style.
DO NOT: Generate a blog post about video game consoles.
4. Use Instructions over Constraints
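A growing body of research suggests that focusing on positive instructions is more effective than relying on constraints.
Positive instructions encourage flexibility and creativity within the given bounds, whereas constraints limit the model's potential.
Constraints are best reserved for preventing harmful or biased content, or for cases that need a strict output format or style.
Tell the model what to do rather than what not to do.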
DO: Generate a 1 paragraph blog post about the top 5 video game consoles. Only discuss the console, the company who made it, the year, and total sales.
DO NOT: Generate a 1 paragraph blog post about the top 5 video game consoles. Do not list video game names.
5. Control the max token length
"Explain quantum physics in a tweet length message."
6. Use variables in prompts
Using variables saves the time and effort of unnecessary repetition.
Prompt
VARIABLES
{city} = "Amsterdam"
PROMPT
You are a travel guide. Tell me a fact about the city: {city}
Output
Amsterdam is a beautiful city full of canals, bridges, and narrow streets. It’s a great place to visit for its rich history, culture, and nightlife.
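In application code, the same idea maps onto an ordinary string template. A minimal sketch using Python's built-in `str.format`; the second city is an extra value added here to show the reuse.

```python
PROMPT_TEMPLATE = "You are a travel guide. Tell me a fact about the city: {city}"

def render(template, **variables):
    """Fill the named placeholders in a prompt template."""
    return template.format(**variables)

for city in ("Amsterdam", "Seoul"):  # "Seoul" is an added illustration
    print(render(PROMPT_TEMPLATE, city=city))
```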
7. Experiment with input formats and writing styles
You need to experiment by varying factors such as style, word choice, and prompt type.
- Question: What was the Sega Dreamcast and why was it such a revolutionary console?
- Statement: The Sega Dreamcast was a sixth-generation video game console released by Sega in 1999. It...
- Instruction: Write a single paragraph that describes the Sega Dreamcast console and explains why it was so revolutionary.
8. For few-shot prompting with classification tasks, mix up the classes
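In few-shot classification, if the examples are presented in a fixed order instead of mixed, the model risks memorizing the order and overfitting to it.
A good starting point is six few-shot examples, measuring accuracy and testing from there.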
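Mixing the classes is easy to automate when the examples live in a list. The six reviews below are invented placeholders, and the fixed seed is only there to make the shuffle reproducible between test runs.

```python
import random

def mixed_examples(examples, seed=0):
    """Return a shuffled copy so classes do not appear in a fixed order."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled

# Hypothetical few-shot examples, grouped by class as they might be collected.
examples = [
    ("Loved every minute of it.", "POSITIVE"),
    ("A masterpiece.", "POSITIVE"),
    ("It was fine, nothing special.", "NEUTRAL"),
    ("Neither good nor bad.", "NEUTRAL"),
    ("I walked out halfway through.", "NEGATIVE"),
    ("A complete waste of time.", "NEGATIVE"),
]
for text, label in mixed_examples(examples):
    print(f"Review: {text}\nSentiment: {label}\n")
```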
9. Adapt to model updates
10. Experiment with output formats
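For non-creative tasks such as extracting, selecting, parsing, ordering, ranking, or classifying data, try JSON or XML as the output format.
Returning JSON has several advantages:
- It always returns in the same style.
- It focuses on the data you want to receive.
- It leaves less room for hallucination.
- It makes the model aware of relationships in the data.
- It gives you data types.
- It can be sorted.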
11. JSON Repair
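JSON output has its advantages, but it uses more tokens than plain text, so it costs more time and money. Because it takes up so much of the output window, the output can also be cut off or end up with missing or misplaced brackets.
In that case, the json-repair library can be used to fix the malformed output.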
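To show what such a repair involves, here is a toy version that handles only the most common failure, output cut off at the token limit. It is a standard-library sketch written for this post, not the json-repair library's algorithm, and it fails on inputs that end mid-key or right after a colon; the dedicated library covers far more cases.

```python
import json

def close_truncated_json(text):
    """Toy repair: close an unterminated string and any brackets left open."""
    stack, in_string, escaped = [], False, False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    repaired = text + ('"' if in_string else "")
    return repaired + "".join(reversed(stack))

truncated = '{"movie_reviews": [{"sentiment": "NEGATIVE", "name": "He'
print(json.loads(close_truncated_json(truncated)))
```

Note that the repaired value is still incomplete ("He" instead of "Her"); repair makes the output parseable, it does not recover lost tokens.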
12. Working with Schemas
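Structured JSON is useful not only as output but also as input.
A JSON Schema defines the expected structure and data types of the input.
Providing a schema gives the LLM a blueprint of the data, keeps it focused on the relevant information, and reduces the risk of misinterpreting the input. It also helps the model understand relationships between pieces of data and, through date or timestamp fields, be aware of time.
Example: having an LLM generate descriptions for products in an e-commerce catalog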
Prompt
{
"type": "object",
"properties": {
"name": { "type": "string", "description": "Product name" },
"category": { "type": "string", "description": "Product category" },
"price": { "type": "number", "format": "float", "description": "Product price" },
"features": {
"type": "array",
"items": { "type": "string" },
"description": "Key features of the product"
},
"release_date": { "type": "string", "format": "date", "description": "Date the product was released" }
}
}
Output
{
"name": "Wireless Headphones",
"category": "Electronics",
"price": 99.99,
"features": ["Noise cancellation", "Bluetooth 5.0", "20-hour battery life"],
"release_date": "2023-10-27"
}
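Before sending data like this to a model, it can be worth checking it against the schema. The sketch below is a deliberately tiny checker that only compares top-level property types and skips properties that are absent; it is an assumption-laden illustration, and a full JSON Schema validator would also handle `format`, nested items, and required fields.

```python
TYPE_MAP = {"string": str, "number": (int, float), "array": list, "object": dict}

def check_against_schema(data, schema):
    """Return a list of type problems for the top-level properties that are present."""
    problems = []
    for name, spec in schema["properties"].items():
        if name in data and not isinstance(data[name], TYPE_MAP[spec["type"]]):
            problems.append(f"wrong type: {name}")
    return problems

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "category": {"type": "string"},
        "price": {"type": "number"},
        "features": {"type": "array"},
        "release_date": {"type": "string"},
    },
}
product = {
    "name": "Wireless Headphones",
    "category": "Electronics",
    "price": 99.99,
    "features": ["Noise cancellation", "Bluetooth 5.0", "20-hour battery life"],
    "release_date": "2023-10-27",
}
print(check_against_schema(product, schema))  # []
```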
13. Experiment together with other prompt engineers
14. CoT Best practices
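CoT prompting is based on greedy decoding, which predicts the next word in the sequence as the one with the highest probability.
When reasoning is used to arrive at a final answer, there is usually a single correct answer, so the temperature should be set to 0.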
15. Document the various prompt attempts
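Document your prompt attempts in full detail, including what worked and what did not, so that the record is useful later.
Record the prompt version, the result, feedback, and so on.
When using RAG, also record what was inserted into the prompt (the query, chunk settings, chunk output, and so on).
Example prompt documentation template: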
Name | [name and version of your prompt] | ||
Goal | [One sentence explanation of the goal of this attempt] | ||
Model | [name and version of the used model] | ||
Temperature | [value between 0 - 1] | Token Limit | [number] |
Top-K | [number] | Top-P | [number] |
Prompt | [Write all the full prompt] | ||
Output | [Write out the output or multiple outputs] |