Building Applications with LLMs

A practical guide for developers

July 1, 2025 · 28 min read
llm · gpt · ai

Large Language Models (LLMs) have fundamentally changed how we think about building software. Yet despite all the hype, there's surprisingly little practical guidance on integrating LLMs into production applications.

What We'll Cover

This guide focuses on practical patterns: prompt engineering, context management, RAG architectures, and error handling. No theoretical fluff – just real-world techniques you can use today.

Understanding the LLM API

At its core, working with an LLM is straightforward: you send a prompt, you get a response. But the devil is in the details. Let's look at a basic integration:

basic-llm.js
import OpenAI from 'openai';

const openai = new OpenAI();

async function generateResponse(userMessage) {
  const completion = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [
      {
        role: 'system',
        content: 'You are a helpful assistant.'
      },
      {
        role: 'user',
        content: userMessage
      }
    ],
    temperature: 0.7,
    max_tokens: 1000
  });

  return completion.choices[0].message.content;
}

Prompt Engineering Patterns

The quality of your prompts directly determines the quality of your outputs. Here are some patterns that consistently produce better results:

structured-prompt.js
const systemPrompt = `
You are an expert code reviewer. Your task is to analyze code
and provide constructive feedback.

## Guidelines:
- Focus on correctness, performance, and readability
- Be specific about issues and provide examples
- Suggest improvements, don't just criticize
- Rate severity: Critical, Warning, or Suggestion

## Output Format:
Respond in JSON with this structure:
{
  "summary": "Brief overview",
  "issues": [
    {
      "severity": "Warning",
      "line": 42,
      "description": "...",
      "suggestion": "..."
    }
  ],
  "score": 85
}
`;

Pro Tip

Always specify your output format explicitly. JSON schemas, markdown templates, or clear examples help the LLM produce consistently structured responses.
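Even with an explicit schema, you should parse the model's reply defensively. Here's one way to do it, a sketch in which `parseReviewResponse` is a hypothetical helper; the required field names match the schema above, and the regex handles the common case where a model wraps JSON in markdown fences:

```javascript
// Hypothetical helper: parse and validate the model's JSON reply.
// Field names ("summary", "issues", "score") follow the schema above.
function parseReviewResponse(raw) {
  // Models sometimes wrap JSON in markdown code fences; strip them first.
  const cleaned = raw
    .replace(/^```(?:json)?\s*/i, '')
    .replace(/```\s*$/, '')
    .trim();

  let parsed;
  try {
    parsed = JSON.parse(cleaned);
  } catch (err) {
    throw new Error(`Model did not return valid JSON: ${err.message}`);
  }

  // Reject replies that parse but are missing required fields.
  for (const field of ['summary', 'issues', 'score']) {
    if (!(field in parsed)) {
      throw new Error(`Missing required field: ${field}`);
    }
  }
  return parsed;
}
```

When validation fails, a common follow-up is to re-prompt the model with the error message and ask it to fix its output.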

RAG: Retrieval-Augmented Generation

LLMs have knowledge cutoffs and can't access your specific data. RAG solves this by retrieving relevant context before generating a response:

rag-example.js
async function ragQuery(userQuestion) {
  // 1. Convert question to embedding
  const embedding = await getEmbedding(userQuestion);

  // 2. Search vector database for relevant docs
  const relevantDocs = await vectorDB.search(embedding, {
    topK: 5,
    threshold: 0.7
  });

  // 3. Build context from retrieved documents
  const context = relevantDocs
    .map(doc => doc.content)
    .join('\n\n');

  // 4. Generate response with context
  return generateResponse(`
    Context:
    ${context}

    Question: ${userQuestion}

    Answer based only on the provided context.
  `);
}
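The `getEmbedding` helper above is left abstract. A minimal sketch follows, assuming the `openai` client from the first example and using `text-embedding-3-small` as one common model choice (not the only option). The cosine similarity function shows the comparison most vector stores apply internally for the `threshold` parameter:

```javascript
// Sketch of the helper assumed above. Relies on the `openai` client
// created in the first example; the model name is one common choice.
async function getEmbedding(text) {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text
  });
  return response.data[0].embedding;
}

// Cosine similarity: the comparison most vector databases use when
// applying a score threshold like the 0.7 above.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```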

Handling Errors Gracefully

LLM APIs can fail for many reasons: rate limits, timeouts, invalid responses. Robust error handling is essential for production applications.
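Transient failures like rate limits usually resolve on their own, so retrying with exponential backoff is a good first line of defense. A minimal sketch, where `withRetry`, the attempt count, and the delays are illustrative rather than prescribed:

```javascript
// Retry a function with exponential backoff. Defaults are illustrative:
// 3 attempts, waiting 500ms, then 1000ms between them.
async function withRetry(fn, { maxAttempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        // Double the delay after each failed attempt.
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise(resolve => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

You'd wrap your LLM call like `withRetry(() => generateResponse(msg))`. In production you'd likely also check the error type, since retrying a 400 Bad Request is pointless while a 429 rate limit is a textbook retry case.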

Always Have a Fallback

Never let an LLM failure crash your application. Implement graceful degradation, retry logic, and meaningful error messages for users.
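One way to degrade gracefully is a thin wrapper that catches any failure from `generateResponse` (the function from the first example) and returns a canned message instead. `safeGenerate` is a hypothetical name for illustration:

```javascript
// Hypothetical wrapper: never let an LLM failure propagate to the user.
// Falls back to a canned message when generateResponse throws.
async function safeGenerate(userMessage, fallbackText) {
  try {
    return await generateResponse(userMessage);
  } catch (err) {
    // Log for observability, but show the user something meaningful.
    console.error('LLM call failed:', err.message);
    return fallbackText;
  }
}
```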
