Model Selection

Introduction

There are many models available on CBorg, and it is not obvious which one is the best choice for you. This page provides some guidance.

Models for Coding

Model                        | Cost | Speed  | Comments
lbl/cborg-coder              | Free | Medium | Unlimited usage.
anthropic/claude-haiku       | $    | Fast   | Suggested use: adding documentation, simple tasks.
anthropic/claude-sonnet      | $$   | Fast   | Strong performance for everyday use. Supports computer use.
anthropic/claude-sonnet-high | $$$  | Medium | Better everyday performance, but watch the cost.
anthropic/claude-opus        | $$$  | Slow   | Very expensive with marginal benefit over Sonnet.
anthropic/claude-opus-high   | $$$$ | Slow   | Challenging tasks where other models fail.
openai/o4-mini-high          | $$   | Medium | Excellent performance at a very low cost.
openai/o3-high               | $$$  | Slow   | For challenging tasks (refactoring, math, etc.).
google/gemini-pro            | $$$  | Medium | Solid performance, moderate cost.
google/gemini-pro-high       | $$$$ | Slow   | For challenging tasks.

TLDR - Which model to use?

  • lbl/cborg-coder: Never need to worry about budget.
  • openai/o4-mini-high: Best performance-to-cost ratio; best daily driver for heavy use.
  • anthropic/claude-sonnet-high: Best daily driver for moderate use.
  • openai/o3-high: Most challenging reasoning tasks (math, refactoring, etc.).
  • google/gemini-pro-high: Alternative for challenging reasoning tasks.

Models for Technical Writing

Recommended Best Practices:

  • Style Emulation: Using a custom system prompt, provide a large corpus of sample writing and ask the model to emulate its style and language. This maintains style more effectively than general descriptions of the desired writing style.
  • Complex Document Construction: Use multiple prompts to build the document one paragraph or one section at a time. This scales better and maintains accuracy and tone throughout. A sketch of both practices follows below.

Best models: anthropic/claude-opus-high and openai/o3-high.
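
As a rough illustration of both practices, the sketch below assumes an OpenAI-compatible chat endpoint; the base URL, API key variable, and file names are placeholder assumptions, not documented CBorg values. The sample corpus rides in the system prompt, and the document is built one section at a time:

import os
from openai import OpenAI

# Placeholder settings -- substitute the actual CBorg endpoint and credentials.
client = OpenAI(
    base_url="https://api.cborg.example/v1",  # hypothetical base URL
    api_key=os.environ["CBORG_API_KEY"],      # assumed environment variable
)

# A large corpus of your own writing, used for style emulation.
sample_corpus = open("writing_samples.txt").read()

system_prompt = (
    "You are a technical writing assistant. Emulate the style and language "
    "of the following writing samples:\n\n" + sample_corpus
)

outline = [
    "Introduction: motivation for the project",
    "Methods: instrument setup and data collection",
    "Results: summary of key findings",
]

# Build the document one section at a time rather than in a single pass.
sections = []
for item in outline:
    response = client.chat.completions.create(
        model="anthropic/claude-opus-high",
        messages=[
            {"role": "system", "content": system_prompt},
            # Document so far comes first, instruction last.
            {"role": "user", "content": "Document so far:\n\n" + "\n\n".join(sections)
                                        + f"\n\nWrite the next section: {item}"},
        ],
    )
    sections.append(response.choices[0].message.content)

document = "\n\n".join(sections)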

Models for Data Extraction / Summarization

Recommended Best Practices:

  • Use a small, lightweight model and extract data one piece at a time.
  • Use chain-of-thought prompting to enhance accuracy.
  • Structure prompts with the data first and the question last, to leverage prompt caching. All three practices are demonstrated in the example code below.

Models for Data Extraction

Model                    | Cost | Speed     | Comments
lbl/cborg-chat           | Free | Medium    | Unlimited usage.
lbl/cborg-mini           | Free | Fast      | Unlimited usage, lower latency.
anthropic/claude-haiku   | $$   | Fast      | More expensive than other lightweight models.
google/gemini-flash      | $    | Fast      | Strong performance, low cost.
google/gemini-flash-lite | $    | Very Fast | Low-latency applications.
openai/gpt-4.1-mini      | $$   | Fast      | Good performance, supports implicit prompt caching.
openai/gpt-4.1-nano      | $    | Very Fast | Low latency.

Example Code (Not a working example - just a starting point):

import json

def ask_cborg(query):
    # Helper function to send a query to a model and return its text response.
    # See other example code to implement this; a rough sketch follows below.
    ...

# The source document to extract data from (load it however suits your workflow).
document = open("document.txt").read()

# One entry per field to extract; each value is a natural-language description.
fields = {
    'project': 'the name of the project',
    'instrument': 'the name of the scientific instrument',
    'status': 'the status of the scientific instrument',
}

data = {}

for key, query in fields.items():
    # Data first, question last: the repeated preamble can benefit from prompt caching.
    data[key] = json.loads(ask_cborg(
        "Read the following document and then respond to the question below.\n"
        "[[ * Start of Document * ]]\n"
        + document +
        "\n[[ * End of Document * ]]\n"
        f"What is {query}? First answer the question in natural language, "
        "describing how it relates to the document, then provide a concise "
        "final answer.\n"
        'Use the following JSON template: '
        '{ "answer": "Your answer in natural language", "final_answer": "The final answer" }\n'
        "Only output the JSON object without any additional commentary."
    ))['final_answer']
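
For completeness, here is a minimal sketch of what ask_cborg might look like, assuming CBorg exposes an OpenAI-compatible chat endpoint; the base URL and environment variable are placeholder assumptions, not confirmed values:

import os
from openai import OpenAI

# Placeholder settings -- substitute the actual CBorg endpoint and credentials.
client = OpenAI(
    base_url="https://api.cborg.example/v1",  # hypothetical base URL
    api_key=os.environ["CBORG_API_KEY"],      # assumed environment variable
)

def ask_cborg(query):
    # Send a single-turn query to a small, low-cost model and return the text reply.
    response = client.chat.completions.create(
        model="lbl/cborg-mini",  # a small model from the table above keeps cost low
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content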

Looping over each key as shown in the extraction example works better than attempting to extract a large number of items in a single query.

This approach also scales to a very large number of items without loss in accuracy.

Using a small model is recommended to reduce cost. Prompt caching can also reduce cost, since the preamble of each query is repeated verbatim; a variation that makes this explicit is sketched below.
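
Because the document preamble is identical in every request, hoisting it out of the loop makes the repetition explicit, and providers with implicit prompt caching (such as the gpt-4.1 family in the table above) can reuse the cached prefix. This sketch reuses json, document, fields, data, and ask_cborg from the example above:

# Build the shared document preamble once; only the question varies per query.
preamble = (
    "Read the following document and then respond to the question below.\n"
    "[[ * Start of Document * ]]\n"
    + document +
    "\n[[ * End of Document * ]]\n"
)

for key, query in fields.items():
    # Identical prefix on every request -> eligible for implicit prompt caching.
    data[key] = json.loads(ask_cborg(
        preamble +
        f"What is {query}? First answer the question in natural language, "
        "describing how it relates to the document, then provide a concise "
        "final answer.\n"
        'Use the following JSON template: '
        '{ "answer": "Your answer in natural language", "final_answer": "The final answer" }\n'
        "Only output the JSON object without any additional commentary."
    ))['final_answer']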