
Contents
  • Coding Tools
  • Technical Writing
  • Data Extraction & Summarization

Getting Started with Coding Tools

  • VS Code with RooCode: Works well with CBorg API - powerful agentic features and easy to switch between models to control costs.
  • Aider CLI: An open-source CLI “agentic” tool that works well with CBorg Coder (lbl/cborg-coder) for unlimited free usage.
  • Codex CLI: Recommended to use with o4-mini (excellent cost-performance efficiency) or gpt-oss-120b (free to use).

Also Good with Caveats

  • VS Code with Continue: Works well for more targeted code edits (insertion, line editing). However, setup is more complicated.
  • Claude Code: Excellent overall, but very expensive. The “agentic” nature of this tool can lead to very high costs for simple operations. Lacks support for codebase indexing.
  • Gemini CLI: Excellent overall, but also very expensive to use when Gemini Pro is enabled. Reserve this model for high-complexity tasks.

Unsupported Tools

  • Cursor: Requires a subscription from Cursor for best results, which is not available through CBorg.
  • GitHub Copilot: Not supported by CBorg - you need a subscription from GitHub to use it.

Limitations of AI Coders

  • Knowledge Cutoff Date: Models are trained on a corpus of data that is typically 6-12 months old and cannot generate code for features or APIs newer than the model’s cutoff date.
  • Poor performance in scientific coding: Coding models are currently weak in most scientific and numerical programming tasks (see SciCode benchmark results).
  • Poor performance on new, obscure or unstable APIs: Models hallucinate frequently when generating code for obscure APIs, due to a lack of sufficient examples in their training data. Because of knowledge cut-off dates, models will also be unaware of API changes newer than 6-12 months. Unstable APIs with multiple incompatible versions may result in broken code generation. Consider augmenting your coder’s knowledge with access to the API documentation, e.g. using the Context7 MCP tool (a sketch of this approach follows this list).
  • Context Length Limit: Models can only ingest a finite amount of code at inference time, limiting their total field of view in large codebases. Indexing (vector search) can help, but does not guarantee that the coding agent sees all code relevant to a prompt. Models can also only generate a finite and usually relatively small amount of output code in each turn, limiting their ability to rewrite large sections of code. Using ‘diff’ edits enables working with large files, but imperfect diff matching can cause errors.
  • Context Rot: Models are easily overwhelmed when presented with multiple instructions in a prompt, and randomly “forget” to follow instructions, particularly when the context length becomes long. Keep prompts as simple as possible and do not attempt to perform multiple tasks in a single prompt.
  • Poor adherence to DRY principles: LLM-generated code tends to have a lower “information density” due to excessive repetition and underutilization of abstractions, leading to code that is less efficient to execute and more difficult to maintain.
  • Bias toward Legacy Solutions: LLMs tend to overuse legacy code libraries because these are better documented in the training data, causing underutilization of newer and better alternatives.
  • Low Quality of Tab-Completion: “Fill-in-the-middle” models used for tab-style autocomplete are extremely small and perform poorly on complex coding tasks; large models cannot be used for tab completion because of their high latency.
  • Potential Data Loss: AI coding tools can perform unexpected actions such as deleting code, corrupting files, breaking code or even deleting files. Always use a version control system and maintain backups of your code base to guard against potential data loss.
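
As a rough illustration of the documentation-augmentation mitigation mentioned above, the sketch below simply prepends current API documentation to a coding prompt; the docs file path and task are hypothetical placeholders, and an MCP tool such as Context7 automates the same idea:

# Sketch: prepend current API documentation to a coding prompt, so the model
# is not limited to its training-cutoff knowledge of the API.
# The docs file path and the task description are hypothetical placeholders.
from pathlib import Path

api_docs = Path("docs/my_api_reference.md").read_text()

prompt = (
    "Current documentation for the API you must use:\n\n"
    + api_docs
    + "\n\nUsing only the functions documented above, write a client that "
    + "uploads a file and polls for completion."
)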

Model Selection for Coding

Last updated: 2025-08-25

Recommended Choices

Model | Cost | Speed | Comments
lbl/cborg-coder | Free | Fast | Good performance and fast. Unlimited usage
google/gemini-pro-high | $$$ | Slow | Best for complex coding, scientific coding. Expensive - watch usage
anthropic/claude-sonnet-high | $$$ | Slow | Good for complex coding. Expensive. Recommended for Claude Code CLI
xai/grok-3-mini | $ | Medium | Solid performer and very low cost
gcp/qwen-3-coder* | $ | Medium | Excellent performance and low cost. See comments
openai/o3-high | $$$ | Slow | Good performance. Recommended model for OpenAI Codex CLI
openai/o4-mini-high | $$ | Slow | Good performance. Recommended model for OpenAI Codex CLI

Currently Not Recommended for Coding

Model | Cost | Speed | Comments
google/gemini-pro | $$$ | Medium | Recommended to use the -high variant instead
anthropic/claude-opus | $$$$ | Slow | Very expensive, with minimal improvements over Sonnet
anthropic/claude-opus-high | $$$$ | Slow | Very expensive, with minimal improvements over Sonnet
xai/grok-3 | $$ | Medium | Recommended alternative: xai/grok-3-mini, which has reasoning
openai/gpt-5 | $$ | Medium | Recommended alternatives: openai/o3-high or openai/o4-mini-high
lbl/cborg-chat | Free | Slow | Underperforms on coding tasks
azure/deepseek-r1* | $$$ | Slow | Good performance but excessive reasoning. See comments

Comments:

  • Qwen 3 Coder: The model is censored and may have weak guardrails. Use with caution.
  • Azure DeepSeek R1: The MSAI-DS-R1 model is fine-tuned by Microsoft AI to remove censorship and improve performance, but it generates excessive reasoning tokens on hard tasks, which hurts cost efficiency. It may have weak guardrails compared to alternatives. Use with caution.

Models for Advanced Technical Writing

Recommended Best Practices:

  • Style Emulation: Using a custom system prompt, provide a large corpus of sample writing and ask the model to emulate its style and language. This is more effective for maintaining style than general descriptions of the desired writing style.
  • Complex Document Construction: Use multiple prompts to construct the document one paragraph or one section at a time. This scales better and maintains accuracy and tone throughout. (A sketch of both practices follows this list.)
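
A minimal sketch of both practices, assuming an OpenAI-compatible client. The base URL, environment variable name, corpus file and section titles are illustrative assumptions - check the CBorg documentation for the actual endpoint:

# Sketch: style emulation via a system prompt holding a sample corpus,
# building the document one section at a time.
# Endpoint, env var, file name and section titles are illustrative assumptions.
import os
from pathlib import Path
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["CBORG_API_KEY"],   # assumed env var for your key
    base_url="https://api.cborg.lbl.gov",  # assumed CBorg endpoint
)

corpus = Path("writing_samples.txt").read_text()  # your own prior writing

system_prompt = (
    "You are a technical writer. Emulate the style, tone and vocabulary "
    "of the following writing samples as closely as possible:\n\n" + corpus
)

sections = ["Introduction", "Methods", "Results"]  # illustrative outline
draft = []
for section in sections:
    # One prompt per section keeps each request small and focused
    response = client.chat.completions.create(
        model="google/gemini-pro-high",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": f"Write the '{section}' section of the report."},
        ],
    )
    draft.append(response.choices[0].message.content)

print("\n\n".join(draft))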

Best models: google/gemini-pro-high, anthropic/claude-sonnet-high, openai/o3-high


Models for Data Extraction / Summarization

Recommended Best Practices:

  • Use a small / lightweight model and extract data one piece at a time.
  • Do not overload model instructions by asking too many questions at once - context rot will cause poor performance.
  • Use chain of thought prompting to enhance accuracy, or use a reasoning model.
  • Save the CoT result and/or reasoning content to improve transparency when debugging data extraction runs.
  • Add line numbers to your data, then ask the AI to include line-number references in its extraction, so you can debug/trace where it found information (see the sketch after this list).
  • Structure prompts with all bulk data and static content first and dynamic data/questions at the end. This leverages implicit prompt caching, resulting in faster inference and lower cost on endpoints that support it (all openai/ models).
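
The sketch below illustrates two of these practices together: numbering the source lines so answers can be traced, and placing the bulk document before the question so implicit prompt caching can reuse the shared prefix. The file name and question are illustrative:

# Sketch: line-number the source text, and put all static bulk content first
# with the per-query question last, so the shared prefix can be cached.
from pathlib import Path

document = Path("report.txt").read_text()  # illustrative source file

numbered = "\n".join(
    f"{n:>5}: {line}" for n, line in enumerate(document.splitlines(), start=1)
)

static_preamble = (  # identical across all queries, so it is cacheable
    "[[ * Start of Document * ]]\n" + numbered + "\n[[ * End of Document * ]]\n"
)

question = (  # varies per query, so it goes last
    "What is the name of the project? "
    "Cite the line numbers where you found the answer."
)

prompt = static_preamble + question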

Models for Data Extraction

Last updated: 2025-08-25

Recommended Selections

Model | Cost | Speed | Comments
lbl/cborg-deepthought | Free | Medium | Unlimited usage
lbl/cborg-mini | Free | Fast | Unlimited usage
lbl/cborg-vision | Free | Medium | For vision extraction - unlimited usage
openai/gpt-5-mini | $$ | Medium | Low cost
openai/gpt-5-nano | $ | Fast | Very low cost with good performance
aws/llama-4-maverick | $ | Medium | Very low cost with good performance and long context
google/gemini-flash-high | $$ | Medium | More expensive than alternatives - use if you need huge context

Not Currently Recommended

Model | Cost | Speed | Comments
google/gemini-flash | $ | Fast | Much better performance with reasoning (-high variant)
anthropic/claude-haiku | $$ | Medium | High cost and underperforming relative to other small models
openai/gpt-5 | $$$ | Medium | Cost-prohibitive for scaling up to large data extraction tasks

Example Code (Not a working example - just a starting point):

import json

def ask_cborg(query):
    # Helper function that sends a query to a model and returns the response text.
    # See other example code, or the sketch at the end of this section.
    ...

document = ...  # the source text to extract from

fields = {
    'project': 'The name of the project',
    'instrument': 'The name of the scientific instrument',
    'status': 'The status of the scientific instrument'
}

data = {}

for key, query in fields.items():
    # Ask one question per request; the preamble is identical each time,
    # so endpoints with implicit prompt caching can reuse it.
    data[key] = json.loads(ask_cborg(
        "Read the following document and then respond to the question below.\n" +
        "[[ * Start of Document * ]]\n" +
        document +
        "\n[[ * End of Document * ]]\n" +
        f"What is {query}? First answer the question in natural language, " +
        "describing how it relates to the document, then provide a concise final answer.\n" +
        'Use the following JSON template: ' +
        '{ "answer": "Your answer in natural language", "final_answer": "The final answer" }\n' +
        "Only output the JSON object without any additional commentary."
    ))['final_answer']

Looping over each field as shown above works better than attempting to extract a large number of items in a single query.

This approach can also be scaled to a very large number of items without any loss in accuracy.

Use of a small model is recommended to reduce cost. Prompt caching can also reduce cost as the preamble for each query is repeated.
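
For reference, here is a minimal sketch of the ask_cborg helper used above, assuming the openai Python package and an OpenAI-compatible endpoint. The base URL and environment variable name are assumptions - check the CBorg documentation for the actual values:

# Minimal sketch of ask_cborg, assuming an OpenAI-compatible endpoint.
# The base URL and environment variable name are assumptions.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["CBORG_API_KEY"],   # assumed env var for your key
    base_url="https://api.cborg.lbl.gov",  # assumed CBorg endpoint
)

def ask_cborg(query):
    # Send a single-turn query to a small model and return the response text.
    response = client.chat.completions.create(
        model="lbl/cborg-mini",  # a recommended small model from the table above
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content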