dspy.LocalLLM

dspy.LocalLLM(
    object=None,
    model='LocalLLM',
    model_type='chat',
    temperature=0.0,
    max_tokens=1000,
    cache=True,
    trace=True,
    **kwargs,
)

Create a local LLM object that you can use with dspy.

Parameters

| Name | Type | Description | Default |
|-------------|--------|-----------------------------------------------------------------------------------------------------|-------------|
| object | Llama | An object of class `llama_cpp.llama.Llama`, as returned by `Llama` from the llama-cpp-python package | `None` |
| model | str | A name that you provide for the model | `'LocalLLM'` |
| model_type | str | The type of model, e.g. 'chat' or 'responses'. Currently only tested with 'chat' | `'chat'` |
| temperature | float | The model temperature to use | `0.0` |
| max_tokens | int | Maximum number of tokens to generate | `1000` |
| cache | bool | Use the cache to avoid recomputing the same LLM call twice | `True` |
| trace | bool | Log internal calls | `True` |
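A minimal sketch of constructing a `LocalLLM` with every parameter spelled out. It assumes a model has already been downloaded and wrapped in a `llama_cpp.llama.Llama` object, as in the Examples section below; the model name `"my-local-model"` is an arbitrary label, not a required value.

```python
import dspy
from llama_cpp import Llama
from localllm import LocalLLM, localllm_download_model

# Download weights and build the llama.cpp transformer
# (same model as in the Examples section below).
path = localllm_download_model("gemma-3-270m-it-Q8_0")
transformer = Llama(model_path=path, verbose=False)

# Construct the LocalLLM with explicit parameters.
lm = LocalLLM(
    object=transformer,       # the llama_cpp.llama.Llama instance
    model="my-local-model",   # arbitrary display name
    model_type="chat",        # the only tested model type
    temperature=0.0,
    max_tokens=1000,
    cache=True,               # reuse results for identical calls
    trace=True,               # log internal calls
)
dspy.configure(lm=lm)
```

Since `cache=True`, repeating the same call with the same inputs returns the cached completion instead of re-running inference.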

Returns

| Name | Type | Description |
|------|-------------|----------------------------------|
| | dspy.BaseLM | An object of type `dspy.BaseLM` |

Examples

>>> import dspy
>>> from llama_cpp import Llama
>>> from localllm import LocalLLM, localllm_download_model
>>> path = localllm_download_model("gemma-3-270m-it-Q8_0", overwrite=True, trace=False)
>>> transformer = Llama(model_path=path, n_gpu_layers=-1, flash_attn=False, n_ctx=32768, n_threads=1, seed=4321, verbose=False)
>>> out = transformer("How much is 4x4?")
>>>
>>> dspy.configure(lm=LocalLLM(transformer))
>>> class Go(dspy.Signature):
...     sentence: str = dspy.InputField(desc="A question")
...     answer: str = dspy.OutputField(desc="A city name")
>>>
>>> model = dspy.Predict(Go)
>>> out = model(sentence="What is the capital of France?")
>>> out["answer"]
'Paris'