dspy.LocalLLM
dspy.LocalLLM(
    object=None,
    model='LocalLLM',
    model_type='chat',
    temperature=0.0,
    max_tokens=1000,
    cache=True,
    trace=True,
    **kwargs,
)
Create a local LLM object that you can use as a language model with dspy.
Parameters

| Name | Description | Default |
|------|-------------|---------|
| object | An object of class `llama_cpp.llama.Llama`, as returned by `Llama` from the llama-cpp-python package. | `None` |
| model | A name that you provide to the model. | `'LocalLLM'` |
| model_type | String with the type of model, e.g. `'chat'` or `'responses'`. Currently only tested with type `'chat'`. | `'chat'` |
| temperature | The model temperature to use. | `0.0` |
| max_tokens | Maximum number of tokens to generate. | `1000` |
| cache | Use the cache to avoid recomputing the same LLM call twice. | `True` |
| trace | Boolean indicating whether to log internal calls. | `True` |
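The `cache` flag means that repeated identical calls are answered from a cache rather than by re-running the model. As a rough conceptual sketch only (this is not `localllm`'s actual implementation, and `cached_generate` is a hypothetical stand-in for the model call), memoizing on the call arguments behaves like this:

```python
from functools import lru_cache

# Track how many times the underlying "model" actually runs.
calls = {"count": 0}

@lru_cache(maxsize=None)
def cached_generate(prompt: str, temperature: float = 0.0, max_tokens: int = 1000) -> str:
    # Hypothetical stand-in for an expensive LLM invocation.
    calls["count"] += 1
    return f"response to {prompt!r}"

cached_generate("What is the capital of France")
cached_generate("What is the capital of France")  # identical call: served from cache
print(calls["count"])  # prints 1: the model ran only once
```

A different prompt, temperature, or max_tokens is a cache miss and triggers a fresh model call.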
Returns

| Type | Description |
|------|-------------|
| `dspy.BaseLM` | An object of type `dspy.BaseLM`. |
Examples
>>> import dspy
>>> from llama_cpp import Llama
>>> from localllm import LocalLLM, localllm_download_model
>>> path = localllm_download_model("gemma-3-270m-it-Q8_0", overwrite=True, trace=False)
>>> transformer = Llama(model_path=path, n_gpu_layers=-1, flash_attn=False, n_ctx=32768, n_threads=1, seed=4321, verbose=False)
>>> out = transformer("How much is 4x4")
>>>
>>> dspy.configure(lm=LocalLLM(transformer))
>>> class Go(dspy.Signature):
...     sentence: str = dspy.InputField(desc="A question")
...     answer: str = dspy.OutputField(desc="A city name")
>>>
>>> model = dspy.Predict(Go)
>>> out = model(sentence="What is the capital of France")
>>> out["answer"]
'Paris'