LLM Integration (LLM)

The llm submodule of sys provides the metadata, configuration, and runtime services for connecting the ERP to large-language-model providers. It defines what the ERP can ask an LLM to do (LlmAction), who is asking (LlmAgent and LlmApplication), how the LLM should behave (LlmPersona, LlmConfiguration, LlmMemoryStrategy), and what extra capabilities it has (LlmSkill, LlmTool, McpClient, LlmDocumentCollection, LlmAdvisor). It also records every conversation, message, attachment, and tool call. The submodule depends on sys.frm for input/output DomainSchema definitions and is built on top of Spring AI (ChatClient, ChatMemory, Advisor, ToolCallback, VectorStore) and the Model Context Protocol (McpSyncClient).

Concepts

Action

A repeatable LLM task (e.g. "summarize a brand document", "extract competitors"). An action bundles the model, configuration, persona, memory strategy, prompt text, schemas, skills, tools, MCP clients, document collections, and advisors needed to run the task.

Agent

A configured caller of actions. Belongs to an Application and has its own prompt contribution. Each agent has a goal action that defines what it ultimately produces.

Application

A grouping of agents that share a system prompt — the umbrella product or feature an agent participates in.

Persona

A reusable voice/role prompt (for example "concise editor", "marketing copywriter") that gets prepended to action prompts.

Skill

A reusable capability prompt (for example "write SEO-friendly headlines") attached to an action.

Tool

A Spring bean that the LLM can invoke as a function call. Each tool entity records the bean name and optional input/output schemas.

MCP Client

An external Model-Context-Protocol server connection that exposes one or more tools to the LLM.

Advisor

A Spring AI Advisor bean inserted into the chat-client pipeline, ordered per action. Used for cross-cutting concerns such as logging, caching, or content filtering.

Document Collection

A vector-store collection used for retrieval-augmented generation (RAG). Linking a collection to an action wires up a QuestionAnswerAdvisor against that store.

Memory Strategy

The chat-memory configuration applied to a conversation (for example window size or full retention).

Conversation

One running instance of an agent calling an action. Owns an ordered stream of messages, attachments, and tool calls and tracks process status, finish reason, token usage, and estimated cost.

Prompt Contributor

A common interface (IsPromptContributor) implemented by entities that contribute a fragment to the assembled system prompt — application, agent, persona, action, skill, MCP client.

Question Role

A special message role used when the LLM calls the AskUserQuestion tool to pause the conversation and wait for human input.

Entities

LLM entities

Application (LlmApplication)

The umbrella product/feature whose agents share a base system prompt.

| Field | Description |
|---|---|
| code | Business key (up to 8 characters). |
| description | Human-readable name. |
| llmSystemPrompt | Optional prompt fragment prepended to every conversation initiated by an agent of this application. |

Agent (LlmAgent)

A configured caller within an application; runs actions to achieve a goal.

| Field | Description |
|---|---|
| code | Business key (up to 8 characters). |
| description | Human-readable name. |
| llmApplication | Owning application. |
| llmAgentType | Classification of the agent. |
| llmAgentPrompt | Prompt fragment that describes the agent’s role. |
| llmActionAchievesGoal | The action whose successful completion satisfies the agent’s goal. |

Agent Type (LlmAgentType)

Classification used to group agents (for example "human", "autonomous", "review").

| Field | Description |
|---|---|
| code | Business key (up to 4 characters). |
| description | Human-readable label. |

Action (LlmAction)

A repeatable LLM task. An action bundles all the metadata Spring AI needs to build a ChatClient: the model, configuration, persona, memory strategy, prompt text, plus the link tables for schemas, skills, tools, MCP clients, document collections, and advisors.

| Field | Description |
|---|---|
| code | Business key (up to 12 characters). |
| description | Human-readable name. |
| goalDescription | Optional plain-language goal of the action. |
| llmActionType | Classification (Generate, Plan, Reflect, …). |
| llmModel | The LLM model to call. |
| llmConfiguration | Sampling, temperature, token limits, and retry settings. |
| llmPersona | Voice/role applied to the assistant. |
| llmMemoryStrategy | Optional chat-memory configuration; absent means no memory. |
| llmActionPrompt | Optional task-specific prompt fragment. |

The hand-written entity adds the helpers getInputSchemas() (returns input/memory schemas) and getOutputAgentSchema() (returns the single output schema or raises a business exception if zero or many exist).
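The two helpers can be pictured with a minimal sketch. The types below (ActionSchemaRow, InputOutput, BusinessException) are hypothetical stand-ins rather than the generated entity classes, and the exception message is an assumption; only the filtering rules come from the description above.

```java
import java.util.List;

// Hypothetical stand-ins for the generated entities; names are illustrative only.
enum InputOutput { I, O, M }

record ActionSchemaRow(String domainSchema, InputOutput inputOutput) {}

class BusinessException extends RuntimeException {
    BusinessException(String message) { super(message); }
}

class ActionSchemaHelpers {
    // Input side of the action: rows flagged as input (I) or from memory (M).
    static List<ActionSchemaRow> getInputSchemas(List<ActionSchemaRow> rows) {
        return rows.stream()
                .filter(r -> r.inputOutput() != InputOutput.O)
                .toList();
    }

    // The single output (O) row; fails when there are zero or several.
    static ActionSchemaRow getOutputAgentSchema(List<ActionSchemaRow> rows) {
        List<ActionSchemaRow> outputs = rows.stream()
                .filter(r -> r.inputOutput() == InputOutput.O)
                .toList();
        if (outputs.size() != 1) {
            throw new BusinessException(
                    "Expected exactly one output schema but found " + outputs.size());
        }
        return outputs.get(0);
    }
}
```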

Action Type (LlmActionType)

Classification with an optional global constraint prompt that is added to every action of that type.

| Field | Description |
|---|---|
| code | Business key (up to 4 characters). |
| description | Human-readable label. |
| llmConstraintPrompt | Optional constraint text appended to action prompts of this type. |

Action Schema (LlmActionSchema)

Links an action to a DomainSchema for either input, output, or memory data, with cardinality and optional UI hints.

| Field | Description |
|---|---|
| llmAction | Owning action (composite business key with domainSchema). |
| domainSchema | The schema describing the data shape. |
| llmInputOutput | Whether this schema is consumed (I), produced (O), or read from memory (M). |
| llmCardinality | Whether the data is a single item (I) or a list (L). |
| domainForm | Optional form layout for entering this data. |
| domainGrid | Optional grid layout for displaying this data. |

A validation rule enforces that, when a domainForm or domainGrid is given, it must reference the same domainSchema as this row.
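As an illustration of that rule, the check might look like the sketch below. LayoutSketch is a hypothetical stand-in for both DomainForm and DomainGrid, and schemas are compared by a plain string key for brevity; the real rule presumably compares entity references.

```java
// Hypothetical stand-in for a form or grid layout that points at a schema.
record LayoutSketch(String code, String domainSchema) {}

class ActionSchemaValidationSketch {
    // A missing form or grid is always fine; a present one must use the row's schema.
    static boolean layoutsMatchSchema(String rowSchema, LayoutSketch form, LayoutSketch grid) {
        boolean formOk = form == null || rowSchema.equals(form.domainSchema());
        boolean gridOk = grid == null || rowSchema.equals(grid.domainSchema());
        return formOk && gridOk;
    }
}
```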

Input/Output (LlmInputOutput)

| Code | Name | Meaning |
|---|---|---|
| I | Input | Data the LLM consumes. |
| O | Output | Data the LLM produces. |
| M | From Memory | Data sourced from chat memory or RAG context. |

Cardinality (LlmCardinality)

| Code | Name | Meaning |
|---|---|---|
| I | Item | A single record. |
| L | List | A list of records. |

These are pure link tables that attach reusable capabilities to an action.

LlmActionSkill links a skill (prompt fragment) to an action.

LlmActionTool links a tool (Spring bean) to an action.

LlmActionMcp links an MCP client (external tool server) to an action.

LlmActionAdvisor links an advisor (Spring AI advisor bean) to an action with an explicit ordering.

LlmActionDocument links a RAG document collection to an action.

| Field | Description |
|---|---|
| llmAction (all) | Owning action; combined with the second key field, it forms the business key. |
| llmAdvisorOrder (LlmActionAdvisor) | Order in which this advisor runs in the chat-client pipeline; resolved in descending order. |

Persona (LlmPersona) and Persona Type (LlmPersonaType)

Persona provides the voice/role prompt fragment.

| Field | Description |
|---|---|
| code | Business key (up to 8 characters; type uses 4). |
| description | Human-readable label. |
| llmPersonaType | Classification of the persona. |
| llmPersonaPrompt | Prompt fragment that gets injected into the system prompt. |

Skill (LlmSkill) and Skill Type (LlmSkillType)

Skill is a reusable capability prompt added to actions through LlmActionSkill.

| Field | Description |
|---|---|
| code | Business key (up to 8 characters; type uses 4). |
| description | Human-readable label. |
| llmSkillType | Classification of the skill. |
| llmSkillPrompt | Capability description; emitted as "Your skill: <prompt>." in the system prompt. |

Tool (LlmTool) and Tool Type (LlmToolType)

Tool maps a Spring bean to a function the LLM can call.

| Field | Description |
|---|---|
| code | Business key (up to 12 characters). |
| description | Human-readable label shown to the LLM. |
| llmToolBeanName | Name of the Spring bean that implements the tool callback. |
| llmToolType | Classification of the tool. |
| llmSchemaInput | Optional input DomainSchema. |
| llmSchemaOutput | Optional output DomainSchema. |

MCP Client (McpClient) and MCP Client Type (McpClientType)

MCP client describes a Model-Context-Protocol server connection that exposes one or more tools.

| Field | Description |
|---|---|
| name | Business key (the MCP client’s name). |
| description | Human-readable description; used as the prompt contribution. |
| mcpClientType | Classification of the MCP client. |
| mcpTransportType | Transport mechanism; see Transport Types below. |
| mcpConnectionUrl | Optional connection URL (used for SH and SE transports). |
| llmSchemaInput | Schema describing the MCP request payload. |
| llmSchemaOutput | Schema describing the MCP response payload. |

Transport Types (McpTransportType)

| Code | Name | Meaning |
|---|---|---|
| ST | Stdio | Local process launched via stdio. |
| SH | Streamable HTTP | HTTP transport with streamable response framing. |
| SE | Server-Sent Events | HTTP transport using SSE. |

Advisor (LlmAdvisor) and Advisor Type (LlmAdvisorType)

Advisor declares a Spring AI advisor bean that can be inserted into the chat-client pipeline of an action.

| Field | Description |
|---|---|
| code | Business key (up to 8 characters; type uses 4). |
| description | Human-readable label. |
| llmAdvisorType | Classification of the advisor. |
| llmAdvisorBeanName | Spring bean name that resolves to an org.springframework.ai.chat.client.advisor.api.Advisor. |
| llmAdvisorOrder | Default ordering hint for the advisor. |

Document Collection / Vector Store

These three entities configure RAG sources.

LlmDocumentCollection is a named collection inside a vector store, with a similarity threshold and a result count.

LlmVectorStore is a vector store endpoint.

LlmVectorStoreType classifies the vector store engine.

| Field | Description |
|---|---|
| code (collection / store) | Business key (up to 8 characters). |
| description | Human-readable label. |
| llmVectorStore (collection) | Backing vector store. |
| collectionName | Collection name inside the store. |
| llmSimilarityThreshold | Optional similarity cut-off for retrieval. |
| llmResultCount | Optional top-K result count. |
| llmVectorStoreType (store) | Classification of the vector store. |
| connectionUrl (store) | Optional URL of the store. |
| defaultCollectionName (store) | Optional default collection name. |

Memory Strategy (LlmMemoryStrategy) and Memory Strategy Type (LlmMemoryStrategyType)

Memory strategy controls how chat history is retained for a conversation.

| Field | Description |
|---|---|
| code | Business key (up to 4 characters). |
| description | Human-readable label. |
| llmMemoryStrategyType | Strategy classification (e.g. FULL retains all messages). |
| llmMemoryWindowSize | Number of messages retained in the rolling window. |

Configuration (LlmConfiguration)

Sampling, temperature, token-limit, and retry settings applied when building chat options.

| Field | Description |
|---|---|
| code | Business key (up to 4 characters). |
| description | Human-readable label. |
| samplingSizeTopK | Optional top-K sampling. |
| samplingSizeTopP | Optional top-P (nucleus) sampling. |
| temperature | Optional temperature. |
| tokenSizeMax | Mandatory output token limit. |
| frequencyPenalty | Optional frequency penalty. |
| presencePenalty | Optional presence penalty. |
| llmRetryCount | Optional retry count for transient failures. |
| llmRetryDelay | Optional delay between retries. |

Model (LlmModel) and Provider (LlmProvider)

Model identifies a specific LLM offered by a provider; provider is the vendor (for example Anthropic or OpenAI).

| Field | Description |
|---|---|
| code (model / provider) | Business key (up to 12 characters; provider uses 4). |
| description | Human-readable label. |
| llmModelName (model) | Vendor model identifier passed to the API (e.g. claude-opus-4-7). |
| llmProvider (model) | The vendor offering the model. |
| llmKnowledgeCutoffDate | The model’s training cutoff date. |
| llmInputTokenPrice | Price per input token, used for cost estimation. |
| llmOutputTokenPrice | Price per output token, used for cost estimation. |

Model Capability (LlmModelCapability)

Lists what a model is capable of (link table on the model, keyed by capability type).

Capability Types (LlmModelCapabilityType)

| Code | Name | Meaning |
|---|---|---|
| FC | Function Calling | Supports tool/function calls. |
| VI | Vision | Accepts image input. |
| SR | Streaming | Supports streaming responses. |
| JM | JSON Mode | Can be forced to emit valid JSON. |
| EM | Embedding | Can produce embedding vectors. |
| AU | Audio | Accepts or produces audio. |

Conversation (LlmConversation)

One running instance of an agent calling an action. The conversation aggregates its messages, attachments, and tool calls and tracks lifecycle, finish reason, token usage, estimated cost, and any error detail.

| Field | Description |
|---|---|
| conversationUuid | Globally-unique business key. |
| llmAgent | Agent that started the conversation. |
| llmAction | Action being executed. |
| llmProcessStatus | Lifecycle state; see Process Status below. |
| llmFinishReason | Optional reason the model stopped; see Finish Reason below. |
| llmResponseModel | Model name actually used in the response. |
| biInputTokenCount | Calculated input-token total (business-information field). |
| biOutputTokenCount | Calculated output-token total (business-information field). |
| biEstimatedCost | Estimated cost from token counts and model pricing (business-information field). |
| llmErrorDetail | Captured error message when the conversation fails. |

Process Status (LlmProcessStatus)

| Code | Name | Meaning |
|---|---|---|
| NS | Not started | Initial state after creation. |
| RU | Running | The model is currently being called. |
| CO | Completed | The action finished successfully. |
| FA | Failed | The action raised an exception. |
| TE | Terminated | The conversation was ended early. |
| KI | Killed | The conversation was forcibly stopped. |
| ST | Stuck | The conversation appears to be hung. |
| WA | Waiting | Paused waiting for user input via the AskUserQuestion tool. |
| PA | Paused | Paused administratively. |

Finish Reason (LlmFinishReason)

| Code | Name | Meaning |
|---|---|---|
| ST | Stop | Model produced a normal stop token. |
| LE | Length | Output truncated by the token limit. |
| CF | Content Filter | Output blocked by a safety filter. |
| TC | Tool Calls | Stopped to emit one or more tool calls. |
| OT | Other | Any other reason reported by the provider. |

Message (LlmMessage)

One entry in a conversation’s transcript. Messages are recorded as immutable events ordered by messageSequence.

| Field | Description |
|---|---|
| llmConversation | Owning conversation. |
| messageSequence | Per-conversation message ordinal. |
| llmMessageRole | Who or what produced the message; see Message Roles below. |
| llmMessageContent | Message body. |
| biTokenCount | Calculated tokens for this message. |
| llmToolCallId | Optional tool-call id when this message is a tool response. |

Message Roles (LlmMessageRole)

| Code | Name | Meaning |
|---|---|---|
| SY | System | System prompt. |
| US | User | Human user input. |
| AS | Assistant | LLM-generated response. |
| TO | Tool | Output of a tool call returning to the LLM. |
| QU | Question | The LLM is asking the user a structured question via the AskUserQuestion tool. |

Message Attachment (LlmMessageAttachment)

A file attached to a message (for example an image or a document). Stored as an immutable event.

| Field | Description |
|---|---|
| llmMessage | Owning message. |
| attachmentSequence | Order of the attachment within the message. |
| llmMediaType | MIME type / category of the attached resource. |
| attachmentUrl | URL pointing to the attachment content. |

Tool Call (LlmToolCall)

A function call the LLM emitted during a message.

| Field | Description |
|---|---|
| llmMessage | Owning assistant message. |
| llmToolCallId | Provider-supplied tool-call id. |
| llmToolCallName | Name of the tool the LLM asked to invoke. |
| llmToolCallArguments | Optional JSON arguments supplied to the tool. |

Functionality

System-prompt assembly

LlmPromptAssemblyService builds the system prompt that is sent on every conversation. It concatenates contributions from the agent’s application, the agent itself, the action’s persona, every linked skill, every linked MCP client, the schema-prompt block (input/output data structures), and finally the action’s own prompt. Each contribution comes from IsPromptContributor.getPromptContribution(), which is implemented by LlmApplication, LlmAgent, LlmPersona, LlmSkill, McpClient, and LlmAction. Skills and MCP entries get short labels ("Your skill: …" and "Available tool '<name>': …") before their text so the LLM can distinguish them.
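The concatenation step can be sketched roughly as follows. Only the interface name, its getPromptContribution() method, and the skill label come from this document; the blank-line separator, the skipping of empty contributions, and the PromptAssemblySketch class are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Mirrors the interface named in the text; the real one may declare more members.
interface IsPromptContributor {
    String getPromptContribution();
}

class PromptAssemblySketch {
    // Joins contributions in the order given, skipping null or blank fragments.
    // Assumption: fragments are separated by one blank line.
    static String assemble(List<IsPromptContributor> contributors) {
        List<String> parts = new ArrayList<>();
        for (IsPromptContributor contributor : contributors) {
            String text = contributor.getPromptContribution();
            if (text != null && !text.isBlank()) {
                parts.add(text.strip());
            }
        }
        return String.join("\n\n", parts);
    }

    // Wraps a skill prompt in the "Your skill: <prompt>." label.
    static IsPromptContributor skill(String prompt) {
        return () -> "Your skill: " + prompt + ".";
    }
}
```

A caller would pass the contributors in the order listed above (application, agent, persona, skills, MCP clients, schema block, action prompt).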

Schema-prompt building

LlmActionSchemaPromptService walks the action’s LlmActionSchema rows and produces an "Input data structure:" / "Output data structure:" block describing each schema’s fields. It is used as one segment of the assembled system prompt so the model knows the shape of the data it must consume and produce.

Chat-options building

LlmChatOptionsFactory translates an action’s LlmConfiguration into a Spring AI ChatOptions (model name, max tokens, temperature, top-K, top-P, frequency penalty, presence penalty). Optional values are only set on the builder when present.
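The "set only when present" behaviour can be illustrated without Spring AI on the classpath. In this sketch a plain map stands in for the ChatOptions builder, and ConfigurationSketch is a hypothetical record that carries a subset of the LlmConfiguration fields; a null field means the value was not configured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for LlmConfiguration; null means "not configured".
record ConfigurationSketch(Integer topK, Double topP, Double temperature, int tokenSizeMax,
                           Double frequencyPenalty, Double presencePenalty) {}

class ChatOptionsSketch {
    // Builds an options map; a real factory would call the matching builder methods instead.
    static Map<String, Object> toOptions(String modelName, ConfigurationSketch cfg) {
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("model", modelName);
        options.put("maxTokens", cfg.tokenSizeMax()); // mandatory per the field table
        putIfPresent(options, "temperature", cfg.temperature());
        putIfPresent(options, "topK", cfg.topK());
        putIfPresent(options, "topP", cfg.topP());
        putIfPresent(options, "frequencyPenalty", cfg.frequencyPenalty());
        putIfPresent(options, "presencePenalty", cfg.presencePenalty());
        return options;
    }

    // Leaves the provider default in place when the configuration has no value.
    private static void putIfPresent(Map<String, Object> options, String key, Object value) {
        if (value != null) {
            options.put(key, value);
        }
    }
}
```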

Chat-client construction

LlmChatClientFactory is the central wiring point. For each conversation it produces a Spring AI ChatClient configured with: the model bean for the action’s model, the assembled system prompt, the chat options, the resolved tool callbacks plus a per-conversation AskUserQuestion tool, the resolved advisors (action advisors + RAG advisors + a chat-memory advisor when a memory strategy is set), and any MCP tool callbacks. The resulting client is what LlmConversationService calls into.

Tool-callback resolution

LlmToolCallbackResolver looks up each LlmActionTool by its llmToolBeanName in the Spring application context and verifies that the bean is a ToolCallback. It supports an exclude set so the per-conversation AskUserQuestion callback can be inserted separately.
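A simplified view of that lookup is sketched below. A plain map stands in for the Spring application context, ToolCallbackSketch stands in for Spring AI's ToolCallback, and the exception type is an assumption; only the lookup-by-bean-name, the type check, and the exclude set are taken from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Stand-in for Spring AI's ToolCallback so the sketch has no external dependencies.
interface ToolCallbackSketch {
    String name();
}

class ToolResolverSketch {
    // Resolves bean names to callbacks, skipping excluded names and rejecting wrong types.
    static List<ToolCallbackSketch> resolve(List<String> beanNames,
                                            Map<String, Object> beans,
                                            Set<String> exclude) {
        List<ToolCallbackSketch> result = new ArrayList<>();
        for (String beanName : beanNames) {
            if (exclude.contains(beanName)) {
                continue; // e.g. a callback that is added per conversation elsewhere
            }
            Object bean = beans.get(beanName);
            if (!(bean instanceof ToolCallbackSketch callback)) {
                throw new IllegalStateException(
                        "Bean '" + beanName + "' is missing or is not a tool callback");
            }
            result.add(callback);
        }
        return result;
    }
}
```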

Advisor resolution

LlmAdvisorResolver reads LlmActionAdvisor rows ordered by llmAdvisorOrder and looks up each llmAdvisorBeanName in the Spring context. The resulting list of Advisor instances is added to the chat-client pipeline.
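The ordering step might look like the sketch below. It assumes the descending order mentioned for llmAdvisorOrder in the link-table description; ActionAdvisorSketch is a hypothetical record, and tie-breaking between equal order values is not specified by this document (the stable sort used here keeps input order).

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical flattened view of an LlmActionAdvisor row.
record ActionAdvisorSketch(String beanName, int order) {}

class AdvisorOrderSketch {
    // Returns bean names sorted by order value, highest first (assumed direction).
    static List<String> beanNamesInResolutionOrder(List<ActionAdvisorSketch> rows) {
        return rows.stream()
                .sorted(Comparator.comparingInt(ActionAdvisorSketch::order).reversed())
                .map(ActionAdvisorSketch::beanName)
                .toList();
    }
}
```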

Document-collection (RAG) resolution

LlmDocumentCollectionResolver walks the action’s LlmActionDocument rows. For each linked LlmDocumentCollection it looks up the matching VectorStore bean (by the vector store’s code) and creates a QuestionAnswerAdvisor configured with the collection’s similarity threshold and result count. The default top-K is 4 when not specified on the collection.
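How the optional collection settings could be turned into concrete search settings is sketched below. The default top-K of 4 comes from the text; the fallback threshold of 0.0 (no cut-off) is an assumption, and both record types are hypothetical stand-ins for the entity and for the search request handed to the advisor.

```java
// Hypothetical stand-ins: the collection entity and the derived search settings.
record CollectionSketch(String collectionName, Double similarityThreshold, Integer resultCount) {}

record SearchSettingsSketch(double similarityThreshold, int topK) {}

class RagSettingsSketch {
    static final int DEFAULT_TOP_K = 4;             // default stated in the text
    static final double ACCEPT_ALL_THRESHOLD = 0.0; // assumption: no cut-off when unset

    // Applies defaults for any value the collection leaves empty.
    static SearchSettingsSketch settingsFor(CollectionSketch collection) {
        int topK = collection.resultCount() != null
                ? collection.resultCount() : DEFAULT_TOP_K;
        double threshold = collection.similarityThreshold() != null
                ? collection.similarityThreshold() : ACCEPT_ALL_THRESHOLD;
        return new SearchSettingsSketch(threshold, topK);
    }
}
```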

MCP-client resolution

LlmMcpClientResolver opens (and caches) one McpSyncClient per McpClient entity, choosing the transport (stdio, streamable HTTP, or SSE) based on mcpTransportType and the mcpConnectionUrl. Each connected client is wrapped in a SyncMcpToolCallbackProvider whose tool callbacks are merged into the chat client. Connections are kept across conversations and closed during shutdown.
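The open-once, reuse, and close-at-shutdown lifecycle can be sketched generically. ConnectionCacheSketch is a hypothetical class; the connector function stands in for whatever code builds an McpSyncClient for a given transport, and the cache key (the client name) is an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class ConnectionCacheSketch<C extends AutoCloseable> {
    private final Map<String, C> clients = new ConcurrentHashMap<>();
    private final Function<String, C> connector;

    ConnectionCacheSketch(Function<String, C> connector) {
        this.connector = connector;
    }

    // Opens the connection on first use and returns the cached one afterwards.
    C get(String clientName) {
        return clients.computeIfAbsent(clientName, connector);
    }

    // Intended for shutdown: closes every cached connection, continuing past failures.
    void closeAll() {
        for (C client : clients.values()) {
            try {
                client.close();
            } catch (Exception e) {
                // Keep closing the remaining clients; a real service would log this.
            }
        }
        clients.clear();
    }
}
```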

Chat memory

LlmMemoryService materializes a ChatMemory per conversation based on the action’s LlmMemoryStrategy. FULL strategy uses an unbounded MessageWindowChatMemory; any other strategy uses a window of llmMemoryWindowSize messages. Memories are cached by conversation id and can be cleared explicitly.
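The window behaviour and the per-conversation cache can be pictured with a dependency-free sketch. WindowMemorySketch is a deliberately simplified stand-in for MessageWindowChatMemory (it treats every message alike), and using Integer.MAX_VALUE for "unbounded" is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified rolling window over message texts.
class WindowMemorySketch {
    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    WindowMemorySketch(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    // Appends a message and evicts the oldest ones beyond the window size.
    void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst();
        }
    }

    List<String> get() {
        return new ArrayList<>(messages);
    }
}

class MemoryServiceSketch {
    private final Map<String, WindowMemorySketch> byConversation = new ConcurrentHashMap<>();

    // FULL keeps everything; any other strategy type uses the configured window size.
    WindowMemorySketch memoryFor(String conversationId, String strategyType, int windowSize) {
        int max = "FULL".equals(strategyType) ? Integer.MAX_VALUE : windowSize;
        return byConversation.computeIfAbsent(conversationId, id -> new WindowMemorySketch(max));
    }

    // Drops the cached memory so the next call starts with an empty history.
    void clear(String conversationId) {
        byConversation.remove(conversationId);
    }
}
```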

Conversation execution

LlmConversationService is the primary entry point for running an action. startConversation creates an LlmConversation row with a fresh GUID. sendMessage / sendMessageWithAttachments records the user message (and any attachments), sets the conversation to Running (RU), builds a chat client, and calls the model. On success it records the assistant message, persists every emitted tool call, captures token usage and estimated cost (using the model’s input/output token prices), and marks the conversation Completed (CO) with the appropriate finish reason. On failure it records the error detail and marks the conversation Failed (FA).
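The status transitions and the cost estimate can be sketched as follows. ConversationSketch and ConversationRunSketch are hypothetical; the cost formula (token count times per-token price, summed for input and output) is the natural reading of the field descriptions rather than a confirmed implementation, and persistence is left out entirely.

```java
import java.math.BigDecimal;
import java.util.function.Supplier;

// Hypothetical in-memory stand-in for the conversation row.
class ConversationSketch {
    String status = "NS";
    String errorDetail;
}

class ConversationRunSketch {
    // Estimated cost = input tokens * input price + output tokens * output price.
    static BigDecimal estimateCost(long inputTokens, long outputTokens,
                                   BigDecimal inputTokenPrice, BigDecimal outputTokenPrice) {
        return inputTokenPrice.multiply(BigDecimal.valueOf(inputTokens))
                .add(outputTokenPrice.multiply(BigDecimal.valueOf(outputTokens)));
    }

    // Runs one model call and moves the conversation through RU to CO or FA.
    static String send(ConversationSketch conversation, Supplier<String> modelCall) {
        conversation.status = "RU";
        try {
            String reply = modelCall.get();
            conversation.status = "CO";
            return reply;
        } catch (RuntimeException e) {
            conversation.status = "FA";
            conversation.errorDetail = e.getMessage();
            return null; // the command API documents a null return on failure
        }
    }
}
```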

Human-in-the-loop questions

LlmAskUserQuestionService together with ErpAskUserQuestionHandler implements the AskUserQuestion pattern. When the LLM calls the AskUserQuestion tool, the handler thread blocks on a CompletableFuture for up to five minutes while the conversation status is set to WAITING and a question message (role QU, falling back to TO if the generator has not yet produced the QU code) is persisted. A separate API call (submitAnswers) resolves the future and unblocks the handler so the LLM can continue. Each conversation gets its own handler instance, bound to its conversation id.
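The blocking hand-off between the handler thread and the answering API call can be sketched with a CompletableFuture per conversation. The class below is a hypothetical simplification: it keeps pending questions in a map keyed by a conversation id string and leaves out status updates and message persistence.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class AskUserQuestionSketch {
    private final Map<String, CompletableFuture<Map<String, String>>> pending =
            new ConcurrentHashMap<>();

    // Called on the handler thread; blocks until answers arrive or the timeout elapses.
    Map<String, String> waitForUserAnswers(String conversationId, long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        CompletableFuture<Map<String, String>> future = new CompletableFuture<>();
        pending.put(conversationId, future);
        try {
            return future.get(timeout, unit);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pending.remove(conversationId);
        }
    }

    // Called from a separate request; returns false when nothing is waiting.
    boolean submitAnswers(String conversationId, Map<String, String> answers) {
        CompletableFuture<Map<String, String>> future = pending.get(conversationId);
        return future != null && future.complete(answers);
    }
}
```

With the five-minute limit described above, the handler would call waitForUserAnswers(id, 5, TimeUnit.MINUTES) and treat a TimeoutException as an unanswered question.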

Public API

SYS_LLM_QueryApi

Read-side facade used by other modules and the UI.

| Method | Description |
|---|---|
| getAllActions() | All LlmAction rows. |
| findActionByCode(Code12) | Action by business key, or null. |
| findAgentByCode(Code8) | Agent by business key, or null. |
| findConversation(Guid) | Conversation by uuid, or null. |
| getConversationMessages(LlmConversation) | Messages of a conversation in sequence. |
| findToolByCode(Code12) | Tool by business key, or null. |
| findMcpClientByName(McpClientName) | MCP client by name, or null. |
| getPendingQuestions(Guid) | The most recent question message for a conversation in WAITING, or null. |

SYS_LLM_CommandApi

Write-side facade used by other modules and the UI.

| Method | Description |
|---|---|
| startConversation(LlmAgent, LlmAction) | Creates a new conversation in NS (not started). |
| sendMessage(Guid, String) | Posts a user message and runs the LLM; returns the assistant text or null on failure. |
| sendMessageWithAttachments(Guid, String, List<AttachmentData>) | Same as sendMessage, but includes file attachments on the user message. |
| submitAnswers(Guid, Map<String, String>) | Resolves a WAITING conversation by answering the LLM’s pending AskUserQuestion call. |

IsPromptContributor extension

IsPromptContributor is the public seam for entities that contribute a fragment to the assembled system prompt. The submodule itself implements it on LlmApplication, LlmAgent, LlmPersona, LlmAction, LlmSkill, and McpClient. Other modules that introduce their own LLM-aware entities can implement the same interface to plug into prompt assembly.

AskUserQuestion handler

ErpAskUserQuestionHandler implements the framework’s io.venlo.frame.server.ai.tools.AskUserQuestionHandler. It is constructed per conversation by LlmChatClientFactory and forwards the LLM’s structured question payload to LlmAskUserQuestionService.waitForUserAnswers, blocking until submitAnswers is called.

ViewModel actions

The submodule defines view models for the standard CRUD pages on every entity but does not declare any custom UI actions. Conversations, messages, tool calls, and attachments are surfaced as event streams under their parent conversation.