LLM Integration (LLM)
The llm submodule of sys provides the metadata, configuration, and runtime services for connecting the ERP to large-language-model providers. It defines what the ERP can ask an LLM to do (LlmAction), who is asking (LlmAgent and LlmApplication), how the LLM should behave (LlmPersona, LlmConfiguration, LlmMemoryStrategy), what extra capabilities it has (LlmSkill, LlmTool, McpClient, LlmDocumentCollection, LlmAdvisor), and records every conversation, message, attachment, and tool call. It depends on sys.frm for input/output DomainSchema definitions and is built on top of Spring AI (ChatClient, ChatMemory, Advisor, ToolCallback, VectorStore) and the Model Context Protocol (McpSyncClient).
Concepts
Action: A repeatable LLM task (e.g. "summarize a brand document", "extract competitors"). An action bundles the model, configuration, persona, memory strategy, prompt text, schemas, skills, tools, MCP clients, document collections, and advisors needed to run the task.
Agent: A configured caller of actions. Belongs to an Application and has its own prompt contribution. Each agent has a goal action that defines what it ultimately produces.
Application: A grouping of agents that share a system prompt; the umbrella product or feature an agent participates in.
Persona: A reusable voice/role prompt (for example "concise editor", "marketing copywriter") that gets prepended to action prompts.
Skill: A reusable capability prompt (for example "write SEO-friendly headlines") attached to an action.
Tool: A Spring bean that the LLM can invoke as a function call. Each tool entity records the bean name and optional input/output schemas.
MCP Client: An external Model-Context-Protocol server connection that exposes one or more tools to the LLM.
Advisor: A Spring AI Advisor bean inserted into the chat-client pipeline, ordered per action. Used for cross-cutting concerns such as logging, caching, or content filtering.
Document Collection: A vector-store collection used for retrieval-augmented generation (RAG). Linking a collection to an action wires up a QuestionAnswerAdvisor against that store.
Memory Strategy: The chat-memory configuration applied to a conversation (window size, full retention, …).
Conversation: One running instance of an agent calling an action. Owns an ordered stream of messages, attachments, and tool calls and tracks process status, finish reason, token usage, and estimated cost.
Prompt Contributor: A common interface (IsPromptContributor) implemented by entities that contribute a fragment to the assembled system prompt: application, agent, persona, action, skill, MCP client.
Question Role: A special message role used when the LLM calls the AskUserQuestion tool to pause the conversation and wait for human input.
Entities
Application (LlmApplication)
The umbrella product/feature whose agents share a base system prompt.
| Field | Description |
|---|---|
| Business key (up to 8 characters). |
| Human-readable name. |
| Optional prompt fragment prepended to every conversation initiated by an agent of this application. |
Agent (LlmAgent)
A configured caller within an application; runs actions to achieve a goal.
| Field | Description |
|---|---|
| Business key (up to 8 characters). |
| Human-readable name. |
| Owning application. |
| Classification of the agent. |
| Prompt fragment that describes the agent’s role. |
| The action whose successful completion satisfies the agent’s goal. |
Agent Type (LlmAgentType)
Classification used to group agents (for example "human", "autonomous", "review").
| Field | Description |
|---|---|
| Business key (up to 4 characters). |
| Human-readable label. |
Action (LlmAction)
A repeatable LLM task. An action bundles all the metadata Spring AI needs to build a ChatClient: the model, configuration, persona, memory strategy, prompt text, plus the link tables for schemas, skills, tools, MCP clients, document collections, and advisors.
| Field | Description |
|---|---|
| Business key (up to 12 characters). |
| Human-readable name. |
| Optional plain-language goal of the action. |
| Classification (Generate, Plan, Reflect, …). |
| The LLM model to call. |
| Sampling, temperature, token limits, and retry settings. |
| Voice/role applied to the assistant. |
| Optional chat-memory configuration; absent means no memory. |
| Optional task-specific prompt fragment. |
The hand-written entity adds the helpers getInputSchemas() (returns input/memory schemas) and getOutputAgentSchema() (returns the single output schema or raises a business exception if zero or many exist).
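The behaviour of the two helpers can be sketched in plain Java. This is an illustration only: the `Direction` enum, the `ActionSchema` record, and the use of `IllegalStateException` in place of the module's business exception are all assumptions, not the generated entity code.

```java
import java.util.List;

// Hypothetical stand-ins for the generated entities; names and codes are assumptions.
enum Direction { INPUT, OUTPUT, FROM_MEMORY }

record ActionSchema(String schemaCode, Direction direction) {}

final class ActionSchemaHelpers {
    private ActionSchemaHelpers() {}

    // Input and memory-sourced schemas are both returned as inputs.
    static List<ActionSchema> inputSchemas(List<ActionSchema> rows) {
        return rows.stream()
                .filter(r -> r.direction() == Direction.INPUT
                          || r.direction() == Direction.FROM_MEMORY)
                .toList();
    }

    // Exactly one output schema is allowed; zero or many is an error.
    static ActionSchema outputSchema(List<ActionSchema> rows) {
        List<ActionSchema> outputs = rows.stream()
                .filter(r -> r.direction() == Direction.OUTPUT)
                .toList();
        if (outputs.size() != 1) {
            // Stand-in for the business exception mentioned in the text.
            throw new IllegalStateException(
                    "Expected exactly one output schema but found " + outputs.size());
        }
        return outputs.get(0);
    }
}
```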
Action Type (LlmActionType)
Classification with an optional global constraint prompt that is added to every action of that type.
| Field | Description |
|---|---|
| Business key (up to 4 characters). |
| Human-readable label. |
| Optional constraint text appended to action prompts of this type. |
Action Schema (LlmActionSchema)
Links an action to a DomainSchema for either input, output, or memory data, with cardinality and optional UI hints.
| Field | Description |
|---|---|
| Owning action (composite business key with |
| The schema describing the data shape. |
| Whether this schema is consumed ( |
| Whether the data is a single item ( |
| Optional form layout for entering this data. |
| Optional grid layout for displaying this data. |
A validation rule enforces that, when a domainForm or domainGrid is given, it must reference the same domainSchema as this row.
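A minimal sketch of that consistency check, assuming simplified record shapes (`Layout`, `SchemaLink`) that carry only the schema reference; the real rule runs against the generated entities.

```java
import java.util.Objects;

// Hypothetical minimal shapes; only the schema reference matters for the rule.
record Layout(String code, String domainSchemaCode) {}

record SchemaLink(String domainSchemaCode, Layout domainForm, Layout domainGrid) {}

final class SchemaLinkValidator {
    private SchemaLinkValidator() {}

    // The row is valid when every layout that is present points at the row's schema.
    static boolean isConsistent(SchemaLink link) {
        return matches(link.domainForm(), link.domainSchemaCode())
            && matches(link.domainGrid(), link.domainSchemaCode());
    }

    // An absent layout is always acceptable.
    private static boolean matches(Layout layout, String schemaCode) {
        return layout == null || Objects.equals(layout.domainSchemaCode(), schemaCode);
    }
}
```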
Input/Output (LlmInputOutput)
| Code | Name | Meaning |
|---|---|---|
| Input | Data the LLM consumes. |
| Output | Data the LLM produces. |
| From Memory | Data sourced from chat memory or RAG context. |
Cardinality (LlmCardinality)
| Code | Name | Meaning |
|---|---|---|
| Item | A single record. |
| List | A list of records. |
Action Skill / Tool / MCP / Advisor / Document Links
These are pure link tables that attach reusable capabilities to an action.
LlmActionSkill links a skill (prompt fragment) to an action.
LlmActionTool links a tool (Spring bean) to an action.
LlmActionMcp links an MCP client (external tool server) to an action.
LlmActionAdvisor links an advisor (Spring AI advisor bean) to an action with an explicit ordering.
LlmActionDocument links a RAG document collection to an action.
| Field | Description |
|---|---|
| Owning action; combined with the second key field forms the business key. |
| Order in which this advisor runs in the chat-client pipeline; resolved descending. |
Persona (LlmPersona) and Persona Type (LlmPersonaType)
Persona provides the voice/role prompt fragment.
| Field | Description |
|---|---|
| Business key (up to 8 characters; type uses 4). |
| Human-readable label. |
| Classification of the persona. |
| Prompt fragment that gets injected into the system prompt. |
Skill (LlmSkill) and Skill Type (LlmSkillType)
Skill is a reusable capability prompt added to actions through LlmActionSkill.
| Field | Description |
|---|---|
| Business key (up to 8 characters; type uses 4). |
| Human-readable label. |
| Classification of the skill. |
| Capability description; emitted as |
Tool (LlmTool) and Tool Type (LlmToolType)
Tool maps a Spring bean to a function the LLM can call.
| Field | Description |
|---|---|
| Business key (up to 12 characters). |
| Human-readable label shown to the LLM. |
| Name of the Spring bean that implements the tool callback. |
| Classification of the tool. |
| Optional input |
| Optional output |
MCP Client (McpClient) and MCP Client Type (McpClientType)
MCP client describes a Model-Context-Protocol server connection that exposes one or more tools.
| Field | Description |
|---|---|
| Business key (the MCP client’s name). |
| Human-readable description; used as the prompt contribution. |
| Classification of the MCP client. |
| Transport mechanism — see Transport Types below. |
| Optional connection URL (used for |
| Schema describing the MCP request payload. |
| Schema describing the MCP response payload. |
Transport Types (McpTransportType)
| Code | Name | Meaning |
|---|---|---|
| Stdio | Local process launched via stdio. |
| Streamable HTTP | HTTP transport with streamable response framing. |
| Server-Sent Events | HTTP transport using SSE. |
Advisor (LlmAdvisor) and Advisor Type (LlmAdvisorType)
Advisor declares a Spring AI advisor bean that can be inserted into the chat-client pipeline of an action.
| Field | Description |
|---|---|
| Business key (up to 8 characters; type uses 4). |
| Human-readable label. |
| Classification of the advisor. |
| Spring bean name that resolves to a |
| Default ordering hint for the advisor. |
Document Collection / Vector Store
These three entities configure RAG sources.
LlmDocumentCollection: a named collection inside a vector store, with similarity threshold and result count.
LlmVectorStore: a vector store endpoint (LlmVectorStoreType classifies the engine).
| Field | Description |
|---|---|
| Business key (up to 8 characters). |
| Human-readable label. |
| Backing vector store. |
| Collection name inside the store. |
| Optional similarity cut-off for retrieval. |
| Optional top-K result count. |
| Classification of the vector store. |
| Optional URL of the store. |
| Optional default collection name. |
Memory Strategy (LlmMemoryStrategy) and Memory Strategy Type (LlmMemoryStrategyType)
Memory strategy controls how chat history is retained for a conversation.
| Field | Description |
|---|---|
| Business key (up to 4 characters). |
| Human-readable label. |
| Strategy classification (e.g. |
| Number of messages retained in the rolling window. |
Configuration (LlmConfiguration)
Sampling, temperature, token-limit, and retry settings applied when building chat options.
| Field | Description |
|---|---|
| Business key (up to 4 characters). |
| Human-readable label. |
| Optional top-K sampling. |
| Optional top-P (nucleus) sampling. |
| Optional temperature. |
| Mandatory output token limit. |
| Optional frequency penalty. |
| Optional presence penalty. |
| Optional retry count for transient failures. |
| Optional delay between retries. |
Model (LlmModel) and Provider (LlmProvider)
Model identifies a specific LLM offered by a provider; provider is the vendor (Anthropic, OpenAI, …).
| Field | Description |
|---|---|
| Business key (up to 12 characters; provider uses 4). |
| Human-readable label. |
| Vendor model identifier passed to the API (e.g. |
| The vendor offering the model. |
| The model’s training cutoff date. |
| Price per input token, used for cost estimation. |
| Price per output token, used for cost estimation. |
Model Capability (LlmModelCapability)
Lists what a model is capable of (link table on the model, keyed by capability type).
Capability Types (LlmModelCapabilityType)
| Code | Name | Meaning |
|---|---|---|
| Function Calling | Supports tool/function calls. |
| Vision | Accepts image input. |
| Streaming | Supports streaming responses. |
| JSON Mode | Can be forced to emit valid JSON. |
| Embedding | Can produce embedding vectors. |
| Audio | Accepts or produces audio. |
Conversation (LlmConversation)
One running instance of an agent calling an action. The conversation aggregates its messages, attachments, and tool calls and tracks lifecycle, finish reason, token usage, estimated cost, and any error detail.
| Field | Description |
|---|---|
| Globally-unique business key. |
| Agent that started the conversation. |
| Action being executed. |
| Lifecycle state — see Process Status below. |
| Optional reason the model stopped — see Finish Reason below. |
| Model name actually used in the response. |
| Calculated input-token total (business-information field). |
| Calculated output-token total (business-information field). |
| Estimated cost from token counts and model pricing (business-information field). |
| Captured error message when the conversation fails. |
Process Status (LlmProcessStatus)
| Code | Name | Meaning |
|---|---|---|
| Not started | Initial state after creation. |
| Running | The model is currently being called. |
| Completed | The action finished successfully. |
| Failed | The action raised an exception. |
| Terminated | The conversation was ended early. |
| Killed | The conversation was forcibly stopped. |
| Stuck | The conversation appears to be hung. |
| Waiting | Paused waiting for user input via the AskUserQuestion tool. |
| Paused | Paused administratively. |
Finish Reason (LlmFinishReason)
| Code | Name | Meaning |
|---|---|---|
| Stop | Model produced a normal stop token. |
| Length | Output truncated by the token limit. |
| Content Filter | Output blocked by a safety filter. |
| Tool Calls | Stopped to emit one or more tool calls. |
| Other | Any other reason reported by the provider. |
Message (LlmMessage)
One entry in a conversation’s transcript. Messages are recorded as immutable events ordered by messageSequence.
| Field | Description |
|---|---|
| Owning conversation. |
| Per-conversation message ordinal. |
| Who or what produced the message — see Message Roles below. |
| Message body. |
| Calculated tokens for this message. |
| Optional tool-call id when this message is a tool response. |
Message Roles (LlmMessageRole)
| Code | Name | Meaning |
|---|---|---|
| System | System prompt. |
| User | Human user input. |
| Assistant | LLM-generated response. |
| Tool | Output of a tool call returning to the LLM. |
| Question | The LLM is asking the user a structured question via the AskUserQuestion tool. |
Message Attachment (LlmMessageAttachment)
A file attached to a message (image, document, …). Stored as an immutable event.
| Field | Description |
|---|---|
| Owning message. |
| Order of the attachment within the message. |
| MIME type / category of the attached resource. |
| URL pointing to the attachment content. |
Tool Call (LlmToolCall)
A function call the LLM emitted during a message.
| Field | Description |
|---|---|
| Owning assistant message. |
| Provider-supplied tool-call id. |
| Name of the tool the LLM asked to invoke. |
| Optional JSON arguments supplied to the tool. |
Functionality
System-prompt assembly
LlmPromptAssemblyService builds the system prompt that is sent on every conversation. It concatenates contributions from the agent’s application, the agent itself, the action’s persona, every linked skill, every linked MCP client, the schema-prompt block (input/output data structures), and finally the action’s own prompt. Each contribution comes from IsPromptContributor.getPromptContribution() — implemented by LlmApplication, LlmAgent, LlmPersona, LlmSkill, McpClient, and LlmAction. Skills and MCP entries get short labels (Your skill: …, Available tool '<name>': …) before their text so the LLM can distinguish them.
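The assembly order described above can be sketched as follows. The `PromptContributor` interface is a simplified stand-in for IsPromptContributor, and the blank-line separator, the skipping of empty contributions, and the `NamedContributor` record are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for IsPromptContributor.
interface PromptContributor {
    String getPromptContribution();
}

// Hypothetical contributor that also carries a name (used for MCP clients).
record NamedContributor(String name, String text) implements PromptContributor {
    public String getPromptContribution() { return text; }
}

final class PromptAssemblySketch {
    private PromptAssemblySketch() {}

    static String assemble(PromptContributor application, PromptContributor agent,
                           PromptContributor persona,
                           List<? extends PromptContributor> skills,
                           List<NamedContributor> mcpClients,
                           String schemaBlock, PromptContributor action) {
        List<String> parts = new ArrayList<>();
        addIfPresent(parts, "", application);
        addIfPresent(parts, "", agent);
        addIfPresent(parts, "", persona);
        for (PromptContributor skill : skills) {
            addIfPresent(parts, "Your skill: ", skill);
        }
        for (NamedContributor mcp : mcpClients) {
            addIfPresent(parts, "Available tool '" + mcp.name() + "': ", mcp);
        }
        if (schemaBlock != null && !schemaBlock.isBlank()) {
            parts.add(schemaBlock.strip());
        }
        addIfPresent(parts, "", action); // the action's own prompt comes last
        return String.join("\n\n", parts); // separator is an assumption
    }

    // Skips missing contributors and blank fragments.
    private static void addIfPresent(List<String> parts, String label, PromptContributor c) {
        if (c == null) return;
        String text = c.getPromptContribution();
        if (text != null && !text.isBlank()) {
            parts.add(label + text.strip());
        }
    }
}
```

Because every contributor goes through the same interface, an entity from another module only needs to return its fragment to take part in assembly.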
Schema-prompt building
LlmActionSchemaPromptService walks the action’s LlmActionSchema rows and produces an "Input data structure:" / "Output data structure:" block describing each schema’s fields. It is used as one segment of the assembled system prompt so the model knows the shape of the data it must consume and produce.
Chat-options building
LlmChatOptionsFactory translates an action’s LlmConfiguration into a Spring AI ChatOptions (model name, max tokens, temperature, top-K, top-P, frequency penalty, presence penalty). Optional values are only set on the builder when present.
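The "set only when present" rule can be illustrated without Spring AI on the classpath. In this sketch a plain map stands in for the ChatOptions builder, and the `Configuration` record with nullable wrapper fields is a hypothetical shape for LlmConfiguration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical shape of the configuration; null means "not configured".
record Configuration(int maxTokens, Double temperature, Integer topK, Double topP,
                     Double frequencyPenalty, Double presencePenalty) {}

final class ChatOptionsSketch {
    private ChatOptionsSketch() {}

    // A map stands in for the Spring AI ChatOptions builder in this sketch.
    static Map<String, Object> toOptions(String modelName, Configuration cfg) {
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("model", modelName);
        options.put("maxTokens", cfg.maxTokens()); // mandatory, always set
        putIfPresent(options, "temperature", cfg.temperature());
        putIfPresent(options, "topK", cfg.topK());
        putIfPresent(options, "topP", cfg.topP());
        putIfPresent(options, "frequencyPenalty", cfg.frequencyPenalty());
        putIfPresent(options, "presencePenalty", cfg.presencePenalty());
        return options;
    }

    // Leaves the key out entirely so the provider default applies.
    private static void putIfPresent(Map<String, Object> options, String key, Object value) {
        if (value != null) {
            options.put(key, value);
        }
    }
}
```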
Chat-client construction
LlmChatClientFactory is the central wiring point. For each conversation it produces a Spring AI ChatClient configured with: the model bean for the action’s model, the assembled system prompt, the chat options, the resolved tool callbacks plus a per-conversation AskUserQuestion tool, the resolved advisors (action advisors + RAG advisors + a chat-memory advisor when a memory strategy is set), and any MCP tool callbacks. The resulting client is what LlmConversationService calls into.
Tool-callback resolution
LlmToolCallbackResolver looks up each LlmActionTool by its llmToolBeanName in the Spring application context and verifies that the bean is a ToolCallback. It supports an exclude set so the per-conversation AskUserQuestion callback can be inserted separately.
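A dependency-free sketch of that lookup: a `Map` stands in for the Spring application context, `ToolCallbackLike` stands in for Spring AI's ToolCallback, and keying the exclude set by bean name is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Stand-in for Spring AI's ToolCallback so the sketch has no dependencies.
interface ToolCallbackLike {
    String name();
}

final class ToolResolverSketch {
    private ToolResolverSketch() {}

    static List<ToolCallbackLike> resolve(List<String> beanNames,
                                          Map<String, Object> beans,
                                          Set<String> exclude) {
        List<ToolCallbackLike> resolved = new ArrayList<>();
        for (String beanName : beanNames) {
            if (exclude.contains(beanName)) {
                continue; // supplied separately, e.g. the per-conversation callback
            }
            Object bean = beans.get(beanName);
            if (bean == null) {
                throw new IllegalStateException("No bean named '" + beanName + "'");
            }
            if (!(bean instanceof ToolCallbackLike callback)) {
                throw new IllegalStateException(
                        "Bean '" + beanName + "' is not a tool callback");
            }
            resolved.add(callback);
        }
        return resolved;
    }
}
```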
Advisor resolution
LlmAdvisorResolver reads LlmActionAdvisor rows ordered by llmAdvisorOrder and looks up each llmAdvisorBeanName in the Spring context. The resulting list of Advisor instances is added to the chat-client pipeline.
Document-collection (RAG) resolution
LlmDocumentCollectionResolver walks the action’s LlmActionDocument rows. For each linked LlmDocumentCollection it looks up the matching VectorStore bean (by the vector store’s code) and creates a QuestionAnswerAdvisor configured with the collection’s similarity threshold and result count. The default top-K is 4 when not specified on the collection.
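The defaulting rule can be sketched on its own, leaving out the VectorStore lookup and the QuestionAnswerAdvisor construction. The `DocumentCollection` and `RetrievalSettings` records are hypothetical shapes; only the default of 4 comes from the description above.

```java
import java.util.OptionalDouble;

// Hypothetical shapes; null means the optional field was left empty.
record DocumentCollection(String collectionName, Double similarityThreshold, Integer topK) {}

record RetrievalSettings(String collectionName, OptionalDouble similarityThreshold, int topK) {}

final class RetrievalSettingsSketch {
    static final int DEFAULT_TOP_K = 4;

    private RetrievalSettingsSketch() {}

    static RetrievalSettings from(DocumentCollection collection) {
        OptionalDouble threshold = collection.similarityThreshold() == null
                ? OptionalDouble.empty()
                : OptionalDouble.of(collection.similarityThreshold());
        // Fall back to the default result count when none is configured.
        int topK = collection.topK() == null ? DEFAULT_TOP_K : collection.topK();
        return new RetrievalSettings(collection.collectionName(), threshold, topK);
    }
}
```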
MCP-client resolution
LlmMcpClientResolver opens (and caches) one McpSyncClient per McpClient entity, choosing the transport (stdio, streamable HTTP, or SSE) based on mcpTransportType and the mcpConnectionUrl. Each connected client is wrapped in a SyncMcpToolCallbackProvider whose tool callbacks are merged into the chat client. Connections are kept across conversations and closed during shutdown.
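The open-once, reuse, close-on-shutdown behaviour can be sketched with a concurrent map. The `Connection` interface is a stand-in for McpSyncClient, and the opener function hides the transport selection; both are assumptions for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Stand-in for an open MCP connection.
interface Connection extends AutoCloseable {
    @Override
    void close();
}

final class ConnectionCacheSketch implements AutoCloseable {
    private final Map<String, Connection> cache = new ConcurrentHashMap<>();
    private final Function<String, Connection> opener;

    // The opener would pick stdio, streamable HTTP, or SSE for the named client.
    ConnectionCacheSketch(Function<String, Connection> opener) {
        this.opener = opener;
    }

    // Opens the connection on first use and reuses it afterwards.
    Connection get(String clientName) {
        return cache.computeIfAbsent(clientName, opener);
    }

    // Intended to be called once during shutdown.
    @Override
    public void close() {
        cache.values().forEach(Connection::close);
        cache.clear();
    }
}
```

`computeIfAbsent` keeps the opener from running twice for the same key even when two conversations ask for the same client concurrently.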
Chat memory
LlmMemoryService materializes a ChatMemory per conversation based on the action’s LlmMemoryStrategy. FULL strategy uses an unbounded MessageWindowChatMemory; any other strategy uses a window of llmMemoryWindowSize messages. Memories are cached by conversation id and can be cleared explicitly.
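A simplified sketch of the strategy selection and per-conversation cache. `WindowMemory` is a plain sliding window standing in for Spring AI's MessageWindowChatMemory (which has additional behaviour not modelled here), and the string strategy code and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Keeps at most maxMessages entries; Integer.MAX_VALUE stands in for "unbounded".
final class WindowMemory {
    private final int maxMessages;
    private final List<String> messages = new ArrayList<>();

    WindowMemory(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    synchronized void add(String message) {
        messages.add(message);
        while (messages.size() > maxMessages) {
            messages.remove(0); // drop the oldest message first
        }
    }

    synchronized List<String> get() {
        return List.copyOf(messages);
    }
}

final class MemoryServiceSketch {
    private final Map<String, WindowMemory> byConversation = new ConcurrentHashMap<>();

    // FULL keeps everything; any other strategy keeps the last windowSize messages.
    WindowMemory memoryFor(String conversationId, String strategyType, int windowSize) {
        int limit = "FULL".equals(strategyType) ? Integer.MAX_VALUE : windowSize;
        return byConversation.computeIfAbsent(conversationId, id -> new WindowMemory(limit));
    }

    void clear(String conversationId) {
        byConversation.remove(conversationId);
    }
}
```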
Conversation execution
LlmConversationService is the primary entry point for running an action. startConversation creates an LlmConversation row with a fresh GUID. sendMessage and sendMessageWithAttachments record the user message (and any attachments), flip the conversation to RUNNING, build a chat client, and call the model. On success the service records the assistant message, persists every emitted tool call, captures token usage and estimated cost (using the model’s input/output token prices), and marks the conversation COMPLETED with the appropriate finish reason. On failure it records the error detail and marks the conversation FAILED.
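The cost estimate and the success/failure status flow can be sketched as below. The formula (tokens multiplied by the per-token price, summed for input and output) is inferred from the field descriptions; the class names, the reduced `Status` enum, and the omission of persistence and finish reasons are simplifications for illustration.

```java
import java.math.BigDecimal;
import java.util.function.Supplier;

final class CostEstimateSketch {
    private CostEstimateSketch() {}

    // Estimated cost = input tokens * input price + output tokens * output price.
    static BigDecimal estimate(long inputTokens, long outputTokens,
                               BigDecimal inputPricePerToken,
                               BigDecimal outputPricePerToken) {
        BigDecimal inputCost = inputPricePerToken.multiply(BigDecimal.valueOf(inputTokens));
        BigDecimal outputCost = outputPricePerToken.multiply(BigDecimal.valueOf(outputTokens));
        return inputCost.add(outputCost);
    }
}

// Reduced subset of the process statuses, enough to show the flow.
enum Status { NOT_STARTED, RUNNING, COMPLETED, FAILED }

final class ConversationRunSketch {
    Status status = Status.NOT_STARTED;
    String reply;
    String errorDetail;

    // Flips to RUNNING, calls the model, then records success or failure.
    void send(Supplier<String> modelCall) {
        status = Status.RUNNING;
        try {
            reply = modelCall.get();
            status = Status.COMPLETED;
        } catch (RuntimeException e) {
            errorDetail = e.getMessage();
            status = Status.FAILED;
        }
    }
}
```

`BigDecimal` is used here because per-token prices are very small fractions and binary floating point would accumulate rounding error.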
Human-in-the-loop questions
LlmAskUserQuestionService together with ErpAskUserQuestionHandler implements the AskUserQuestion pattern. When the LLM calls the AskUserQuestion tool, the handler thread blocks on a CompletableFuture for up to five minutes while the conversation status is set to WAITING and a question message (role QU, falling back to TO if the generator has not yet produced the QU code) is persisted. A separate API call (submitAnswers) resolves the future and unblocks the handler so the LLM can continue. Each conversation gets its own handler instance, bound to its conversation id.
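The block-and-resume mechanism can be sketched with `CompletableFuture` alone. Status updates and message persistence are left out, answers are reduced to a single string, and the class and parameter names are assumptions; the real service would pass a five-minute timeout.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class AskUserQuestionSketch {
    // One pending question per conversation id.
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Called on the handler thread: blocks until an answer arrives or the timeout elapses.
    String waitForUserAnswers(String conversationId, long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(conversationId, future);
        try {
            return future.get(timeout, unit);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Question failed", e.getCause());
        } finally {
            pending.remove(conversationId);
        }
    }

    // Called from a separate request; returns false when nothing is waiting.
    boolean submitAnswers(String conversationId, String answers) {
        CompletableFuture<String> future = pending.get(conversationId);
        return future != null && future.complete(answers);
    }
}
```

A caller that mirrors the described behaviour would invoke `waitForUserAnswers(id, 5, TimeUnit.MINUTES)` from the tool handler and `submitAnswers(id, …)` from the API endpoint.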
Public API
SYS_LLM_QueryApi
Read-side facade used by other modules and the UI.
| Method | Description |
|---|---|
| All |
| Action by business key, or |
| Agent by business key, or |
| Conversation by uuid, or |
| Messages of a conversation in sequence. |
| Tool by business key, or |
| MCP client by name, or |
| The most recent question message for a conversation in |
SYS_LLM_CommandApi
Write-side facade used by other modules and the UI.
| Method | Description |
|---|---|
| Creates a new conversation in |
| Posts a user message and runs the LLM; returns the assistant text or |
| Same as |
| Resolves a |
IsPromptContributor extension
IsPromptContributor is the public seam for entities that contribute a fragment to the assembled system prompt. The submodule itself implements it on LlmApplication, LlmAgent, LlmPersona, LlmAction, LlmSkill, and McpClient. Other modules that introduce their own LLM-aware entities can implement the same interface to plug into prompt assembly.
AskUserQuestion handler
ErpAskUserQuestionHandler implements the framework’s io.venlo.frame.server.ai.tools.AskUserQuestionHandler. It is constructed per conversation by LlmChatClientFactory and forwards the LLM’s structured question payload to LlmAskUserQuestionService.waitForUserAnswers, blocking until submitAnswers is called.
ViewModel actions
The submodule defines view models for the standard CRUD pages on every entity but does not declare any custom UI actions. Conversations, messages, tool calls, and attachments are surfaced as event streams under their parent conversation.