Implementing RAG in AI with Pimcore: A Technical Perspective

Introduction

This article is the second part of our RAG + AI series. In Part One, we explored Retrieval-Augmented Generation from a business and data-management perspective: why enterprises struggle with unreliable AI, and how a strong data foundation, especially one based on Pimcore, reduces hallucinations and supports consistent, trustworthy results.

This part goes deeper.

It is written for developers, solution architects, and IT and data leads who need to understand not just what RAG is, but how to implement it effectively. We'll cover:

  • RAG system architecture and core components
  • Why Pimcore works as an enterprise data source for AI
  • Practical data preparation strategies
  • Integration options and implementation patterns
  • A real-world example from the field
  • Technical safeguards and best practices

By the end, you will have a clear understanding of how RAG systems work in practice and how to build a solution that is stable, scalable, and aligned with enterprise data governance.

RAG System Architecture Explained

Retrieval-Augmented Generation (RAG) is an AI architecture pattern that combines two capabilities:

  • The reasoning abilities of Large Language Models (LLMs)
  • Real-time retrieval from trusted business data sources

LLMs alone are not reliable enough for enterprise use. They operate on static training data and frequently generate confident but incorrect responses. RAG mitigates this by grounding model outputs in actual company data. To understand how to implement RAG with Pimcore, we first need a clear view of the architecture.

Core Components of a Modern RAG System

At its core, a RAG system consists of several key components working together:

  • Data Source Layer: your trusted business data (products, documentation, policies, customer information)
  • Embedding Pipeline: converts text into semantic vectors that capture meaning, not just keywords
  • Vector Database: stores and indexes these embeddings for fast similarity search
  • Retrieval Engine: finds relevant context based on semantic similarity to user queries
  • LLM Agent: generates responses grounded in the retrieved context
  • Application Layer: manages user interactions, prompt engineering, and response delivery

These components deliver value only when they work together. When a user asks a question, the system doesn't just search for matching keywords: it embeds the query, retrieves contextually relevant information by semantic similarity, and uses that as the foundation for the LLM's response. This is why retrieval quality depends so heavily on having structured, current, and well-indexed data.
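The flow described above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the `embed` function is a toy bag-of-words stand-in (a real embedding model captures meaning, not just shared words), and the documents, function names, and prompt wording are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lower-cased term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, index: list, k: int = 2) -> list:
    """Return the k indexed texts most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query: str, context: list) -> str:
    """Ground the LLM by placing the retrieved context ahead of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only the context below.\n\nContext:\n{joined}\n\nQuestion: {query}"


documents = [
    "The X200 drill has a 750 W motor and a 13 mm chuck.",
    "Returns are accepted within 30 days of delivery.",
    "The X200 drill ships with two 18 V batteries.",
]
index = [(doc, embed(doc)) for doc in documents]  # the "vector database"
question = "What motor does the X200 drill have?"
top = retrieve(question, index, k=2)
prompt = build_prompt(question, top)  # this string is what the LLM would receive
```

In a real system, `embed` would call an embedding model and `index` would live in a vector database, but the order of operations stays the same: embed, retrieve, then prompt.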

Turning Enterprise Data into a Retrieval Layer with Pimcore

Pimcore sits at the center of this architecture for good reason. It's not just another data management system; it's a comprehensive data foundation that centralizes your enterprise data in one place.

What makes Pimcore technically suitable as a RAG data source?

  1. Real-time API Access: Pimcore exposes data through GraphQL, REST APIs, and DataHub, giving you flexible options for data extraction and synchronization. You're not locked into batch exports: you can query data on demand or set up event-driven updates.
  2. Structured Data Model: Unlike unstructured document repositories, Pimcore enforces schema and data types. This means your AI system receives consistent, predictable data that's easier to embed and retrieve accurately.
  3. Multilingual Support: Pimcore handles localized content natively. If you need to support queries in multiple languages, your data is already structured to match.
  4. Relationship Modeling: Pimcore captures connections such as product variants, cross-sells, and document relationships, which can be crucial for contextual AI responses.

Diagram showing Pimcore at the center of a RAG ecosystem

Structuring Pimcore Content for AI Query Compatibility

Having data in Pimcore is one thing. Making it useful for AI retrieval is another. Here's how to prepare your Pimcore data for optimal RAG performance:

Define Clear Semantic Units

Not every field or attribute needs to be embedded separately. Think in terms of "semantic chunks": pieces of information that make sense on their own. For product data, these might be:

  • Full product descriptions with key attributes
  • Technical specifications grouped by category
  • Usage instructions or safety information
  • Customer reviews or Q&A pairs
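As a rough illustration of semantic units, the sketch below turns one product record into several self-contained text chunks. The record layout (`key_attributes`, `specifications`, `safety`) is a hypothetical shape chosen for the example, not a Pimcore schema; map it to your own class definitions.

```python
def product_chunks(product: dict) -> list:
    """Split one product record into chunks that each make sense on their own."""
    name = product["name"]
    chunks = []

    # Description plus key attributes form one unit.
    summary = f"{name}. {product['description']}"
    attrs = product.get("key_attributes", {})
    if attrs:
        summary += " Key attributes: " + ", ".join(f"{k}: {v}" for k, v in attrs.items()) + "."
    chunks.append(summary)

    # One chunk per specification group, always prefixed with the product name
    # so the chunk stays meaningful when retrieved in isolation.
    for group, specs in product.get("specifications", {}).items():
        details = "; ".join(f"{k} = {v}" for k, v in specs.items())
        chunks.append(f"{name} - {group} specifications: {details}.")

    if product.get("safety"):
        chunks.append(f"{name} - Safety information: {product['safety']}")
    return chunks


product = {
    "name": "X200 Drill",
    "description": "Compact cordless drill for home use.",
    "key_attributes": {"voltage": "18 V", "weight": "1.4 kg"},
    "specifications": {
        "Motor": {"power": "750 W", "max speed": "1800 rpm"},
        "Chuck": {"size": "13 mm"},
    },
    "safety": "Wear eye protection.",
}
chunks = product_chunks(product)
```

Repeating the product name in every chunk costs a few tokens but avoids retrieving a specification block that no longer says which product it describes.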

Enrich with Metadata

Add contextual information that helps the retrieval engine understand what each chunk represents:

  • Entity type (product, document, customer, policy)
  • Category or classification
  • Last updated timestamp
  • Language and locale
  • Access level or permissions
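One way to carry this metadata is to store it next to each chunk and filter on it before (or during) similarity search. The field names and the `visible_chunks` helper below are assumptions for illustration; most vector databases offer an equivalent metadata filter of their own.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Chunk:
    text: str
    entity_type: str      # e.g. "product", "document", "policy"
    category: str
    updated_at: datetime
    locale: str           # e.g. "en", "de"
    access_level: str     # e.g. "public" or "internal" (hypothetical levels)


def visible_chunks(chunks: list, locale: str, allowed_levels: set) -> list:
    """Keep only chunks the current user may see, in the language they asked in."""
    return [c for c in chunks if c.locale == locale and c.access_level in allowed_levels]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
all_chunks = [
    Chunk("X200 drill overview", "product", "tools", now, "en", "public"),
    Chunk("X200 Bohrmaschine Übersicht", "product", "tools", now, "de", "public"),
    Chunk("X200 purchase price list", "document", "pricing", now, "en", "internal"),
]
public_en = visible_chunks(all_chunks, locale="en", allowed_levels={"public"})
```

Filtering on access level at retrieval time means restricted content never reaches the prompt, which is easier to reason about than asking the LLM to withhold it.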

Keep It Fresh

RAG systems need current data. Set up:

  • Event-driven updates when Pimcore objects change
  • Scheduled re-indexing for bulk updates
  • Version tracking to know when embeddings are stale
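Version tracking can be as simple as storing a content hash with each embedding and comparing it against the current source text. The sketch below assumes you can obtain a mapping of object ID to embeddable text from Pimcore; the function names are hypothetical.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of the text that was embedded."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_stale(source: dict, indexed_hashes: dict) -> tuple:
    """Compare current source texts with the hashes stored at indexing time.

    Returns (ids to re-embed, ids to delete from the vector database).
    """
    to_embed = [oid for oid, text in source.items()
                if indexed_hashes.get(oid) != content_hash(text)]
    to_delete = [oid for oid in indexed_hashes if oid not in source]
    return to_embed, to_delete


source = {"p1": "Drill v2", "p2": "Hose", "p3": "New item"}
indexed = {
    "p1": content_hash("Drill v1"),   # changed since indexing
    "p2": content_hash("Hose"),       # unchanged
    "p9": content_hash("Deleted"),    # no longer in the source
}
to_embed, to_delete = find_stale(source, indexed)
```

A scheduled job can run this comparison as a safety net even when event-driven updates are in place, catching any change events that were missed.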

Factory.dev has developed specialized bundles for Pimcore that automate this synchronization process, making it straightforward to keep your vector database aligned with your source data. Contact us for more information.

Feeding Your AI: API, Embedding, and Framework Options with Pimcore

There are multiple paths to connect Pimcore data to your RAG stack. The right choice depends on your architecture, scale, and preferences.

Data Extraction Options

Pimcore DataHub

The most flexible option. Configure HTTP endpoints that expose exactly the data you need, transform data visually, and set up multiple endpoints for different use cases. For file-based exports, the Pimcore DataHub file exporter bundle lets you configure the schema you want to export. Alternatives include custom routes combined with webhooks, or a completely custom implementation. Note that some customization is still needed to expose bulk indexing.
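As a hedged sketch of what calling a DataHub GraphQL endpoint might look like from an indexing job, the helper below only builds the HTTP request and does not send it. Everything specific here is an assumption to verify against your installation: the endpoint path, the `X-API-Key` header, the `getProductListing` query (which presumes a data object class named Product), and the `name` and `description` fields.

```python
import json
import urllib.request


def build_listing_request(base_url: str, config_name: str, api_key: str,
                          first: int = 100, after: int = 0) -> urllib.request.Request:
    """Build (but do not send) a paginated GraphQL listing request."""
    query = """
    query ($first: Int, $after: Int) {
      getProductListing(first: $first, after: $after) {
        totalCount
        edges { node { id name description } }
      }
    }
    """
    payload = json.dumps({"query": query, "variables": {"first": first, "after": after}})
    return urllib.request.Request(
        # Assumed endpoint layout; check the path shown in your DataHub configuration.
        url=f"{base_url.rstrip('/')}/pimcore-graphql-webservices/{config_name}",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


req = build_listing_request("https://pimcore.example/", "products", "secret", first=50, after=100)
# To execute: urllib.request.urlopen(req), then json.load() the response body.
```

Paging through the listing with `first` and `after` is one way to implement the bulk indexing mentioned above without loading the whole catalog in a single call.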

Custom implementation

If you don’t want to use DataHub, you can build a custom implementation. Keep in mind the two requirements already mentioned: event-driven updates and bulk indexing. They are the starting points for designing your custom architecture.

Embedding Pipeline Strategies

Once you have data flowing from Pimcore, you need to convert it into embeddings:

Batch Processing: Load all data at once for initial indexing. This is useful when setting up a new RAG system or adding new attributes to your embedding strategy.

Incremental Updates: As Pimcore objects change, only re-embed the affected chunks. This keeps your vector database current without a full re-indexing.

Chunking Strategy: Split large documents or descriptions into smaller pieces (typically 200-500 tokens). This improves retrieval precision, and users get exactly the context they need, not entire documents. 
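A minimal chunking sketch follows. It approximates tokens with whitespace-separated words (a real pipeline would count tokens with the tokenizer of its embedding model) and adds a small overlap between neighbouring chunks, which is a common refinement but an assumption here rather than something the strategy above requires.

```python
def chunk_words(text: str, max_tokens: int = 300, overlap: int = 30) -> list:
    """Split text into windows of at most max_tokens words, sharing `overlap` words."""
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be non-negative and smaller than max_tokens")
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # the current window already reaches the end of the text
    return chunks


long_text = " ".join(f"w{i}" for i in range(700))
pieces = chunk_words(long_text, max_tokens=300, overlap=30)
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of embedding a few words twice.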

Framework Options

LangChain, n8n, Flowise: These are among the most widely used tools for building RAG applications. They provide pre-built components for document loading, chunking, embedding, retrieval, and LLM integration.

The key benefits of using a framework are:

  • No vendor lock-in
    • Use any AI model or any vector database
    • Switch models and vector databases easily
  • Write your adapter once
  • Fine-tune your data easily
  • Extend the framework’s functionality
  • Use the agentic approach to easily scale your use cases
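The "write your adapter once" idea can be illustrated with two small interfaces. This is a sketch of the pattern only: the `Embedder` and `VectorStore` protocols and the toy implementations are hypothetical, and frameworks such as LangChain define their own equivalents.

```python
from typing import Protocol


class Embedder(Protocol):
    def embed(self, text: str) -> list: ...


class VectorStore(Protocol):
    def upsert(self, item_id: str, vector: list, text: str) -> None: ...
    def search(self, vector: list, k: int) -> list: ...


class LengthEmbedder:
    """Toy embedder (text length and word count) used only to make the sketch runnable."""
    def embed(self, text: str) -> list:
        return [float(len(text)), float(len(text.split()))]


class InMemoryStore:
    """Toy vector store using squared Euclidean distance."""
    def __init__(self) -> None:
        self._rows = {}

    def upsert(self, item_id: str, vector: list, text: str) -> None:
        self._rows[item_id] = (vector, text)

    def search(self, vector: list, k: int) -> list:
        def distance(row):
            return sum((a - b) ** 2 for a, b in zip(row[0], vector))
        return [text for _, text in sorted(self._rows.values(), key=distance)[:k]]


class Indexer:
    """Depends only on the two interfaces, so either side can be swapped."""
    def __init__(self, embedder: Embedder, store: VectorStore) -> None:
        self.embedder = embedder
        self.store = store

    def add(self, item_id: str, text: str) -> None:
        self.store.upsert(item_id, self.embedder.embed(text), text)

    def query(self, text: str, k: int = 1) -> list:
        return self.store.search(self.embedder.embed(text), k)


indexer = Indexer(LengthEmbedder(), InMemoryStore())
indexer.add("a", "short")
indexer.add("b", "a much longer product description")
```

Replacing `LengthEmbedder` with a client for a hosted embedding model, or `InMemoryStore` with a real vector database, would not require changes to `Indexer` or to the code that calls it.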

Factory.dev provides pre-built integrations that work with these frameworks, significantly reducing implementation effort.

Moving Beyond RAG: AI Agents + MCP with Pimcore

Traditional LLM applications follow a simple pattern: the user sends a prompt, the LLM generates a response, and the conversation ends. This works for straightforward queries, but it breaks down when tasks require: 

  • Multiple steps to complete
  • Access to external data sources
  • Decision-making based on intermediate results 
  • Ability to correct course when initial attempts fail

AI Agents and Why They Matter

AI agents change this paradigm. An agent is an LLM-powered system that:

  1. Receives a goal (not just a prompt)
  2. Plans how to achieve it (breaks down into sub-tasks)
  3. Executes actions (calls tools, queries APIs, processes data)
  4. Evaluates results (checks if the goal is met)
  5. Iterates (tries different approaches if needed)
  6. Completes the task (returns the final result)

This is fundamentally different from a single LLM call. The agent operates in a loop, continuously reasoning about what to do next based on the information it has gathered.
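The loop can be sketched as follows. To keep the example runnable without an LLM, the planner is a scripted stand-in: in a real agent, `plan_next` would be an LLM call that sees the goal and the history. The action format, tool names, and catalog are all assumptions made for illustration.

```python
def run_agent(goal: str, plan_next, tools: dict, max_steps: int = 5) -> str:
    """Plan, act, observe, and repeat until the planner finishes or the step limit is hit."""
    history = []
    for _ in range(max_steps):
        action = plan_next(goal, history)              # plan the next step
        if action["type"] == "finish":
            return action["answer"]                    # complete the task
        tool = tools.get(action["tool"])
        if tool is None:
            observation = f"error: unknown tool {action['tool']!r}"
        else:
            observation = tool(**action["args"])       # execute the action
        history.append((action, observation))          # feed the result back in
    return "stopped: step limit reached"


CATALOG = ["X200 drill", "Garden hose"]


def search(query: str) -> list:
    return [p for p in CATALOG if query.lower() in p.lower()]


def scripted_planner(goal: str, history: list) -> dict:
    """Stand-in for the LLM: try a narrow query first, broaden it once if nothing is found."""
    if not history:
        return {"type": "call", "tool": "search", "args": {"query": "cordless drill 18V"}}
    _, last_result = history[-1]
    if not last_result and len(history) == 1:
        return {"type": "call", "tool": "search", "args": {"query": "drill"}}
    return {"type": "finish", "answer": "Found: " + ", ".join(last_result)}


answer = run_agent("Find a drill", scripted_planner, {"search": search})
```

The `max_steps` guard matters in production: without it, a planner that never decides it is finished would loop (and incur LLM costs) indefinitely.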

Understanding the Model Context Protocol (MCP)

The Model Context Protocol is an open standard (developed by Anthropic) that defines how AI models can safely and reliably interact with external systems. Think of it as a contract between your application and the AI agent.

MCP enables:

  • Structured tool definitions: Your functions become “tools” the agent can see and call
  • Type safety: Parameters and return values are typed and validated
  • Safe execution: Agents can’t execute arbitrary code, only defined tools
  • Extensibility: Add new tools without retraining the model
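To make these properties concrete, here is a simplified sketch of a tool definition and the validation a server might perform before executing a call. MCP describes each tool with a name, a description, and a JSON Schema for its input; the tiny validator below covers only the subset needed for the example and is not a substitute for an MCP SDK or a full JSON Schema library.

```python
TOOLS = {
    "latestOrders": {
        "description": "Get latest orders for a user session with pagination",
        "inputSchema": {
            "type": "object",
            "properties": {
                "userSessionId": {"type": "string"},
                "perPage": {"type": "integer"},
                "page": {"type": "integer"},
            },
            "required": ["userSessionId"],
        },
    }
}

PYTHON_TYPES = {"string": str, "integer": int}


def validate_call(name: str, args: dict) -> dict:
    """Reject anything that is not a defined tool called with well-typed arguments."""
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    schema = TOOLS[name]["inputSchema"]
    for key in schema["required"]:
        if key not in args:
            raise ValueError(f"missing required parameter: {key}")
    for key, value in args.items():
        prop = schema["properties"].get(key)
        if prop is None:
            raise ValueError(f"unexpected parameter: {key}")
        expected = PYTHON_TYPES[prop["type"]]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"parameter {key} must be of type {prop['type']}")
    return args


checked = validate_call("latestOrders", {"userSessionId": "abc", "page": 2})
```

Because the agent can only name tools that exist in `TOOLS`, adding a capability means adding a definition, and removing one means the agent can no longer reach it.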

These tools are exactly what the agent sees. We can present it with one or more RAG databases as tools, attach Pimcore’s functions and routes as tools, and much more. We can also extend the set of tools available to the agent later.

Implementing Pimcore as an MCP Server

To turn Pimcore into an MCP server, you need three layers:

1. Tool Definitions: What can the agent do?

2. Adapters: How do tool calls map to Pimcore operations?

3. Security Layer: Who can do what?
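A compact way to picture how the three layers interact: the security layer decides which tools a caller may use, and the adapter layer maps an allowed tool to a Pimcore route. The roles, the `searchProducts` tool, and its `/search-products` path are hypothetical; only the `latestOrders` tool and `/latest-orders` route come from the controller example in this article.

```python
# Security layer: which role may call which tool (hypothetical roles).
ROLE_TOOLS = {
    "guest": {"searchProducts"},
    "customer": {"searchProducts", "latestOrders"},
}

# Adapter layer: how each tool maps to an HTTP method and Pimcore route.
ROUTES = {
    "searchProducts": ("GET", "/search-products"),
    "latestOrders": ("GET", "/latest-orders"),
}


def visible_tools(role: str) -> list:
    """Tool definitions layer: only advertise tools the role is allowed to call."""
    return sorted(ROLE_TOOLS.get(role, set()))


def resolve_call(role: str, tool: str) -> tuple:
    """Check permissions first, then translate the tool call into a route."""
    if tool not in ROLE_TOOLS.get(role, set()):
        raise PermissionError(f"role {role!r} may not call {tool!r}")
    return ROUTES[tool]
```

Checking permissions both when listing tools and when resolving a call is deliberate: hiding a tool from the agent is not enough if the call path itself would still accept it.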

As we mentioned earlier, using a framework like LangChain or similar tools allows us to easily define agents and expose the tools they have access to.

The official MCP PHP SDK is currently in development. The project is a collaboration between the PHP Foundation and the Symfony project. It adopts development practices and standards from the Symfony project, including Coding Standards and the Backward Compatibility Promise.

In the meantime, the idea is to create a bridge between Pimcore and the MCP server. You could leverage the FastMCP Python implementation and expose Pimcore functions. For example, you could mark exposable functions and parameters with PHP attributes.

Modern PHP 8+ attributes make tool definition elegant. Here’s an example:

<?php

namespace App\Controller\MCP;

// Project-specific classes (OrderService, OrderServiceFactory, SiteService, AiService,
// OrderHistoryListingAdapter, OrderDataMapper) are placeholders; import them from your app.
use Factory\AiBundle\Attribute\MCPParameter;
use Factory\AiBundle\Attribute\MCPTool;
use Factory\AiBundle\Attribute\MCPToolServer;
// Assumed paginator: KnpPaginator, which matches the paginate()/getItems() calls below.
use Knp\Component\Pager\PaginatorInterface;
use Symfony\Component\HttpFoundation\JsonResponse;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\Response;
// On Symfony 6.4+ you can use Symfony\Component\Routing\Attribute\Route instead.
use Symfony\Component\Routing\Annotation\Route;

#[MCPToolServer(
    name: 'server-name',
    description: 'MCP test server',
    port: 8000
)]
class MyMCPController
{
    private OrderService $orderService;

    public function __construct(
        OrderServiceFactory $orderServiceFactory,
        SiteService $siteService,
        private AiService $aiService,
        private PaginatorInterface $paginator
    ) {
        // Resolve the order service for the current site.
        $this->orderService = $orderServiceFactory->make($siteService->getCurrentSiteIdentifier());
    }

    /**
     * Get latest orders for a user session with pagination
     *
     * @param string $userSessionId User session identifier
     * @param int $perPage Number of orders per page
     * @param int $page Current page number
     * @return Response JSON response with order data
     */
    #[MCPTool(
        name: 'latestOrders',
        description: 'Get latest orders for a user session with pagination'
    )]
    #[Route('/latest-orders', methods: ['GET'])]
    public function latestOrdersAction(
        Request $request,
        #[MCPParameter(
            type: 'string',
            description: 'User session identifier'
        )]
        string $userSessionId = '',
        #[MCPParameter(
            type: 'integer',
            description: 'Number of orders per page'
        )]
        int $perPage = 10,
        #[MCPParameter(
            type: 'integer',
            description: 'Current page number'
        )]
        int $page = 1
    ): Response {
        // Authenticate the user here. This can be achieved in several ways;
        // this code is for example purposes only.

        // Read the parameters from the query string, falling back to the defaults above.
        $userSessionId = (string) $request->query->get('userSessionId', $userSessionId);
        $perPage = max(1, $request->query->getInt('perPage', $perPage)); // avoid division by zero
        $page = max(1, $request->query->getInt('page', $page));

        if ($userSessionId === '') {
            return new JsonResponse(['error' => 'User session ID is required'], 400);
        }

        $customer = $this->aiService->getUserByCustomSessionId($userSessionId);
        if (!$customer) {
            return new JsonResponse(['error' => 'Customer not found'], 404);
        }

        // NOTE: $searchIds must be used to restrict the listing to this customer's orders.
        // How the filter is applied depends on your adapter and is omitted in this sample.
        $searchIds = $this->orderService->getOrderHistoryCustomerIds($customer);
        $orderListing = new OrderHistoryListingAdapter();

        $paginator = $this->paginator->paginate($orderListing, $page, $perPage);
        $orderMapped = OrderDataMapper::list((array) $paginator->getItems())->toArray($orderListing);

        return new JsonResponse([
            'orders' => $orderMapped,
            'pagination' => [
                'currentPage' => $page,
                'perPage' => $perPage,
                'totalItems' => $paginator->getTotalItemCount(),
                'totalPages' => (int) ceil($paginator->getTotalItemCount() / $perPage),
            ],
        ]);
    }
}

Now that you have an idea of how to use modern PHP attributes, you could use a custom script that would read the PHP attributes and generate a FastMCP server that would actually proxy the calls from the AI Agent to Pimcore/Symfony.
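A sketch of the Python side of such a bridge is shown below. It assumes a hypothetical JSON manifest that your script exports from the PHP attributes, and turns each tool entry into a proxy function. To stay runnable without network access or extra packages, each proxy only builds the URL it would call; a real bridge would perform the HTTP request and register every proxy as a tool on a FastMCP server.

```python
import json
from urllib.parse import urlencode

# Hypothetical manifest format, exported from the MCPToolServer/MCPTool/MCPParameter attributes.
MANIFEST = json.loads("""
{
  "server": {"name": "server-name", "port": 8000},
  "tools": [
    {
      "name": "latestOrders",
      "description": "Get latest orders for a user session with pagination",
      "method": "GET",
      "path": "/latest-orders",
      "parameters": {"userSessionId": "string", "perPage": "integer", "page": "integer"}
    }
  ]
}
""")


def make_proxy(tool: dict, base_url: str):
    """Create a function that forwards a tool call to the matching Pimcore route."""
    def proxy(**kwargs) -> str:
        unknown = set(kwargs) - set(tool["parameters"])
        if unknown:
            raise TypeError(f"unexpected parameters: {sorted(unknown)}")
        # A real bridge would send this request and return the JSON response.
        return f"{base_url.rstrip('/')}{tool['path']}?{urlencode(kwargs)}"

    proxy.__name__ = tool["name"]
    proxy.__doc__ = tool["description"]  # the agent sees this as the tool description
    return proxy


proxies = {t["name"]: make_proxy(t, "https://pimcore.example") for t in MANIFEST["tools"]}
url = proxies["latestOrders"](userSessionId="abc", page=2)
```

Generating the proxies from a manifest keeps the PHP attributes as the single source of truth: adding a tool in Pimcore and re-running the export is enough to expose it to the agent.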

Building User-Facing AI Experiences: Chat Interfaces

You've built a powerful RAG system. Your Pimcore data is structured and indexed. Your AI agents can query, reason, and take action through MCP. Now comes the critical question: How do you present this to your users?

The interface between humans and AI systems is where theoretical capability meets practical value. A brilliant AI system with a poor interface frustrates users and fails to deliver ROI. Conversely, a well-designed interface makes AI feel magical, even when the underlying technology is straightforward.

Core Chat Interface Components

Most AI interactions now happen through conversational UI because it allows flexible, natural-language exchanges. A production-ready chat component should support:

  • Real-time message updates (WebSocket or SSE)
  • Typing indicators
  • Message history loading
  • Rich content rendering (cards, buttons, images)
  • Input with file upload support
  • Mobile-responsive design
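For the real-time updates, Server-Sent Events are often the simpler choice when messages flow mainly from server to browser. The sketch below formats a streamed answer as SSE frames; the event names (`status`, `message`) and payload keys are assumptions for illustration, and your web framework would write these strings to a `text/event-stream` response.

```python
import json
from typing import Iterable, Iterator, Optional


def sse_event(data: dict, event: Optional[str] = None) -> str:
    """Format one Server-Sent Events frame; a blank line terminates the frame."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def stream_answer(tokens: Iterable[str]) -> Iterator[str]:
    """Emit a typing indicator, then each token as it arrives, then a completion marker."""
    yield sse_event({"state": "typing"}, event="status")
    for token in tokens:
        yield sse_event({"delta": token}, event="message")
    yield sse_event({"state": "done"}, event="status")


frames = list(stream_answer(["Hel", "lo"]))
```

In the browser, an `EventSource` listener for the `message` event would append each `delta` to the chat bubble, while `status` events drive the typing indicator.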

This creates an experience that feels modern and reliable.

Core UX Principles for Enterprise AI

Modern AI interfaces require thoughtful UX, predictable behavior, and transparency. Below are the key components when deploying a user-facing AI experience on top of Pimcore and your RAG stack.

1. Be Transparent About AI Behavior

Users must immediately understand they are interacting with an AI system. Don't pretend to be human: it erodes trust when users discover the truth.

Example:

✓ "I'm an AI assistant here to help you find products."

✗ "Hi, I'm Sarah from customer service!"

2. Set Clear Expectations

Tell users:

  • What the AI can do
  • What it cannot do
  • When a human handoff is available

Example:

Welcome! I can help you:

• Find products based on your needs

• Answer questions about specifications

• Check inventory and pricing

• Compare options

I cannot:

• Process orders (but I can guide you to checkout)

• Make promises about delivery dates

• Override pricing or policies

3. Provide Escape Hatches

Always offer ways to:

  • Talk to a human if needed (you can expose a handoff tool through MCP to provide this bridge)
  • Start over if the conversation gets stuck

4. Show Progress and State

For long-running tasks, indicate what's happening:

🔍 Searching inventory...

✓ Found 47 matching products

📊 Analyzing your preferences...

✓ Ranked by relevance

5. Make Responses Scannable

Large blocks of text are hard to read. Use:

  • Short paragraphs
  • Bullet points
  • Bold for emphasis
  • Cards or structured data displays
  • Markdown output from the LLM
    • Convert the Markdown response to HTML before rendering

6. Don't Rely Solely on Text

Post-process the AI response to recognize SKUs, and return HTML that renders each one as an interactive product card.
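A minimal sketch of that post-processing step is shown below. The SKU format (`SKU-` followed by digits), the catalog lookup, and the card markup are all hypothetical; in practice the lookup would query Pimcore and the card would be a richer component.

```python
import html
import re

SKU_PATTERN = re.compile(r"\bSKU-\d{4,}\b")  # hypothetical SKU format

CATALOG = {"SKU-1001": {"name": "X200 Drill", "url": "/products/x200"}}


def render_cards(answer: str, catalog: dict) -> str:
    """Escape the LLM text, then replace each known SKU with a product-card link."""
    def card(match):
        sku = match.group(0)
        product = catalog.get(sku)
        if product is None:
            return sku  # unknown SKU: leave the text untouched
        href = html.escape(product["url"], quote=True)
        name = html.escape(product["name"])
        return f'<a class="product-card" href="{href}" data-sku="{sku}">{name}</a>'

    # Escape first so any markup in the model output is never rendered as HTML.
    return SKU_PATTERN.sub(card, html.escape(answer))


rendered = render_cards("Try SKU-1001 or SKU-9999 <b>", CATALOG)
```

Validating each SKU against the catalog before rendering a card also guards against the model inventing product codes that do not exist.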

Technical Architecture of a Production Chat System

Below is the reference architecture that ties together frontend, backend, RAG components, agents, and Pimcore:

Diagram showing the architecture of a production chat system. It illustrates the flow from the Frontend (user interface) to the Backend API Layer, then to the Conversation Manager, and finally branching into three components: LLM Provider, RAG System, and MCP Tools. All three connect to the Pimcore Data Layer at the bottom.

Factory.dev Bundles That Accelerate Implementation

To reduce implementation complexity, Factory.dev provides tested Pimcore bundles built specifically for RAG, agents, and AI-driven applications.

These bundles eliminate weeks of development time and provide battle-tested patterns for production deployments. All solutions are available by request; contact us to discuss your use case.

Data Synchronization Bundle

This bundle handles the end-to-end lifecycle of preparing Pimcore data for your RAG pipeline:

  • Event-driven updates from Pimcore to vector databases
  • Configurable chunking strategies
  • Batch and incremental indexing
  • Support for multiple embedding providers

AI Agent Bundle

A foundation for agent-driven applications:

  • Prebuilt RAG pipeline integration
  • Prompt management interface
  • Response validation and quality controls
  • Usage tracking and analytics

MCP Server Implementation

Our MCP implementation exposes Pimcore functions as safe, discoverable tools for AI agents.

  • Exposes Pimcore data and functions to AI agents through the Model Context Protocol
  • Enables agentic workflows where AI can query, update, and act on Pimcore data
  • Built on Symfony with clean adapter patterns

Frontend Widget

A React component that can be embedded into a website and connects to your Pimcore instance and AI agents. It handles progress and typing indicators and supports streaming messages, which significantly shortens frontend development.

Conclusion

Building reliable AI systems requires more than a powerful model. It requires a stable data foundation, predictable retrieval, and an architecture that supports visibility, governance, and continuous improvement. Pimcore provides these fundamentals: structured data models, clear relationships, real-time access, and the flexibility to integrate AI wherever it creates value.

With a well-designed RAG pipeline and agentic capabilities, enterprises can shift from experimental prototypes to AI systems that support daily operations, reflect business logic, and deliver consistent outcomes.

If you are exploring how to bring AI into your digital operations, from retrieval systems to MCP-based agents, reach out to us. We’ll help you design an approach that fits your architecture and scales with your organization.
