aiiop.ai is an enterprise AI platform that runs entirely within your infrastructure. It consists of three products: BizChat for natural language data queries, RAS for self-service RAG pipelines, and Fabric for ML model training with full audit trails.

aiiop RAS (Retrieval-as-a-Service) is a self-service RAG building platform. You register your vector database, LLM, and embedding model, build pipelines with a guided wizard, and publish inference endpoints. It supports pgvector, Pinecone, Milvus, Weaviate, and Qdrant, along with multiple retrieval strategies.

Does aiiop.ai send data outside my network?

No. aiiop.ai is designed for zero data movement. All processing happens within your infrastructure, whether on-premises, cloud, or hybrid. Your data never leaves your network.

What AI models does aiiop.ai support?

aiiop.ai supports multiple model families, including Mistral, LLaMA, Phi-3, Qwen, and Gemma for LLMs, as well as BERT variants, XGBoost, LightGBM, and embedding models like MiniLM and BGE-M3. For RAG, it supports OpenAI, Anthropic, Gemini, Azure OpenAI, Bedrock, and Ollama.

How is aiiop.ai different from other enterprise AI platforms?

Unlike cloud-based AI services, aiiop.ai runs entirely on your infrastructure with zero data movement. It provides full audit trails for ML training, reproducible pipelines, self-service RAG building, and lets business users query data in natural language without knowing SQL.

Enterprise AI Platform

Build enterprise AI systems with control.

aiiop.ai gives enterprise teams three things: a way to ask questions of their data, a way to build RAG pipelines on documents, and a way to train the models that power them. All run inside your infrastructure. Nothing leaves.

aiiop.ai – system status

// platform layers active
{
  "bizchat": "running",
  "fabric": "running",
  "data_movement": false,
  "control": "yours"
}

// environment
infra → on-prem | cloud | hybrid
status → all systems nominal

The Problem

Most AI projects stall before they ship

Teams get results in notebooks, then spend months trying to make those results repeatable, auditable, and safe to run on real data. Most don't get there.

Nothing is reproducible
Six months later, nobody can tell you exactly what data a model was trained on, or reproduce the result.

Data is locked behind specialists
The people who need answers from the data can't get them without filing a request to someone who knows SQL.

No path from notebook to production
A model that works in a Jupyter notebook is not a model anyone can rely on six months later.

Every project starts from scratch
No shared workflow, no reusable configs, no standard way to hand things off. Each new project reinvents the last one.

The aiiop.ai Platform

Three products. One platform.

Query your databases. Build RAG pipelines. Train ML models. All on your own infrastructure.

aiiop.ai BizChat

Conversational access to your business data

Connect your databases and ask questions in plain English. No SQL, no analyst in the middle, no data sent to an external API.

MySQL, PostgreSQL, SQL Server, ERP, CRM — connect what you have
Plain English questions, structured answers
The LLM sees field names, not your actual data
Every query logged with user and timestamp
Runs inside your network, not ours

aiiop.ai RAS

Self-service RAG building and retrieval

Build production-ready RAG pipelines with a guided wizard. Bring your own vector DB, LLM, and embedding model — cloud APIs or self-hosted on your GPU servers.

pgvector · Pinecone · Milvus · Weaviate · Qdrant · ChromaDB

Cloud LLMs (OpenAI, Anthropic, Gemini, Azure, Bedrock) or self-hosted via vLLM/llama.cpp
Self-hosted models: Llama 3, Mistral, DeepSeek, Qwen — deployed as systemd services
Embeddings: cloud APIs or self-hosted via HuggingFace TEI (nomic, BGE, E5)
Retrieval: semantic, BM25, hybrid RRF, two-stage — reranking with Cohere or cross-encoder
Publish endpoints with API keys, rate limiting, and full audit logging per org

aiiop.ai Fabric

Controlled ML training infrastructure

Run ML training on servers you already have. Every experiment is tracked, every result is reproducible, and the trained model is yours to take anywhere.

Classification · Domain Pre-training · Regression · NER · Custom

Every run captures exactly what data and config it used — at the moment it starts
Runs on any Linux server you can SSH into — on-prem, cloud VM, or a GPU box
Pre-train on your own text corpus, then use that model as the base for classification or NER
Add your own preprocessing or evaluation scripts around any training run
Package any trained model as a standalone zip — predict.py included, no Fabric needed to run it
Full audit log: who ran what, on which data, when — readable by your compliance team

How They Work Together

Three products. One infrastructure.

BizChat lets your team ask questions of structured data today. RAS lets you build RAG pipelines over documents and unstructured content. Fabric lets your data scientists train custom models that power both.

You don't have to use all three. Start with whichever solves your immediate problem. But when you're ready to expand, the rest of the platform is already there — same infrastructure, same control model, same audit trails.

aiiop.ai BizChat
Natural language → Database queries

aiiop.ai RAS
Documents → RAG pipelines

aiiop.ai Fabric
Data → Trained models

All on your infrastructure
Zero data movement · Full audit trails

aiiop.ai BizChat — Under the Hood

What BizChat actually does.

You connect your database, auto-generate API endpoints from your tables, and build AI agents that answer questions in natural language. The LLM generates Smart Payloads — it never sees your actual data.

No-code API generation
Connect MySQL, PostgreSQL, SQL Server, or Oracle. Select tables, define relationships, and BizChat auto-generates REST API endpoints. No backend code, no developer bottleneck.

Build AI agents visually
Configure agents with your APIs, field information, and business context. Add retry logic, caching rules, and fallback behavior. No YAML, no code, just configuration.

Smart Payloads keep data private
The LLM sees field names and types, never actual values. It generates an optimized API payload. BizChat executes it locally, fetches only required fields, and returns structured answers.

Smart pipeline with retries
Built-in retry logic, error handling, and caching. If a query fails, the agent retries with refined parameters. Results are cached to reduce latency on repeated questions.

Cloud or self-hosted LLMs
Use GPT-4, Claude, Gemini, or Azure OpenAI. Or deploy open-source models (Llama 3, Mistral, Phi-3, Qwen) on your own infrastructure for complete control.
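The Smart Payload idea can be sketched roughly as follows. This is a minimal illustration, not BizChat's implementation: the function names (`describe_schema`, `build_prompt`, `execute_payload`), the payload shape, and the use of an in-memory SQLite table are all assumptions made for the example. The point it shows is that the text sent to the LLM is built from column names and types only, while the structured payload is validated and executed locally.

```python
import sqlite3

def describe_schema(conn, table):
    """Return [(column_name, declared_type), ...]: metadata only, no row values."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]

def build_prompt(question, table, schema):
    """Build the text that would be sent to the LLM (hypothetical format)."""
    fields = ", ".join(f"{name} ({ftype})" for name, ftype in schema)
    return f"Table {table} has fields: {fields}. Question: {question}"

def execute_payload(conn, table, payload):
    """Run a structured payload locally, fetching only the requested fields."""
    allowed = {name for name, _ in describe_schema(conn, table)}
    fields = [f for f in payload["fields"] if f in allowed]
    if not fields or payload["order_by"] not in allowed:
        raise ValueError("payload references unknown fields")
    sql = (f"SELECT {', '.join(fields)} FROM {table} "
           f"ORDER BY {payload['order_by']} DESC LIMIT ?")
    return conn.execute(sql, (int(payload["limit"]),)).fetchall()

# Toy data standing in for a connected business database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, revenue REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("Acme", 1200.0), ("Globex", 800.0), ("Initech", 300.0)])

schema = describe_schema(conn, "orders")
prompt = build_prompt("Top 2 customers by revenue", "orders", schema)

# Stand-in for the LLM's response; no model is called in this sketch.
payload = {"fields": ["customer", "revenue"], "order_by": "revenue", "limit": 2}
result = execute_payload(conn, "orders", payload)
```

Checking field names against the schema before building the query keeps a malformed or hostile payload from reaching the database, which is one plausible reason to prefer a structured payload over free-form SQL from the model.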

BizChat — agent: sales-analytics

Agent Config
apis: orders, customers, products
llm: GPT-4o (cloud)
retry_logic: 3 attempts · exponential
cache_ttl: 15 min

Query Flow
input: "Top 10 customers by revenue"
schema_sent: field names only [ok]
payload: SELECT customer, SUM(...)...
api_called: /api/orders · 142ms

Privacy
data_to_llm: none (Smart Payload)
data_left_network: false
audit_logged: analyst@org [ok]

No-Code Stack

DB: MySQL · PostgreSQL · SQL Server · Oracle
LLM: GPT-4 · Claude · Gemini · Llama · Mistral
APIs: Auto-generated · REST · Cached
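The retry and caching settings shown in the agent panel (3 attempts, exponential backoff, 15 minute cache) could be sketched like this. The names `TTLCache`, `call_with_retry`, and `cached_query`, and the 0.5 second base delay, are assumptions for illustration. This sketch also simply repeats the same call on failure, whereas BizChat is described as retrying with refined parameters.

```python
import time

class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, self.clock())

def call_with_retry(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn, waiting base_delay * 2**attempt between failed attempts."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)

def cached_query(cache, question, run_query):
    """Serve repeated questions from the cache, otherwise query with retries."""
    hit = cache.get(question)
    if hit is not None:
        return hit
    result = call_with_retry(lambda: run_query(question))
    cache.put(question, result)
    return result

# 15 minutes, matching cache_ttl in the panel above.
cache = TTLCache(ttl_seconds=15 * 60)
```

Passing `clock` and `sleep` in as parameters keeps the timing behaviour testable without real waiting.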

aiiop.ai RAS — Under the Hood

What RAS actually does.

You register your vector database, LLM, and embedding model. RAS handles the chunking, indexing, retrieval, and generation. You get a production-ready inference endpoint with rate limiting and audit logging.

Bring your own everything
Register Pinecone, Qdrant, Milvus, Weaviate, ChromaDB or pgvector. Connect OpenAI, Anthropic, Gemini, or self-host Llama/Mistral via vLLM. Your keys, your infrastructure; RAS just orchestrates.

Six-step guided pipeline wizard
Configure chunking strategy, embedding model, vector store, retrieval method, reranker, and LLM. Each step validates before the next unlocks. No YAML files, no guesswork.

Advanced retrieval strategies
Semantic cosine similarity, BM25 keyword search, or hybrid RRF that merges both. Two-stage retrieval finds relevant documents first, then searches within them. Rerank with Cohere or cross-encoder models.

Self-hosted model deployment
Register GPU servers via SSH. RAS deploys vLLM (GPU) or llama.cpp (CPU) as systemd services. Embeddings via HuggingFace TEI. No public ports needed: models are accessed via SSH tunnel.

Multi-tenant with strict isolation
Every vector search and API call is scoped to an org_id. No org can access another's documents, vectors, pipelines, or logs. Full audit trail on every query.
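For readers unfamiliar with hybrid RRF, here is a minimal sketch of reciprocal rank fusion, the standard technique for merging a semantic ranking with a BM25 ranking. The function name, the document IDs, and the constant k=60 (a commonly used default) are illustrative assumptions; how RAS configures or implements fusion internally is not stated in the source.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of document IDs.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document ID for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from the two retrievers for the same query.
semantic = ["doc_a", "doc_b", "doc_c"]
bm25 = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic, bm25])
# fused == ["doc_b", "doc_a", "doc_d", "doc_c"]
```

Because RRF uses only ranks, it needs no score normalisation between cosine similarity and BM25, which is why it is a common choice for hybrid search.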

RAS — pipeline #12 · policy-docs

Pipeline Config
vector_db: Pinecone (BYO)
embedding: OpenAI text-embedding-3-small
chunking: recursive · 512t / 50 overlap
retrieval: hybrid BM25 + semantic (RRF)
reranker: Cohere Rerank 3
llm: Claude 3.5 Sonnet

Inference
endpoint: /api/v1/infer/policy-docs
rate_limit: 100 req/min
status: published

Governance
org_isolation: strict [ok]
audit_logged: all queries [ok]

Supported Components

Vector: pgvector · Pinecone · Milvus · Qdrant · Weaviate
LLM: OpenAI · Anthropic · Gemini · vLLM · llama.cpp
Embed: OpenAI · Cohere · TEI (nomic, BGE, E5)
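The "512t / 50 overlap" chunking setting in the pipeline panel can be illustrated with a simple sliding window. This sketch assumes the numbers mean 512 tokens per chunk with 50 tokens shared between neighbours, and it uses a plain fixed-size window over pre-split tokens rather than the recursive splitter named in the panel. The function name `chunk_tokens` is hypothetical.

```python
def chunk_tokens(tokens, size=512, overlap=50):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks

# 1000 placeholder tokens; a real pipeline would use the embedding model's tokenizer.
tokens = [f"t{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
# Three chunks: tokens 0-511, 462-973, and 924-999.
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of indexing some tokens twice.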

aiiop.ai Fabric — Under the Hood

What Fabric actually does.

You register a server, pick a training mode, upload your data, and run. Fabric handles the dispatch, the logging, the result collection, and the model registration. You get back a versioned artifact you can test, compare, and export.

Every run records what it used
The data config and model are locked at the moment a run starts. A year later you can go back to any run and see exactly what it was trained on, which columns were included, and which parameters were used.

Pre-train on your own language first
Before you classify anything, you can run masked language modelling on your own text: ITSM tickets, clinical notes, legal documents. The result is a model that already understands your terminology before you ask it to do anything useful.

Override any part of the pipeline
The standard training modes come with built-in scripts. If you need custom preprocessing, a different evaluation method, or your own training loop entirely, upload a plugin for that stage. It's versioned, checksummed, and pinned to the run.

Take the model and go
When a run completes, you can package it into a standalone zip: model weights, a ready-to-run predict.py, and a setup script. No Fabric installed where you deploy. No dependency on us to serve it.

The audit trail isn't just for engineers
Every action is logged with the user, the timestamp, and the resource it touched. A compliance auditor can see who approved a training config, who started a run, and which dataset it used, without opening a database.
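One way to picture "locked at the moment a run starts" and "checksummed and pinned to the run" is an immutable snapshot that stores the config and content hashes. The sketch below is an assumption about how such a record could look, not Fabric's actual data model: `RunSnapshot`, `freeze_run`, and `verify_plugin` are hypothetical names, and SHA-256 is chosen here only as a common checksum.

```python
import hashlib
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class RunSnapshot:
    """Immutable record of what a run used (hypothetical structure)."""
    run_id: int
    config_json: str    # canonical JSON copy of the training config
    data_sha256: str    # checksum of the training data bytes
    plugin_sha256: str  # checksum of the uploaded plugin script

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def freeze_run(run_id, config, data_bytes, plugin_bytes):
    """Capture config and checksums at start time; later edits cannot alter it."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return RunSnapshot(run_id, canonical,
                       sha256_hex(data_bytes), sha256_hex(plugin_bytes))

def verify_plugin(snapshot, plugin_bytes):
    """Check that a plugin file is byte-identical to the one pinned to the run."""
    return snapshot.plugin_sha256 == sha256_hex(plugin_bytes)

# Illustrative inputs only; the values are made up for the example.
config = {"mode": "CLASSIFICATION", "epochs": 3}
plugin = b"def preprocess(row):\n    return row\n"
snap = freeze_run(39, config, b"text,label\nhello,1\n", plugin)

config["epochs"] = 10  # editing the live config does not touch the snapshot
```

Storing the config as a serialized copy rather than a reference is what makes the record independent of later edits, and comparing hashes lets anyone confirm which plugin version a run actually used.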

Fabric — run #38 · DOMAIN_PRETRAIN

Run
purpose: DOMAIN_PRETRAIN
corpus: itsm_tickets_250k.csv
model: distilbert-base-uncased
server: prod-gpu-01 · on-prem

Result
mlm_loss: 0.3812 ↓
perplexity: 1.464
status: completed

Artifact
registered: model_registry [ok]
packaged: distilbert-itsm-v1.zip
next_run: base → run #39 · CLASSIFICATION

Governance
snapshot_frozen: at creation [ok]
data_left_network: false
audit_logged: engineer@org [ok]

Model Families Available

DistilBERT · BERT · RoBERTa · ALBERT
XGBoost · LightGBM · CatBoost · RF
Mistral · LLaMA · Phi-3 · Qwen · Gemma
MiniLM · BGE-M3 · E5-Large
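The two result figures in the run panel above are related: perplexity is conventionally the exponential of the mean cross-entropy loss. Assuming mlm_loss is reported as a mean loss in nats over the masked tokens (the source does not say so explicitly), the panel's numbers are consistent with each other, as this short check shows. The function name `perplexity` is just for illustration.

```python
import math

def perplexity(mean_cross_entropy_loss):
    """Perplexity is exp(loss) when the loss is mean cross-entropy in nats."""
    return math.exp(mean_cross_entropy_loss)

ppl = perplexity(0.3812)
# round(ppl, 3) == 1.464, matching the perplexity shown in the run panel
```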

Use Cases

Where it gets used

A few examples. The pattern is the same: data that can't go outside, questions that need answers, models that need to stay reproducible.

Internal question-answering

A team that needs answers from their own data without sending that data to OpenAI or any other external API.

BizChat

Document Q&A with source citation

Build RAG pipelines over policy documents, contracts, or technical manuals. Get answers with exact source references.

RAS

Getting answers out of fragmented data

When the data is spread across five systems and nobody on the business side knows SQL, BizChat becomes the layer that connects questions to answers.

BizChat

Legal and compliance research

Query thousands of contracts, regulations, or case files. Hybrid search with reranking ensures relevant results even with legal jargon.

RAS

ML training with an audit trail

Teams who need to be able to answer "what was that model trained on?" — whether for a regulator, a client, or their own QA process.

Fabric

Classification on specialist text

ITSM ticket routing, contract classification, clinical triage — cases where a generic model doesn't know your terminology and you need to train it on yours.

Fabric

Work With Us

We design and implement AI systems with you

We're not a software vendor that hands over a licence and moves on. We work with your team from the start — understanding your setup, building on our platform, and making sure what we build actually runs in your environment.

Discovery & Design
Before anything gets built, we need to understand what you have, what you're trying to do, and what can't change.

Implementation
We configure and deploy on your infrastructure, not a generic demo environment.

Operationalize
Working software your team understands and can keep running without us in the room.

Scale & Evolve
Once the first use case is running, we can help you think about what comes next.


Get in Touch

Your infrastructure.
Your data.
Your AI.
