Kerdos AI API Reference
A stateful, session-based REST API for document Q&A. Upload any document, build a FAISS vector index, and query it with LLaMA 3.1 8B — all in three HTTP calls.
All Endpoints
POST /sessions
Create a new isolated session. Returns a session_id.
Response: { "session_id": "uuid-v4" }

GET /sessions/{id}
Get the status and metadata of an existing session.
Response: { "session_id": "...", "created_at": "...", "doc_count": 2 }

DELETE /sessions/{id}
Delete the session and free all in-memory resources (FAISS index + documents).
Response: { "deleted": true }

POST /sessions/{id}/documents
Upload and index one or more files. Supported: PDF, DOCX, TXT, MD, CSV (max 50 MB each).
Body: multipart/form-data — field: files (one or more)
Response: { "indexed": ["report.pdf", "policy.docx"] }

POST /sessions/{id}/chat
Ask a question. Returns a grounded answer generated only from the uploaded documents. Optionally pass hf_token to use production LLaMA inference.
Body: { "question": "string", "hf_token": "hf_..." (optional) }
Response: { "answer": "Based on the document..." }

DELETE /sessions/{id}/history
Clear the conversation history for a session (keeps the indexed documents).
Response: { "cleared": true }

GET /health
Health check. Returns API status and version.
Response: { "status": "ok", "version": "1.0.0" }

Typical Integration Workflow
Create session → upload document → ask question → delete session. Three calls to get an answer, plus one to clean up.
cURL

```bash
BASE="https://kerdosdotio-kerdos-llm-rag-api.hf.space"

# 1. Create session
SESSION=$(curl -s -X POST "$BASE/sessions" | jq -r '.session_id')
echo "Session: $SESSION"

# 2. Upload a document
curl -X POST "$BASE/sessions/$SESSION/documents" \
  -F "files=@your_doc.pdf"

# 3. Ask a question
curl -X POST "$BASE/sessions/$SESSION/chat" \
  -H "Content-Type: application/json" \
  -d '{"question": "Summarise this document", "hf_token": "hf_..."}'

# 4. Delete session
curl -X DELETE "$BASE/sessions/$SESSION"
```

Python

```python
import requests

BASE = "https://kerdosdotio-kerdos-llm-rag-api.hf.space"

# 1. Create session
session_id = requests.post(f"{BASE}/sessions").json()["session_id"]
print(f"Session: {session_id}")

# 2. Upload documents
with open("your_doc.pdf", "rb") as f:
    requests.post(
        f"{BASE}/sessions/{session_id}/documents",
        files={"files": ("your_doc.pdf", f, "application/pdf")},
    )

# 3. Ask a question
resp = requests.post(
    f"{BASE}/sessions/{session_id}/chat",
    json={"question": "Summarise this document", "hf_token": "hf_..."},
)
print(resp.json()["answer"])

# 4. Clean up
requests.delete(f"{BASE}/sessions/{session_id}")
```

JavaScript

```javascript
const BASE = "https://kerdosdotio-kerdos-llm-rag-api.hf.space";

// 1. Create session
const { session_id } = await fetch(`${BASE}/sessions`, {
  method: "POST",
}).then((r) => r.json());

// 2. Upload a document
const form = new FormData();
form.append("files", fileInput.files[0]);
await fetch(`${BASE}/sessions/${session_id}/documents`, {
  method: "POST",
  body: form,
});

// 3. Ask a question
const { answer } = await fetch(`${BASE}/sessions/${session_id}/chat`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    question: "Summarise this document",
    hf_token: process.env.HF_TOKEN, // keep server-side!
  }),
}).then((r) => r.json());
console.log(answer);

// 4. Delete session
await fetch(`${BASE}/sessions/${session_id}`, { method: "DELETE" });
```

Authentication & Environment Variables
HuggingFace Token (hf_token)
Required only for the POST /sessions/{id}/chat endpoint. The token is passed in the JSON request body — never as a header.
The token must have access to meta-llama/Llama-3.1-8B-Instruct (accept the licence on HuggingFace).

Environment Variables

| Variable | Default | Description |
|---|---|---|
| HF_TOKEN | — | HuggingFace token for LLaMA inference |
| SESSION_TTL_MINUTES | 60 | Session auto-expiry in minutes |
| MAX_UPLOAD_MB | 50 | Max file size per upload in MB |
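Because sessions auto-expire after SESSION_TTL_MINUTES (60 by default), long-lived clients should be prepared to rebuild a session mid-conversation. A minimal sketch of that pattern, assuming an expired or unknown session returns HTTP 404 (the reference does not specify the status code); `createSession`, `uploadDocs`, and `ask` are hypothetical wrappers around the HTTP calls shown above:

```javascript
// Retry a chat call once with a fresh session if the old one has expired.
// Assumption (not confirmed above): an expired/unknown session yields HTTP 404.
async function askWithRetry({ createSession, uploadDocs, ask }, sessionId, question) {
  let resp = await ask(sessionId, question);
  if (resp.status === 404) {
    // Session likely aged past SESSION_TTL_MINUTES: start over.
    sessionId = await createSession();
    // The new session starts empty, so re-index before asking again.
    await uploadDocs(sessionId);
    resp = await ask(sessionId, question);
  }
  return { sessionId, resp };
}
```

Keeping the original files (or their hashes) around client-side makes the re-upload step cheap to trigger.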
Security Best Practice
Never expose hf_token in client-side code. Use a Next.js API route or server action as a proxy to keep your token server-side. For enterprise deployments, replace hf_token with your own OAuth2 / API key system.
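The proxy pattern above can be sketched as a small server-side handler. This is framework-agnostic pseudocode in the Express handler shape, not part of the Kerdos API itself; the UPSTREAM constant, route wiring, and buildChatPayload helper are illustrative:

```javascript
const UPSTREAM = "https://kerdosdotio-kerdos-llm-rag-api.hf.space";

// Build the upstream body, injecting the server-held token so it never
// reaches the browser. The browser only ever sends session_id + question.
function buildChatPayload(question, token) {
  return { question, hf_token: token };
}

// Handler sketch: mount behind your own auth, e.g. POST /api/chat.
async function chatProxy(req, res) {
  const { session_id, question } = req.body;
  const upstream = await fetch(`${UPSTREAM}/sessions/${session_id}/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatPayload(question, process.env.HF_TOKEN)),
  });
  res.status(upstream.status).json(await upstream.json());
}
```

The same shape works as a Next.js route handler or server action; the key point is that HF_TOKEN is read from the server's environment, never shipped in client bundles.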
Ready to integrate?
Try the demo now or contact us for a private, on-premise enterprise deployment with your own models and infrastructure.