*A 15‑minute end‑to‑end demo*
*By the AK‑F team – November 2025*
## Why add a personal blog?
The Adaptive Knowledge‑Fusion (AK‑F) engine already blends domain‑specific knowledge graphs, vector‑based retrieval, and a mixture‑of‑experts (MoE) router.
Bringing in an external, high‑quality source—like Swervin Curvin’s October & November 2025 archives—demonstrates:
* **Rapid ingestion** of semi‑structured web content.
* **Seamless enrichment** (entity extraction, embeddings) that feeds both the graph and the vector store.
* **Dynamic routing** that surfaces blog references only when they are relevant.
The whole pipeline runs in about 15 minutes on a modest Kubernetes cluster.
## Overview of the demo
| Step | What we do | Approx. time |
|------|------------|--------------|
| 1️⃣ | Set up the Python environment | 2 min |
| 2️⃣ | Crawl the two archive months with Scrapy | 3 min |
| 3️⃣ | Enrich, embed, and store vectors in FAISS | 4 min |
| 4️⃣ | Load the chunks into Neo4j (knowledge graph) | 3 min |
| 5️⃣ | Register a new **BlogExpert** in the MoE router | 1 min |
| 6️⃣ | Live query the engine and see blog references | 2 min |
| 7️⃣ | (Optional) quick performance check | < 1 min |
## 1️⃣ Prepare the workspace
```bash
# Clone the demo repo (contains spider, pipelines, and k8s manifests)
git clone https://github.com/yourorg/akf-blog-demo.git
cd akf-blog-demo
# Create a virtual environment and install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
`requirements.txt` bundles:
* `scrapy` – crawling
* `tiktoken` – token‑based chunking
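* `sentence-transformers` – embedding model
* `neo4j` – graph driver
* `requests` – simple HTTP calls

Step 3 also uses spaCy (with the `en_core_web_sm` model) and FAISS, and steps 5 and 7 need `kubectl` and `jq`. Install these separately if your checkout does not already provide them.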
## 2️⃣ Crawl the blog archives
The spider walks the month‑archive pages (`/2025/10/` and `/2025/11/`), follows each article link, and emits **JSON‑Lines** records—one per 2 k‑token chunk.
```bash
./run_ingest.sh
```
You’ll see log lines such as:
```
2025-10-12 08:00:00 INFO Scraped article: LoRA adapters in transformers – step‑by‑step
2025-10-12 08:00:00 INFO Produced 7 chunks for article a1b2c3…
```
Result: `data/blog_chunks.jsonl` (≈ 1–2 MB for two months).
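The chunking step can be sketched as follows. This is a minimal illustration, not the spider's actual code: it uses a whitespace split as a stand-in for the `tiktoken` tokenizer so the example has no dependencies, and the record field names (`article_id`, `chunk_id`, `chunk_index`, and so on) are assumptions based on the properties mentioned later in the Neo4j step.

```python
import json
import uuid
from typing import Callable, Dict, Iterator, List


def chunk_tokens(tokens: List[str], max_tokens: int = 2000) -> Iterator[List[str]]:
    """Yield consecutive, non-overlapping windows of at most max_tokens tokens."""
    for start in range(0, len(tokens), max_tokens):
        yield tokens[start:start + max_tokens]


def article_to_records(
    article_id: str,
    title: str,
    url: str,
    text: str,
    max_tokens: int = 2000,
    tokenize: Callable[[str], List[str]] = str.split,  # stand-in for a real tokenizer
) -> List[Dict[str, object]]:
    """Build one record per chunk; field names are hypothetical."""
    records = []
    for index, window in enumerate(chunk_tokens(tokenize(text), max_tokens)):
        records.append({
            "article_id": article_id,
            "chunk_id": str(uuid.uuid4()),
            "chunk_index": index,
            "title": title,
            "url": url,
            "text": " ".join(window),
        })
    return records


def to_jsonl(records: List[Dict[str, object]]) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

With a real tokenizer, `tokenize` would return token ids and the chunk text would be decoded from them rather than joined with spaces.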
## 3️⃣ Enrich, embed, and store vectors
```bash
python enrich_and_store.py \
--input data/blog_chunks.jsonl \
--vector-store faiss \
--embedding-model sentence-transformers/all-mpnet-base-v2 \
--output data/faiss_index
```
What happens under the hood:
| Sub‑step | Action |
|----------|--------|
| **Entity extraction** | spaCy (`en_core_web_sm`) adds a list of entities to each record. |
| **Embedding** | `model.encode(text, normalize_embeddings=True)` → 768‑dim vector. |
| **FAISS index** | `IndexIVFFlat` (`nlist=100`, `nprobe=10`). |
| **Persistence** | Writes the index and a `metadata.pkl` (UUID ↔ metadata). |
When finished you’ll see:
```
✅ Processed 1 842 chunks
✅ FAISS index written to data/faiss_index/
```
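To make the `nlist` and `nprobe` settings concrete, here is a toy inverted-file search in plain Python. It is a sketch of the idea only, not FAISS code: `ToyIVF` is a hypothetical class, the centroids are supplied by hand (FAISS learns them with k-means when the index is trained), and scoring uses the inner product of unit vectors, which ranks results the same way as L2 distance on unit vectors.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def normalize(v: Vector) -> List[float]:
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


class ToyIVF:
    """Inverted-file index: len(centroids) plays the role of nlist."""

    def __init__(self, centroids: List[Vector]):
        self.centroids = [normalize(c) for c in centroids]
        self.lists: Dict[int, List[Tuple[str, List[float]]]] = {
            i: [] for i in range(len(self.centroids))
        }

    def _rank_cells(self, v: Vector) -> List[int]:
        """Cell ids ordered from most to least similar centroid."""
        return sorted(
            range(len(self.centroids)),
            key=lambda i: dot(v, self.centroids[i]),
            reverse=True,
        )

    def add(self, chunk_id: str, vector: Vector) -> None:
        """Store the vector in the list of its nearest centroid."""
        v = normalize(vector)
        self.lists[self._rank_cells(v)[0]].append((chunk_id, v))

    def search(self, query: Vector, k: int = 3, nprobe: int = 1) -> List[Tuple[str, float]]:
        """Scan only the nprobe closest cells, then rank their vectors."""
        q = normalize(query)
        candidates: List[Tuple[str, List[float]]] = []
        for cell in self._rank_cells(q)[:nprobe]:
            candidates.extend(self.lists[cell])
        scored = sorted(((dot(q, v), cid) for cid, v in candidates), reverse=True)
        return [(cid, score) for score, cid in scored[:k]]
```

A larger `nprobe` scans more cells, which improves recall at the cost of more comparisons per query.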
## 4️⃣ Load the chunks into Neo4j
```bash
python load_into_neo4j.py \
--uri bolt://neo4j.my-cluster.svc.cluster.local:7687 \
--user neo4j \
--password "$NEO4J_PASS" \
--jsonl data/blog_chunks.jsonl
```
The script batches rows (up to 5 k per batch) and runs a query with three `MERGE` clauses for each row:
```cypher
MERGE (a:Article {id: $article_id}) // title, url, date
MERGE (c:Chunk {id: $chunk_id}) // text, index
MERGE (a)-[:HAS_CHUNK]->(c);
```
It also creates indexes on `Article.id` and `Chunk.id` for fast look‑ups. Sample output:
```
✔️ Imported 5 000 rows (batch 1)
…
✅ Neo4j import complete
```
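The batching logic might look like the sketch below. The internals of `load_into_neo4j.py` are not shown here, so treat this as one plausible shape: sending each batch as a `$rows` parameter and expanding it with `UNWIND` is an assumption, as are the property names. The `run` argument is any callable that accepts a query plus keyword parameters, for example a Neo4j driver session's `run` method.

```python
import json
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical batched form of the three MERGE clauses shown above.
MERGE_BATCH = """
UNWIND $rows AS row
MERGE (a:Article {id: row.article_id})
  SET a.title = row.title, a.url = row.url
MERGE (c:Chunk {id: row.chunk_id})
  SET c.text = row.text, c.chunk_index = row.chunk_index
MERGE (a)-[:HAS_CHUNK]->(c)
"""


def read_jsonl(lines: Iterable[str]) -> Iterator[Dict]:
    """Parse non-empty lines of a JSON Lines stream."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def batched(rows: Iterable[Dict], size: int = 5000) -> Iterator[List[Dict]]:
    """Yield lists of at most `size` rows until the input is exhausted."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def load(rows: Iterable[Dict], run: Callable[..., object], size: int = 5000) -> int:
    """Send each batch through `run` and return the number of rows imported."""
    total = 0
    for batch in batched(rows, size):
        run(MERGE_BATCH, rows=batch)
        total += len(batch)
    return total
```

Because every clause is a `MERGE` keyed on an id, re-running the import on the same file should not create duplicate nodes or relationships.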
## 5️⃣ Register the **BlogExpert** in the MoE router
The MoE router reads its expert list from a ConfigMap called `moe-router-config`. We append a lightweight retrieval expert that formats blog references.
```bash
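# A JSON Patch "add" on an existing key overwrites its value,
# so read the current expert list first and append to it.
CURRENT=$(kubectl -n akf-engine get configmap moe-router-config -o jsonpath='{.data.experts}')
UPDATED=$(printf '%s\n- name: BlogExpert\n  type: retrieval\n  weight: 0.05' "$CURRENT")
kubectl -n akf-engine patch configmap moe-router-config \
--type=merge \
-p "$(jq -n --arg experts "$UPDATED" '{data: {experts: $experts}}')"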
# Restart the router so it picks up the new config
kubectl -n akf-engine rollout restart deployment/moe-router
```
`weight: 0.05` means the router will allocate roughly 5 % of its routing capacity to this expert—enough to surface blog citations when the query hints at them.
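Appending to the expert list deserves some care, because patching an existing ConfigMap key replaces its whole value. The sketch below builds a merge-patch payload that keeps the existing entries. It assumes the `experts` key holds a YAML list stored as a string; the helper names are hypothetical, and `CoreExpert` in the usage note is only an illustrative placeholder.

```python
import json

# The new entry, indented as a valid YAML list item.
BLOG_EXPERT = "- name: BlogExpert\n  type: retrieval\n  weight: 0.05\n"


def append_expert(current_experts: str, new_entry: str = BLOG_EXPERT) -> str:
    """Return the expert list with new_entry appended (no-op if already present)."""
    if new_entry.splitlines()[0] in current_experts:
        return current_experts  # already registered, keep the list unchanged
    base = current_experts.rstrip("\n")
    return (base + "\n" if base else "") + new_entry


def merge_patch(current_experts: str) -> str:
    """Build a JSON merge-patch body that sets data.experts to the extended list."""
    return json.dumps({"data": {"experts": append_expert(current_experts)}})
```

For example, `merge_patch("- name: CoreExpert\n  type: generation\n  weight: 0.95\n")` yields a payload whose `experts` value lists both entries, which could then be passed to `kubectl patch --type=merge -p`.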
## 6️⃣ Live query – see the blog reference in action
```bash
curl -X POST "http://<gateway-svc>/v1/chat" \
-H "Content-Type: application/json" \
-d '{"messages":[{"role":"user","content":"What does Swervin Curvin say about LoRA adapters?"}]}'
```
**Typical response (truncated):**
```json
{
"role":"assistant",
"content":"Swervin Curvin’s October‑2025 post “LoRA adapters in transformers – step‑by‑step” explains that …\n\n**References**\n- LoRA adapters in transformers – step‑by‑step – https://swervincurvin.blogspot.com/2025/10/lora-adapters.html"
}
```
The **reference block** is generated by `BlogExpert`.
A non‑blog query (e.g., “latest NVIDIA H100 specs”) still returns the core knowledge‑base answer without any blog citation, confirming that the new expert only fires when appropriate.
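A client that wants to render citations separately could split the answer from its reference block. The sketch below assumes the format seen in the typical response above: a `**References**` header followed by `- title – url` lines. That format is inferred from the sample, not from a documented contract, and `split_references` is a hypothetical helper.

```python
import re
from typing import List, Tuple

REFERENCE_HEADER = "**References**"


def split_references(content: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return (answer_text, [(title, url), ...]) parsed from an assistant message."""
    body, _, block = content.partition(REFERENCE_HEADER)
    references = []
    for line in block.splitlines():
        line = line.strip()
        if not line.startswith("- "):
            continue
        # The URL is the last whitespace-free token on the line.
        match = re.search(r"https?://\S+$", line)
        if not match:
            continue
        # Titles may contain dashes, so only trim the trailing separator.
        title = line[2:match.start()].rstrip(" \u2013-")
        references.append((title, match.group(0)))
    return body.strip(), references
```

An empty reference list then signals that `BlogExpert` did not contribute to the answer, as in the non-blog query above.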
## 7️⃣ Quick performance sanity check (optional)
```bash
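# Average latency = rate of total seconds / rate of request count
curl -s http://prometheus.monitoring.svc:9090/api/v1/query \
-G --data-urlencode 'query=rate(ram_vector_search_seconds_sum[1m]) / rate(ram_vector_search_seconds_count[1m])' | jq .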
```
You should see an average latency around **0.12 s**, which suggests the extra FAISS lookup adds little overhead and leaves the existing autoscaling headroom intact.
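The arithmetic behind an average-latency figure can be sketched as follows. It assumes the metric is a Prometheus histogram or summary that exposes a `_count` series alongside `_sum`, which is not confirmed here, and the sample values in the usage note are made up for illustration.

```python
from typing import Optional


def average_latency(
    sum_start: float,
    sum_end: float,
    count_start: float,
    count_end: float,
) -> Optional[float]:
    """Mean seconds per request over a window, from cumulative _sum and _count samples.

    Returns None when no requests were observed in the window.
    """
    requests = count_end - count_start
    if requests <= 0:
        return None
    return (sum_end - sum_start) / requests
```

For instance, if `_sum` grows from 10.0 s to 16.0 s while `_count` grows from 100 to 150, the window saw 50 searches taking 6.0 s in total, so `average_latency(10.0, 16.0, 100, 150)` is about 0.12 s per search.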
## TL;DR – One‑click summary
```bash
# 1️⃣ Setup
git clone https://github.com/yourorg/akf-blog-demo.git && cd akf-blog-demo
python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt
# 2️⃣ Crawl
./run_ingest.sh
# 3️⃣ Enrich & embed
python enrich_and_store.py --input data/blog_chunks.jsonl --vector-store faiss \
--embedding-model sentence-transformers/all-mpnet-base-v2 --output data/faiss_index
# 4️⃣ Load into Neo4j
python load_into_neo4j.py --uri bolt://neo4j.my-cluster.svc.cluster.local:7687 \
--user neo4j --password "$NEO4J_PASS" --jsonl data/blog_chunks.jsonl
# 5️⃣ Register BlogExpert
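CURRENT=$(kubectl -n akf-engine get configmap moe-router-config -o jsonpath='{.data.experts}')
UPDATED=$(printf '%s\n- name: BlogExpert\n  type: retrieval\n  weight: 0.05' "$CURRENT")
kubectl -n akf-engine patch configmap moe-router-config --type=merge \
-p "$(jq -n --arg experts "$UPDATED" '{data: {experts: $experts}}')"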
kubectl -n akf-engine rollout restart deployment/moe-router
# 6️⃣ Query
curl -X POST "http://<gateway-svc>/v1/chat" -H "Content-Type: application/json" \
-d '{"messages":[{"role":"user","content":"What does Swervin Curvin say about LoRA adapters?"}]}'
```
## License
The demo code (spider, pipelines, Kubernetes manifests, and helper scripts) is released under the **MIT License**. Feel free to copy, modify, and redistribute—just keep the copyright notice and license text attached.