The AI Council

A Multi-Perspective AI Response System built on Ollama. Get diverse insights from 5 specialized AI models running across your home lab, then synthesize them into a unified consensus with text-to-speech output.

Overview

The AI Council is theLAB's multi-perspective AI response system. Instead of relying on a single AI model, it queries 5 specialized models running on Ollama across your local network, then synthesizes their responses into a unified consensus using Gemma3:27b.

Each council member has a distinct role and personality, providing different perspectives on any question. This approach helps overcome single-model biases and produces more well-rounded, thoroughly analyzed answers.

5 AI Models: diverse perspectives from specialized models
Consensus Engine: Gemma3:27b synthesizes all responses
Distributed: models spread across home lab servers
Why Multiple Models?
Single LLMs can hallucinate, miss perspectives, or get stuck in reasoning ruts. By querying multiple models with different training backgrounds and prompting them with distinct roles, we get a more complete picture that's cross-validated by the consensus engine.

Council Members

The AI Council consists of 5 specialized models, each with a unique role. Models are distributed across localhost and 192.168.1.100:

Gemma3 (gemma3:12b): Primary Assistant
Comprehensive analysis and detailed responses. The main workhorse providing thorough, well-structured answers.
Endpoint: localhost:11434

Qwen3 (qwen3:4b): Quick Insights
Fast, concise responses and key insights. Cuts through complexity to deliver the essentials quickly.
Endpoint: 192.168.1.100:11434

WizardLM (wizardlm2:7b): Critical Analysis
Alternative perspectives and critical thinking. Challenges assumptions and identifies potential issues.
Endpoint: localhost:11434

Phi3 (phi3:3.8b): Historical Context
Historical context and temporal perspectives. Draws on precedents and past patterns to inform current questions.
Endpoint: 192.168.1.100:11434

GPT-OSS (gpt-oss:20b): Strategy & Foresight
Scenario planning, stepwise execution plans, verification & rollback strategies. The strategic planner.
Endpoint: 192.168.1.100:11434

Architecture

The system uses a PHP proxy (ollama_proxy.php) to route requests to Ollama instances across the network:

Request flow: Browser UI → ollama_proxy.php → Ollama instances (localhost:11434 serves Gemma3 and WizardLM; 192.168.1.100:11434 serves Qwen3, Phi3, and GPT-OSS); the collected responses are then sent to consensus.php for Gemma3:27b synthesis.

AI Council architecture: distributed Ollama instances with a PHP proxy and consensus engine.
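
Concretely, the UI posts a small JSON body to the proxy naming the target host, model, and prompt; the proxy forwards it to that host's /api/generate endpoint. A representative request (the prompt text is only illustrative):

JavaScript
// Illustrative request from the browser UI to the proxy.
// The field names match what ollama_proxy.php reads from the JSON body.
fetch('ollama_proxy.php', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    endpoint: '192.168.1.100',   // which Ollama host to route to
    model: 'qwen3:4b',
    prompt: 'Summarize the trade-offs of running LLMs locally.'
  })
});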

Response Modes

The Council supports two response modes, selectable from a dropdown:

Sequential Mode

Models respond one at a time in order (1→2→3→4→5). Each response streams in real-time with a typing cursor. Good for following the "conversation" between council members and seeing how each builds on the previous.

Parallel Mode

All models are queried simultaneously using Promise.all(). Faster overall response time since all models work at once, but responses appear together when all finish.

JavaScript
// Council member configuration
const COUNCIL_MEMBERS = [
  {
    id: 'adi',
    name: 'Gemma3',
    fullName: 'Gemma3:12b',
    endpoint: 'localhost',
    model: 'gemma3:12b',
    role: 'Primary Assistant',
    description: 'Comprehensive analysis and detailed responses',
    avatar: 'adi'
  },
  {
    id: 'tinyllama',
    name: 'Qwen3',
    fullName: 'Qwen3:4b',
    endpoint: '192.168.1.100',
    model: 'qwen3:4b',
    role: 'Quick Insights',
    description: 'Fast, concise responses and key insights',
    avatar: 'tinyllama'
  },
  // ... wizardlm, historian, strategos
];
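
A rough sketch of how parallel mode can fan out over this configuration with Promise.all() and read back the proxy's streamed NDJSON chunks; the queryMember helper and its minimal error handling are simplified stand-ins for the actual UI code:

JavaScript
// Query one council member through the proxy and accumulate its answer.
// The proxy relays Ollama's NDJSON stream, so each line carries a "response" chunk.
async function queryMember(member, question) {
  const res = await fetch('ollama_proxy.php', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ endpoint: member.endpoint, model: member.model, prompt: question })
  });

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let answer = '';

  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const line of decoder.decode(value, { stream: true }).split('\n')) {
      if (!line.trim() || line.includes('[STREAM_END]')) continue;
      try {
        answer += JSON.parse(line).response ?? '';
      } catch {
        // ignore JSON lines split across chunk boundaries
      }
    }
  }
  return answer;
}

// Parallel mode: fan out to all five members at once
async function runParallel(question) {
  return Promise.all(COUNCIL_MEMBERS.map(m => queryMember(m, question)));
}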

Consensus Engine

After all 5 council members respond, the Consensus Engine synthesizes their perspectives into a unified answer. This calls consensus.php, which uses Gemma3:27b, a larger model capable of understanding and reconciling multiple viewpoints.

The consensus is displayed in a special green-bordered panel with streaming output. Once complete, a speaker button appears to play the consensus via TTS.
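
consensus.php itself isn't reproduced here. A minimal sketch of the idea, assuming the UI posts the question plus the five collected responses as JSON; the field names, synthesis prompt wording, and host are illustrative, and the real script streams its output like the proxy rather than returning it in one piece:

PHP
<?php
// consensus.php (sketch): synthesize the council's responses with Gemma3:27b
$input     = json_decode(file_get_contents('php://input'), true);
$question  = $input['question'] ?? '';
$responses = $input['responses'] ?? [];   // e.g. ['Gemma3' => '...', 'Qwen3' => '...']

// Build a single synthesis prompt from all member answers
$prompt = "You are the consensus engine of an AI council.\n";
$prompt .= "Question: {$question}\n\n";
foreach ($responses as $member => $answer) {
    $prompt .= "--- {$member} ---\n{$answer}\n\n";
}
$prompt .= "Reconcile these perspectives into a single, unified answer.";

$payload = json_encode([
    'model'  => 'gemma3:27b',
    'prompt' => $prompt,
    'stream' => false
]);

$ollamaHost = 'localhost';   // host running gemma3:27b; adjust to your setup
$ch = curl_init("http://{$ollamaHost}:11434/api/generate");
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $payload);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = json_decode(curl_exec($ch), true);
curl_close($ch);

header('Content-Type: application/json');
echo json_encode(['consensus' => $result['response'] ?? '']);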

Role Prompts

Each council member receives a role-specific system prompt that shapes their personality:

JavaScript
const rolePrompts = {
  adi: `You are Gemma3, the primary AI assistant. Provide 
        comprehensive, well-structured responses with clear 
        explanations and examples when helpful.`,
  
  tinyllama: `You are Qwen3, focused on providing quick, concise 
              insights. Give clear, direct answers that capture 
              the key points efficiently.`,
  
  wizardlm: `You are WizardLM, the critical analyst. Question 
             assumptions, present alternative viewpoints, and 
             provide constructive criticism.`,
  
  historian: `You are Phi3, specializing in historical context. 
              Relate topics to historical precedents and trends.`,
  
  strategos: `You are GPT-OSS STRATEGOS, the Council's staff 
              planner. Turn problems into clear, executable plans 
              with assumptions, numbered steps, verification checks, 
              rollback paths, and risks. Prefer commands, exact 
              paths, and short rationale.`
};
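
The listing above defines only the prompt text; how it is combined with the user's question before being sent through the proxy isn't shown. One plausible assembly, as a sketch (the buildPrompt helper and separator format are assumptions, not the project's actual code):

JavaScript
// Hypothetical helper: prepend the member's role prompt to the user's question
function buildPrompt(memberId, question) {
  const system = rolePrompts[memberId] ?? '';
  return `${system}\n\nQuestion: ${question}\n\nAnswer:`;
}

// Example: the prompt that would be sent to the proxy for WizardLM
const prompt = buildPrompt('wizardlm', 'Should the home lab migrate to Proxmox?');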

Ollama Proxy

The ollama_proxy.php script handles all communication with the Ollama instances, streaming responses back to the browser via Server-Sent Events:

PHP
<?php
// ollama_proxy.php
// Relays a generate request to the chosen Ollama host and streams the reply back.
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache');
header('X-Accel-Buffering: no');   // keep reverse proxies (e.g. nginx) from buffering the stream

$input = json_decode(file_get_contents('php://input'), true);

$endpoint = $input['endpoint'] ?? 'localhost';
$model    = $input['model'] ?? 'gemma3:12b';
$prompt   = $input['prompt'] ?? '';

$ollamaUrl = "http://{$endpoint}:11434/api/generate";

$payload = json_encode([
    'model'  => $model,
    'prompt' => $prompt,
    'stream' => true
]);

$ch = curl_init($ollamaUrl);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $payload);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
// Forward each chunk to the browser as soon as Ollama produces it
curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $data) {
    echo $data;
    if (ob_get_level() > 0) {
        ob_flush();   // only flush an output buffer if one exists
    }
    flush();
    return strlen($data);
});

curl_exec($ch);

if (curl_errno($ch)) {
    // Surface connection failures to the client instead of ending silently
    echo json_encode(['error' => curl_error($ch)]) . "\n";
}

echo '[STREAM_END]';   // sentinel the UI watches for to stop reading
curl_close($ch);

Setup Guide

1. Install Ollama on Your Servers

Bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull required models
ollama pull gemma3:12b
ollama pull gemma3:27b    # For consensus
ollama pull qwen3:4b
ollama pull wizardlm2:7b
ollama pull phi3:3.8b
ollama pull gpt-oss:20b

# Enable network access
sudo systemctl edit ollama
# Add under [Service]: Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl restart ollama
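
After restarting, confirm that each host is reachable from the machine that will serve the PHP files. Ollama's /api/tags endpoint lists the models installed on a host, which makes it a convenient connectivity check:

Bash
# Verify each Ollama instance is reachable and has the expected models
curl http://localhost:11434/api/tags
curl http://192.168.1.100:11434/api/tags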

2. Deploy PHP Files

Bash
/var/www/ai-council/
├── index.php           # Main UI (Tailwind + vanilla JS)
├── ollama_proxy.php    # Ollama API proxy with streaming
└── consensus.php       # Consensus engine (Gemma3:27b)

3. Update Endpoints

Edit the COUNCIL_MEMBERS array to match your network. Change endpoint values to point to your Ollama servers.
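
For example, if your second Ollama host lived at 192.168.1.50 instead (an illustrative address), only the endpoint field of the affected entries would change:

JavaScript
// In COUNCIL_MEMBERS, point Qwen3 at a different Ollama host (192.168.1.50 is illustrative)
const COUNCIL_MEMBERS = [
  // ...other members unchanged...
  {
    id: 'tinyllama',
    name: 'Qwen3',
    fullName: 'Qwen3:4b',
    endpoint: '192.168.1.50',   // was 192.168.1.100
    model: 'qwen3:4b',
    role: 'Quick Insights',
    description: 'Fast, concise responses and key insights',
    avatar: 'tinyllama'
  }
];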

Resource Requirements
Running multiple LLMs requires significant RAM/VRAM. The 27b consensus model alone needs ~16GB+ VRAM. Consider distributing models across multiple machines or using quantized variants (Q4_K_M) if resources are limited.
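
To see what a given host is actually holding in memory, ollama ps lists each loaded model along with its size and whether it is running on CPU or GPU:

Bash
# Show currently loaded models, their memory footprint, and CPU/GPU placement
ollama ps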