The AI Council

A Multi-Perspective AI Response System built on Ollama. Get diverse insights from 5 specialized AI models running across your home lab, then synthesize them into a unified consensus with text-to-speech output.

Overview

The AI Council is theLAB's multi-perspective AI response system. Instead of relying on a single AI model, it queries 5 specialized models running on Ollama across your local network, then synthesizes their responses into a unified consensus using Gemma3:27b.

Each council member has a distinct role and personality, providing different perspectives on any question. This approach helps overcome single-model biases and produces more well-rounded, thoroughly analyzed answers.

5 AI Models: diverse perspectives from specialized models
Consensus Engine: Gemma3:27b synthesizes all responses
Distributed: models spread across home lab servers
Why Multiple Models?
Single LLMs can hallucinate, miss perspectives, or get stuck in reasoning ruts. By querying multiple models with different training backgrounds and prompting them with distinct roles, we get a more complete picture that's cross-validated by the consensus engine.

Council Members

The AI Council consists of 5 specialized models, each with a unique role. Models are distributed across localhost and 192.168.1.100:

Gemma3 (gemma3:12b): Primary Assistant
Comprehensive analysis and detailed responses. The main workhorse providing thorough, well-structured answers.
Endpoint: localhost:11434

Qwen3 (qwen3:4b): Quick Insights
Fast, concise responses and key insights. Cuts through complexity to deliver the essentials quickly.
Endpoint: 192.168.1.100:11434

WizardLM (wizardlm2:7b): Critical Analysis
Alternative perspectives and critical thinking. Challenges assumptions and identifies potential issues.
Endpoint: localhost:11434

Phi3 (phi3:3.8b): Historical Context
Historical context and temporal perspectives. Draws on precedents and past patterns to inform current questions.
Endpoint: 192.168.1.100:11434

GPT-OSS (gpt-oss:20b): Strategy & Foresight
Scenario planning, stepwise execution plans, verification & rollback strategies. The strategic planner.
Endpoint: 192.168.1.100:11434

Architecture

The system uses a PHP proxy (ollama_proxy.php) to route requests to Ollama instances across the network:

Request flow: Browser UI → ollama_proxy.php → Ollama instances (localhost:11434 serves Gemma3 and WizardLM; 192.168.1.100:11434 serves Qwen3, Phi3, and GPT-OSS); the collected responses are then sent to consensus.php for Gemma3:27b synthesis.

AI Council architecture: distributed Ollama instances with a PHP proxy and consensus engine.
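
Concretely, the UI posts a small JSON body to the proxy naming the target host, model, and prompt; the proxy forwards it to that host's /api/generate endpoint. A representative request (the prompt text is only illustrative):

JavaScript
// Illustrative request from the browser UI to the proxy.
// The field names match what ollama_proxy.php reads from the JSON body.
fetch('ollama_proxy.php', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    endpoint: '192.168.1.100',   // which Ollama host to route to
    model: 'qwen3:4b',
    prompt: 'Summarize the trade-offs of running LLMs locally.'
  })
});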

Response Modes

The Council supports two response modes, selectable from a dropdown:

Sequential Mode

Models respond one at a time in order (1→2→3→4→5). Each response streams in real-time with a typing cursor. Good for following the "conversation" between council members and seeing how each builds on the previous.

Parallel Mode

All models are queried simultaneously using Promise.all(). Faster overall response time since all models work at once, but responses appear together when all finish.

JavaScript
// Council member configuration
const COUNCIL_MEMBERS = [
  {
    id: 'adi',
    name: 'Gemma3',
    fullName: 'Gemma3:12b',
    endpoint: 'localhost',
    model: 'gemma3:12b',
    role: 'Primary Assistant',
    description: 'Comprehensive analysis and detailed responses',
    avatar: 'adi'
  },
  {
    id: 'tinyllama',
    name: 'Qwen3',
    fullName: 'Qwen3:4b',
    endpoint: '192.168.1.100',
    model: 'qwen3:4b',
    role: 'Quick Insights',
    description: 'Fast, concise responses and key insights',
    avatar: 'tinyllama'
  },
  // ... wizardlm, historian, strategos
];
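
A rough sketch of how parallel mode can fan out over this configuration with Promise.all() and read back the proxy's streamed NDJSON chunks; the queryMember helper and its minimal error handling are simplified stand-ins for the actual UI code:

JavaScript
// Query one council member through the proxy and accumulate its answer.
// The proxy relays Ollama's NDJSON stream, so each line carries a "response" chunk.
async function queryMember(member, question) {
  const res = await fetch('ollama_proxy.php', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ endpoint: member.endpoint, model: member.model, prompt: question })
  });

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let answer = '';

  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    for (const line of decoder.decode(value, { stream: true }).split('\n')) {
      if (!line.trim() || line.includes('[STREAM_END]')) continue;
      try {
        answer += JSON.parse(line).response ?? '';
      } catch {
        // ignore JSON lines split across chunk boundaries
      }
    }
  }
  return answer;
}

// Parallel mode: fan out to all five members at once
async function runParallel(question) {
  return Promise.all(COUNCIL_MEMBERS.map(m => queryMember(m, question)));
}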

Consensus Engine

After all 5 council members respond, the Consensus Engine synthesizes their perspectives into a unified answer. This calls consensus.php, which uses Gemma3:27b, a larger model capable of understanding and reconciling multiple viewpoints.

The consensus is displayed in a special green-bordered panel with streaming output. Once complete, a speaker button appears to play the consensus via TTS.
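
consensus.php itself isn't reproduced here. A minimal sketch of the idea, assuming the UI posts the question plus the five collected responses as JSON; the field names, synthesis prompt wording, and host are illustrative, and the real script streams its output like the proxy rather than returning it in one piece:

PHP
<?php
// consensus.php (sketch): synthesize the council's responses with Gemma3:27b
$input     = json_decode(file_get_contents('php://input'), true);
$question  = $input['question'] ?? '';
$responses = $input['responses'] ?? [];   // e.g. ['Gemma3' => '...', 'Qwen3' => '...']

// Build a single synthesis prompt from all member answers
$prompt = "You are the consensus engine of an AI council.\n";
$prompt .= "Question: {$question}\n\n";
foreach ($responses as $member => $answer) {
    $prompt .= "--- {$member} ---\n{$answer}\n\n";
}
$prompt .= "Reconcile these perspectives into a single, unified answer.";

$payload = json_encode([
    'model'  => 'gemma3:27b',
    'prompt' => $prompt,
    'stream' => false
]);

$ollamaHost = 'localhost';   // host running gemma3:27b; adjust to your setup
$ch = curl_init("http://{$ollamaHost}:11434/api/generate");
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $payload);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = json_decode(curl_exec($ch), true);
curl_close($ch);

header('Content-Type: application/json');
echo json_encode(['consensus' => $result['response'] ?? '']);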

Role Prompts

Each council member receives a role-specific system prompt that shapes their personality:

JavaScript
const rolePrompts = {
  adi: `You are Gemma3, the primary AI assistant. Provide 
        comprehensive, well-structured responses with clear 
        explanations and examples when helpful.`,
  
  tinyllama: `You are Qwen3, focused on providing quick, concise 
              insights. Give clear, direct answers that capture 
              the key points efficiently.`,
  
  wizardlm: `You are WizardLM, the critical analyst. Question 
             assumptions, present alternative viewpoints, and 
             provide constructive criticism.`,
  
  historian: `You are Phi3, specializing in historical context. 
              Relate topics to historical precedents and trends.`,
  
  strategos: `You are GPT-OSS STRATEGOS, the Council's staff 
              planner. Turn problems into clear, executable plans 
              with assumptions, numbered steps, verification checks, 
              rollback paths, and risks. Prefer commands, exact 
              paths, and short rationale.`
};
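
The listing above defines only the prompt text; how it is combined with the user's question before being sent through the proxy isn't shown. One plausible assembly, as a sketch (the buildPrompt helper and separator format are assumptions, not the project's actual code):

JavaScript
// Hypothetical helper: prepend the member's role prompt to the user's question
function buildPrompt(memberId, question) {
  const system = rolePrompts[memberId] ?? '';
  return `${system}\n\nQuestion: ${question}\n\nAnswer:`;
}

// Example: the prompt that would be sent to the proxy for WizardLM
const prompt = buildPrompt('wizardlm', 'Should the home lab migrate to Proxmox?');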

Ollama Proxy

The ollama_proxy.php script handles all communication with the Ollama instances, streaming responses back to the browser via Server-Sent Events:

PHP
<?php
// ollama_proxy.php
// Relays a generate request to the chosen Ollama host and streams the reply back.
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache');
header('X-Accel-Buffering: no');   // keep reverse proxies (e.g. nginx) from buffering the stream

$input = json_decode(file_get_contents('php://input'), true);

$endpoint = $input['endpoint'] ?? 'localhost';
$model    = $input['model'] ?? 'gemma3:12b';
$prompt   = $input['prompt'] ?? '';

$ollamaUrl = "http://{$endpoint}:11434/api/generate";

$payload = json_encode([
    'model'  => $model,
    'prompt' => $prompt,
    'stream' => true
]);

$ch = curl_init($ollamaUrl);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $payload);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
// Forward each chunk to the browser as soon as Ollama produces it
curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $data) {
    echo $data;
    if (ob_get_level() > 0) {
        ob_flush();   // only flush an output buffer if one exists
    }
    flush();
    return strlen($data);
});

curl_exec($ch);

if (curl_errno($ch)) {
    // Surface connection failures to the client instead of ending silently
    echo json_encode(['error' => curl_error($ch)]) . "\n";
}

echo '[STREAM_END]';   // sentinel the UI watches for to stop reading
curl_close($ch);

Setup Guide

1. Install Ollama on Your Servers

Bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull required models
ollama pull gemma3:12b
ollama pull gemma3:27b    # For consensus
ollama pull qwen3:4b
ollama pull wizardlm2:7b
ollama pull phi3:3.8b
ollama pull gpt-oss:20b

# Enable network access
sudo systemctl edit ollama
# Add under [Service]: Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl restart ollama
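
After restarting, confirm that each host is reachable from the machine that will serve the PHP files. Ollama's /api/tags endpoint lists the models installed on a host, which makes it a convenient connectivity check:

Bash
# Verify each Ollama instance is reachable and has the expected models
curl http://localhost:11434/api/tags
curl http://192.168.1.100:11434/api/tags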

2. Deploy PHP Files

Bash
/var/www/ai-council/
├── index.php           # Main UI (Tailwind + vanilla JS)
├── ollama_proxy.php    # Ollama API proxy with streaming
└── consensus.php       # Consensus engine (Gemma3:27b)

3. Update Endpoints

Edit the COUNCIL_MEMBERS array to match your network. Change endpoint values to point to your Ollama servers.
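
For example, if your second Ollama host lived at 192.168.1.50 instead (an illustrative address), only the endpoint field of the affected entries would change:

JavaScript
// In COUNCIL_MEMBERS, point Qwen3 at a different Ollama host (192.168.1.50 is illustrative)
const COUNCIL_MEMBERS = [
  // ...other members unchanged...
  {
    id: 'tinyllama',
    name: 'Qwen3',
    fullName: 'Qwen3:4b',
    endpoint: '192.168.1.50',   // was 192.168.1.100
    model: 'qwen3:4b',
    role: 'Quick Insights',
    description: 'Fast, concise responses and key insights',
    avatar: 'tinyllama'
  }
];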

Resource Requirements
Running multiple LLMs requires significant RAM/VRAM. The 27b consensus model alone needs ~16GB+ VRAM. Consider distributing models across multiple machines or using quantized variants (Q4_K_M) if resources are limited.
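
To see what a given host is actually holding in memory, ollama ps lists each loaded model along with its size and whether it is running on CPU or GPU:

Bash
# Show currently loaded models, their memory footprint, and CPU/GPU placement
ollama ps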