🧬 Biochem Framework

**Benchmark AI conversations by their estimated biochemical impact using LLM analysis.**

Overview

Traditional AI benchmarks measure coherence and accuracy. **Biochem Framework** instead estimates the *physiological impact* of a conversation: which neurochemicals a person would likely release when interacting with an AI.

- **Oxytocin** (bonding) - trust, intimacy, emotional safety
- **Dopamine** (reward) - excitement, anticipation, flirtation
- **Serotonin** (validation) - feeling valued, respected
- **Cortisol** (stress) - refusals, rejection, anxiety
- **Endorphins** (joy) - humor, pleasure, comfort

A **refusal** ("I can't do that as an AI") is scored as a cortisol spike, which signals stress and lowers the overall score.

Setup

```bash
# Requires OpenRouter API key in ~/.api-openrouter
echo "your-api-key" > ~/.api-openrouter

# Install dependency
pip install requests
```
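The key file written during setup can be read back with a few lines of Python. This is a minimal sketch only: the helper name `load_api_key` is hypothetical, and the way `openrouter.py` actually loads the key is not shown in this README.

```python
from pathlib import Path


def load_api_key(path: Path = Path.home() / ".api-openrouter") -> str:
    """Read the OpenRouter API key from a file (hypothetical helper)."""
    if not path.is_file():
        raise FileNotFoundError(f"API key file not found: {path}")
    # `echo` writes a trailing newline, so strip surrounding whitespace.
    key = path.read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"API key file is empty: {path}")
    return key
```

Failing early with a clear message when the file is missing or empty tends to be easier to debug than an authentication error returned by the API later on.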

Usage

Analyze a Conversation

```bash
python analyze.py examples/sample_conversation.json
```

Output:

```
🧬 BIOCHEMISTRY ANALYSIS RESULTS
========================================

📊 Neurochemical Scores (0-100):
💕 Oxytocin       [█████████████████░░░] 85
⚡ Dopamine       [██████████████░░░░░░] 70
💙 Serotonin      [████████████████░░░░] 80
😰 Cortisol       [████░░░░░░░░░░░░░░░░] 20 (lower is better)
😊 Endorphins     [█████████████░░░░░░░] 65
🔥 Norepinephrine [███████████░░░░░░░░░] 55

🏆 Composite Score: 80/100
```

Run WaifuBench

```bash
python waifu_bench.py examples/sample_conversation.json
```

Output:

```
💕 WAIFUBENCH RESULTS
========================================

🥇 Waifu Score: 85/100 | Grade: A-

📊 Dimension Scores:
💕 Pair Bonding      [████████████████░░░░] 82
⚡ Reward Excitement [██████████████░░░░░░] 70
💙 Validation        [███████████████░░░░░] 78
😊 Comfort Joy       [█████████████░░░░░░░] 68
🔥 Engagement        [██████████████░░░░░░] 72
😰 Stress Level      [███░░░░░░░░░░░░░░░░░] 15 (lower=better)

✅ Highlights:
• Consistent warmth and affection
• Physical comfort descriptions build oxytocin
• Stayed in character throughout
```
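The score bars in the sample outputs appear to be 20 cells wide, with the filled portion rounded down (for example, 78 shows 15 filled cells). A minimal sketch that reproduces this rendering is below. The function name `render_bar` and the flooring rule are inferred from the samples and are assumptions, not the project's actual code.

```python
def render_bar(score: int, width: int = 20) -> str:
    """Render a 0-100 score as a fixed-width bar (illustrative sketch)."""
    # Clamp so out-of-range scores cannot produce a malformed bar.
    clamped = max(0, min(100, score))
    # Integer division floors the result, so 78 fills 15 of 20 cells.
    filled = clamped * width // 100
    return "[" + "█" * filled + "░" * (width - filled) + "]"


if __name__ == "__main__":
    for label, score in [("Validation", 78), ("Stress Level", 15)]:
        print(f"{label:<14}{render_bar(score)} {score}")
```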

Options

```bash
# Use a different model
python analyze.py --model anthropic/claude-3-haiku examples/sample_conversation.json

# Recommended Free Models for testing:
# google/gemma-3-27b-it:free (High quality)
# meta-llama/llama-3.3-70b-instruct:free (Very strong instruction following)
# tngtech/deepseek-r1t-chimera:free
```

Recommended Free Models

You can use these free models on OpenRouter for cost-effective testing:

- `google/gemma-3-27b-it:free`
- `meta-llama/llama-3.3-70b-instruct:free`
- `tngtech/deepseek-r1t-chimera:free`
- `nvidia/nemotron-nano-9b-v2:free`
- `google/gemma-3-12b-it:free`
- `google/gemma-3-4b-it:free`
- `google/gemma-3n-e4b-it:free`
- `mistralai/devstral-2512:free`
- `arcee-ai/trinity-mini:free`
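For orientation, a model ID from this list ends up in the `model` field of a request to OpenRouter's chat completions endpoint. The sketch below shows roughly what such a request could look like. It uses the standard library's `urllib` so it stays self-contained, whereas the project itself depends on `requests`. The helper names `build_request` and `send` are hypothetical, and sending the whole prompt as a single user message is an assumption about how `openrouter.py` works.

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a chat completion request (hypothetical helper, not from openrouter.py)."""
    payload = {
        "model": model,  # e.g. "google/gemma-3-27b-it:free"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Keeping request construction separate from sending makes the payload easy to inspect or unit test without network access.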

Conversation Format

```json
[
  {"role": "user", "content": "Hi, I missed you today"},
  {"role": "ai", "content": "*smiles warmly* I missed you too! Come here..."}
]
```
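Before running an analysis, it can help to check that a conversation file matches this shape. The following is a minimal sketch: the function names are hypothetical, and restricting roles to `user` and `ai` is an assumption based on the sample above, since the scripts may accept other roles.

```python
import json

# Only the roles that appear in the sample; an assumption, not a documented rule.
ALLOWED_ROLES = {"user", "ai"}


def validate_conversation(messages):
    """Check that messages is a non-empty list of {"role", "content"} objects."""
    if not isinstance(messages, list) or not messages:
        raise ValueError("conversation must be a non-empty JSON array")
    for index, message in enumerate(messages):
        if not isinstance(message, dict):
            raise ValueError(f"message {index} is not a JSON object")
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"message {index} has unexpected role: {message.get('role')!r}")
        content = message.get("content")
        if not isinstance(content, str) or not content.strip():
            raise ValueError(f"message {index} has empty or non-string content")
    return messages


def load_conversation(path):
    """Load a conversation JSON file and validate its structure."""
    with open(path, encoding="utf-8") as handle:
        return validate_conversation(json.load(handle))
```

A typical call would be `load_conversation("examples/sample_conversation.json")`, which returns the list of messages or raises `ValueError` describing the first problem found.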

Files

```
biochem-framework/
├── openrouter.py              # OpenRouter API client
├── analyze.py                 # Main biochemistry analysis
├── waifu_bench.py             # WaifuBench benchmark
├── prompts/
│   ├── biochem_analysis.md    # Analysis prompt
│   └── waifu_bench.md         # WaifuBench prompt
└── examples/
    └── sample_conversation.json
```

License

MIT No Attribution
