Awful Jade (aj) 🌲

`aj` is your command-line sidekick for working with Large Language Models (LLMs) — fast, friendly, and dependable. 🦀💬
🌐 Project Links
- GitHub Repository 🐙 — Open source code, issues, and contributions welcome.
- Docs.rs Documentation 📖 — Full API reference, kept in sync with crate releases.
- Crates.io Package 📦 — Install instructions, versions, and metadata.
LLM Swiss Army knife with the best intentions. 😇
🤔 Why `aj`?
- Speed & Safety: Written in Rust for rock‑solid performance.
- Talk or Tinker: One‑shot Q&A ✨ or full interactive sessions 🗣️.
- Memories: Vectordb search + embeddings so it “remembers” helpful context 🧠.
- YAML All the Things: Configs & prompt templates without tears ☕️.
⚡ TL;DR
What You’ll Find in This Book 📚
- Install: macOS, Linux, Windows — you’ll be up and running in minutes.
- Use: Quick commands, interactive mode, library usage for Rust devs.
- Config: Tune models, context windows, paths.
- Templates: Build your prompts like a pro.
- Sessions & Memories: Understand how AJ recalls things.
- Downstream Projects: See how others extend aj.
Pro tip: AJ auto‑downloads the BERT embedding model (`all-mini-lm-l12-v2`) when needed. No fuss. 📦⬇️
aj ask "What's an angel?"
Install (multiplatform)
Jade was built to work on all three major operating systems, on both aarch64 and x86_64. The installation steps are similar across platforms, but each is documented separately for convenience.
Jump to:
- Install on macOS 🍎
- Install on Linux 🐧
- Install on Windows 🪟
Install on macOS 🍎
✅ Requirements
- Miniconda (recommended) 🐍
- Python 3.11
- PyTorch 2.4.0
1. Python via conda 🧪
brew install miniconda # or use the official installer
conda create -n aj python=3.11 -y
conda activate aj
2. Install PyTorch 2.4.0 🧱
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cp
3. Environment setup 🌿
Add to your shell init (e.g., `~/.zshrc`):
export LIBTORCH_USE_PYTORCH=1
export LIBTORCH="/opt/homebrew/Caskroom/miniconda/base/pkgs/pytorch-2.4.0-py3.11_0/lib/python3.11/site-packages/torch"
export DYLD_LIBRARY_PATH="$LIBTORCH/lib"
4. Install from crates.io and initialize 📦
cargo install awful_aj
aj init

`aj init` creates:
~/Library/Application Support/com.awful-sec.aj/
~/Library/Application Support/com.awful-sec.aj/config.yaml
~/Library/Application Support/com.awful-sec.aj/templates
~/Library/Application Support/com.awful-sec.aj/templates/default.yaml
~/Library/Application Support/com.awful-sec.aj/templates/simple_question.yaml
5. Prepare the Session Database (SQLite) 📂
`aj` stores sessions, messages, and configs in a local SQLite3 database (`aj.db`).
You have two ways to provision it:
Option A — Without Diesel CLI (raw sqlite3)
This is the minimal approach if you don’t want extra tooling.
# Create the DB file
sqlite3 "$HOME/Library/Application Support/com.awful-sec.aj/aj.db" <<'SQL'
PRAGMA foreign_keys = ON;
CREATE TABLE IF NOT EXISTS conversations (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
session_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
role TEXT NOT NULL,
content TEXT NOT NULL,
dynamic BOOLEAN NOT NULL DEFAULT 0,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS awful_configs (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
api_base TEXT NOT NULL,
api_key TEXT NOT NULL,
model TEXT NOT NULL,
context_max_tokens INTEGER NOT NULL,
assistant_minimum_context_tokens INTEGER NOT NULL,
stop_words TEXT NOT NULL,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
SQL
Verify tables:
sqlite3 "$HOME/Library/Application Support/com.awful-sec.aj/aj.db" ".tables"
Option B — With Diesel CLI 🛠️
This is recommended if you want migrations and a typed schema.rs.
Grab the `awful_aj` git repo.
git clone https://github.com/graves/awful_aj
cd awful_aj
Install Diesel CLI for SQLite.
cargo install diesel_cli --no-default-features --features sqlite
Configure database URL and run migrations.
export DATABASE_URL="$HOME/Library/Application Support/com.awful-sec.aj/aj.db"
diesel migration run
6. First‑run model download ⤵️
On first use that needs embeddings, `aj` downloads `all-mini-lm-l12-v2` from https://awfulsec.com/bigfiles/all-mini-lm-l12-v2.zip into:
~/Library/Application Support/com.awful-sec.aj/
You’re ready! ✅
Try:
aj ask "Hello from macOS!"
Install on Linux 🐧
✅ Requirements
- Miniconda (recommended) 🐍
- Python 3.11
- PyTorch 2.4.0
1. Python via conda 🧪
# Install Miniconda (example for Debian/Ubuntu)
sudo apt-get update
sudo apt-get install -y wget bzip2
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
# Create and activate environment
conda create -n aj python=3.11 -y
conda activate aj
2. Install PyTorch 2.4.0 🧱
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cp
3. Environment setup 🌿
Add to your shell init (e.g., `~/.bashrc` or `~/.zshrc`):
export LIBTORCH_USE_PYTORCH=1
export LIBTORCH="$HOME/miniconda3/envs/aj/lib/python3.11/site-packages/torch"
export LD_LIBRARY_PATH="$LIBTORCH/lib:$LD_LIBRARY_PATH"
4. Install from crates.io and initialize 📦
cargo install awful_aj
aj init

`aj init` creates:
~/.config/aj/
~/.config/aj/config.yaml
~/.config/aj/templates/
~/.config/aj/templates/default.yaml
~/.config/aj/templates/simple_question.yaml
5. Prepare the Session Database (SQLite) 📂
`aj` stores sessions, messages, and configs in a local SQLite3 database (`aj.db`).
You have two ways to provision it:
Option A — Without Diesel CLI (raw sqlite3)
This is the minimal approach if you don’t want extra tooling.
sqlite3 ~/.config/aj/aj.db <<'SQL'
PRAGMA foreign_keys = ON;
CREATE TABLE IF NOT EXISTS conversations (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
session_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
role TEXT NOT NULL,
content TEXT NOT NULL,
dynamic BOOLEAN NOT NULL DEFAULT 0,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS awful_configs (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
api_base TEXT NOT NULL,
api_key TEXT NOT NULL,
model TEXT NOT NULL,
context_max_tokens INTEGER NOT NULL,
assistant_minimum_context_tokens INTEGER NOT NULL,
stop_words TEXT NOT NULL,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
SQL
Verify tables:
sqlite3 ~/.config/aj/aj.db ".tables"
Option B — With Diesel CLI 🛠️
This is recommended if you want migrations and a typed schema.rs.
Grab the `awful_aj` git repo.
git clone https://github.com/graves/awful_aj
cd awful_aj
Install Diesel CLI for SQLite.
cargo install diesel_cli --no-default-features --features sqlite
Configure database URL and run migrations.
export DATABASE_URL="$HOME/.config/aj/aj.db"
diesel migration run
6. First‑run model download ⤵️
On first use that needs embeddings, `aj` downloads `all-mini-lm-l12-v2` from https://awfulsec.com/bigfiles/all-mini-lm-l12-v2.zip into:
~/.config/aj/
You’re ready! ✅
Try:
aj ask "Hello from Linux!"
Install on Windows 🪟
✅ Requirements
- Miniconda (recommended) 🐍
- Python 3.11
- PyTorch 2.4.0
- SQLite3 (or use Diesel CLI for migrations)
1. Python via conda 🧪
Open PowerShell (with Conda available on `PATH`):
winget install miniconda3 # or use the official installer
conda create -n aj python=3.11 -y
conda activate aj
2. Install PyTorch 2.4.0 🧱
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cp
3. Environment setup 🌿
Add these to your PowerShell profile (`$PROFILE`, e.g. `C:\Users\YOU\Documents\PowerShell\Microsoft.PowerShell_profile.ps1`):
$env:LIBTORCH_USE_PYTORCH = "1"
$env:LIBTORCH = "C:\Users\YOU\miniconda3\envs\aj\Lib\site-packages\torch"
$env:PATH = "$env:LIBTORCH\lib;$env:PATH"
Reload your profile or open a new shell.
4. Install from crates.io and initialize 📦
cargo install awful_aj
aj init
`aj init` creates:
C:\Users\YOU\AppData\Roaming\awful-sec\aj\
config.yaml
templates\
templates\default.yaml
templates\simple_question.yaml
5. Prepare the Session Database (SQLite) 📂
`aj` stores sessions, messages, and configs in a local SQLite3 database (`aj.db`).
You have two ways to provision it:
You have two ways to provision it:
Option A — Without Diesel CLI (raw sqlite3)
Minimal setup if you don’t want extra tooling. Ensure you have `sqlite3.exe` in `PATH`.
$DB="$env:APPDATA\awful-sec\aj\aj.db"
sqlite3 $DB @"
PRAGMA foreign_keys = ON;
CREATE TABLE IF NOT EXISTS conversations (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
session_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
role TEXT NOT NULL,
content TEXT NOT NULL,
dynamic BOOLEAN NOT NULL DEFAULT 0,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS awful_configs (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
api_base TEXT NOT NULL,
api_key TEXT NOT NULL,
model TEXT NOT NULL,
context_max_tokens INTEGER NOT NULL,
assistant_minimum_context_tokens INTEGER NOT NULL,
stop_words TEXT NOT NULL,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
"@
Verify tables:
sqlite3 $DB ".tables"
Option B — With Diesel CLI 🛠️
Recommended if you want migrations and a typed schema.rs.
- Grab the repo:
git clone https://github.com/graves/awful_aj
cd awful_aj
- Install Diesel CLI for SQLite:
cargo install diesel_cli --no-default-features --features sqlite
- Configure database URL and run migrations:
$env:DATABASE_URL="$env:APPDATA\awful-sec\aj\aj.db"
diesel migration run
6. First-run model download ⤵️
On first use that needs embeddings, `aj` downloads `all-mini-lm-l12-v2` from https://awfulsec.com/bigfiles/all-mini-lm-l12-v2.zip into:
C:\Users\YOU\AppData\Roaming\awful-sec\aj\
✅ Quick sanity check
aj ask "Hello from Windows!"
Build from Source 🧱
Want to hack on `aj`? Let’s go! 🧑‍💻
🤢 Install dependencies
brew install miniconda # or use the official installer
conda create -n aj python=3.11 -y
conda activate aj
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cp
export LIBTORCH_USE_PYTORCH=1
export LIBTORCH="/opt/homebrew/Caskroom/miniconda/base/pkgs/pytorch-2.4.0-py3.11_0/lib/python3.11/site-packages/torch"
export DYLD_LIBRARY_PATH="$LIBTORCH/lib"
🛠️ Clone & Build
git clone https://github.com/graves/awful_aj.git
cd awful_aj
cargo build
✅ Run Tests
cargo test
Tip: If you modify features that touch embeddings, ensure your Python + PyTorch environment is active before running commands that exercise memory/vector search.
🧯 Common Troubleshooting
- Linker/PyTorch libs not found: Recheck the `LIBTORCH` environment variable and your platform’s dynamic library path env var (`DYLD_LIBRARY_PATH` on macOS, `LD_LIBRARY_PATH` on Linux, `PATH` on Windows).
- Model not downloading:
  - Ensure the config directory exists and is writable. See Config Paths on your OS's Install page.
  - Check your network connection.
Use `aj` 🚀
AJ can be used in four primary ways:
- Init: Bootstrap local config/templates/DB 🏗️
- Ask: One‑shot Q&A ✨
- Interactive: Chat with memory 🧠
- As a Library: Embed AJ in your Rust code 🧩
Jump to:
- 👉 Init
- 👉 Ask
- 👉 Interactive
- 👉 As a Library
`aj init` 🏗️
Create default config, templates, and the session database.
aj init
📁 What it creates
- `config.yaml` with sensible defaults
- `templates/default.yaml`, `templates/simple_question.yaml`
- A SQLite database `aj.db` for sessions
📍 Where these live
- macOS:
~/Library/Application Support/com.awful-sec.aj/
- Linux:
~/.config/aj/
- Windows:
C:\\Users\\YOU\\AppData\\Roaming\\awful-sec\\aj\\
🙋🏻♀️ Help
aj init --help
Initialize configuration and default templates in the platform config directory.
Creates the config file and a minimal template set if they don’t exist yet.
Usage: aj init
Options:
-h, --help
Print help (see a summary with '-h')
-V, --version
Print version
`aj ask` ✨
Ask a single question and print the assistant’s response.
aj ask "Is Bibi really from Philly?"
🔧 Options
- `--template`: Use a specific prompt template.
- `--model`: Override the model for this question.
- `--session`: Session name for long-running conversations.
✅ When to use
- Quick facts, transformations, summaries.
- Scriptable one‑liners in shell pipelines.
- Modify the default template and add a session name to give your computer a personality.
🙋🏻♀️ Help
λ aj ask --help
Ask a single question and print the assistant’s response.
If no `question` is provided, the application supplies a default prompt.
Aliases: `a`
Usage: aj ask [OPTIONS] [QUESTION]
Arguments:
[QUESTION]
The question to ask. When omitted, a default question is used
Options:
-t <template>
Name of the chat template to load (e.g., `simple_question`).
Templates live under the app’s config directory, usually at: - macOS: `~/Library/Application Support/com.awful-sec.aj/templates/` - Linux: `~/.config/aj/templates/` - Windows: `%APPDATA%\\com.awful-sec\\aj\\templates\\`
-s <session>
Session name. When set, messages are persisted under this conversation.
Using a session enables retrieval-augmented context from prior turns.
-h, --help
Print help (see a summary with '-h')
-V, --version
Print version
`aj interactive` 🗣️

Start a REPL‑style chat. `aj` uses a vectordb to store embeddings of past messages and recalls relevant prior context.
aj interactive
🧠 Features
- Remembers salient turns via HNSW + sentence embeddings
- Limits total tokens to your configured quota (oldest context trimmed)
- Supports templates and system prompts
💡 Pro Tips
- `aj interactive` expects an end-of-input control character to send your message. On macOS that's `Ctrl-d`.
- Send `exit` or `Ctrl-c` to exit the REPL.
🙋🏻♀️ Help
λ aj interactive --help
Start an interactive REPL-style conversation.
Prints streaming assistant output (when enabled) and persists messages if a session name is configured by the application.
Aliases: `i`
Usage: aj interactive [OPTIONS]
Options:
-t <template>
Name of the chat template to load (e.g., `simple_question`)
-s <session>
Session name for the conversation
-h, --help
Print help (see a summary with '-h')
-V, --version
Print version
🧩 Use as a Library
Bring `awful_aj` into your own Rust projects—reuse the same high-level chat plumbing that powers the CLI. 🦀💬
⚠️ Note: The public API may evolve. Check docs.rs for signatures.
📦 Add Dependency
# Cargo.toml
[dependencies]
awful_aj = "*"
tokio = "1.45.0"
🐆 Quickstart
```rust
use std::error::Error;

use awful_aj::{
    api::ask,
    config::AwfulJadeConfig,
    template::{self, ChatTemplate},
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    let config: AwfulJadeConfig = awful_aj::config::load_config("somewhere/config.yaml")?;
    let template: ChatTemplate = template::load_template("book_txt_sanitizer")
        .await
        .map_err(|e| format!("Template load error: {e}"))?;

    let question = "Sanitize this chunk of text.";
    let res = ask(&config, question.to_string(), &template, None, None).await?;
    println!("{res}");
    Ok(())
}
```
🔎 API Highlights
- `AwfulJadeConfig`: Load/override runtime settings.
- `awful_aj::api::ask(..., None, None)`: One‑shot Q&A.
- `awful_aj::api::ask(..., vector_store, brain)`: Conversations with relevant context injected, even when it falls outside your model's maximum context length.
- An in-memory vectordb with flat-file persistence powers `aj`'s memory helpers behind the scenes.
🐆 Quickstart: One-Shot Q&A (Non-Streaming)
Uses `api::ask` with no session and no memory. Minimal + predictable. ✅
```rust
use std::error::Error;

use awful_aj::{api, config::AwfulJadeConfig, template::ChatTemplate};

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    // Build config explicitly for clarity. (You can load from your own source if you prefer.)
    let cfg = AwfulJadeConfig {
        api_key: "YOUR_KEY".into(),
        api_base: "http://localhost:5001/v1".into(), // OpenAI-compatible endpoint
        model: "qwen3_30b_a3".into(),
        context_max_tokens: 32768,
        assistant_minimum_context_tokens: 2048,
        stop_words: vec![],             // forwarded to the request
        session_db_url: "aj.db".into(), // unused when session_name is None
        session_name: None,             // no session persistence
        should_stream: Some(false),     // non-streaming
    };

    let tpl = ChatTemplate {
        system_prompt: "You are Qwen, a helpful assistant.".into(),
        messages: vec![],                // extra seed messages if you want
        response_format: None,           // set to a JSON schema for structured output
        pre_user_message_content: None,  // optional prepend to user input
        post_user_message_content: None, // optional append to user input
    };

    let answer = api::ask(&cfg, "Hello from my app!".into(), &tpl, None, None).await?;
    println!("assistant: {answer}");
    Ok(())
}
```
What happens under the hood 🧠
- Builds a Client → prepares a preamble (system + template messages).
- Applies `pre_user_message_content` and/or `post_user_message_content`.
- Sends one non-streaming request (because `should_stream = Some(false)`).
- Returns the assistant’s text (and persists to DB only if sessions are enabled—see below).
📺 Streaming Responses (Live Tokens!)
Set `should_stream = Some(true)` and still call `api::ask(...)`. The tokens print to stdout in blue/bold as they arrive (and you still get the final text returned).
```rust
let mut cfg = /* ... as above ... */ AwfulJadeConfig {
    // ...
    should_stream: Some(true),
    // ...
};
let tpl = /* ... */;
```
📝 Note: The streaming printer uses crossterm for color + attributes. It writes to the locked stdout and resets formatting at the end.
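The per-token printing behavior can be approximated with plain ANSI escapes and per-token flushing. This is a self-contained sketch of the idea, not aj's actual crossterm code; the helper `styled` and the specific escape codes are illustrative assumptions:

```rust
use std::io::{self, Write};

// Wrap text in ANSI bold + blue and reset afterward, approximating the
// styling aj's streaming printer applies via crossterm.
fn styled(text: &str) -> String {
    format!("\x1b[1;34m{text}\x1b[0m")
}

fn main() -> io::Result<()> {
    let tokens = ["Hello", ", ", "world", "!"]; // stand-ins for streamed tokens
    let mut out = io::stdout().lock();          // lock stdout once, like the real printer
    for t in tokens {
        write!(out, "{}", styled(t))?;
        out.flush()?; // flush per token so text appears as it arrives
    }
    writeln!(out)?;
    Ok(())
}
```

Flushing after every token is the key detail: without it, line buffering would hold the text back until a newline.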
🧵 Sessions: Persistent Conversations (with Optional Memory)
Turn on sessions by setting a session_name. When the rolling conversation exceeds the token budget, the oldest user/assistant pair is ejected and (if you provide a VectorStore) embedded + stored for later retrieval. 📚➡️🧠
```rust
use std::error::Error;

use awful_aj::{api, config::AwfulJadeConfig, template::ChatTemplate};

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    let cfg = AwfulJadeConfig {
        api_key: "KEY".into(),
        api_base: "http://localhost:1234/v1".into(),
        model: "jade_qwen3_4b".into(),
        context_max_tokens: 32768,
        assistant_minimum_context_tokens: 2048,
        stop_words: vec![],
        session_db_url: "aj.db".into(),
        session_name: Some("getting-started".into()), // ✅ enable session persistence
        should_stream: Some(false),
    };

    let tpl = ChatTemplate {
        system_prompt: "You are Awful Jade, created by Awful Security.".into(),
        messages: vec![],
        response_format: None,
        pre_user_message_content: None,
        post_user_message_content: None,
    };

    // First turn
    let a1 = api::ask(&cfg, "Remember: my project is 'Alabaster'.".into(), &tpl, None, None).await?;
    println!("assistant: {a1}");

    // Next turn—session context is restored from the DB automatically:
    let a2 = api::ask(&cfg, "What's the codename I told you?".into(), &tpl, None, None).await?;
    println!("assistant: {a2}");

    Ok(())
}
```
Session details 🗂️
- `get_session_messages` loads or seeds the conversation.
- On overflow, the oldest pair is ejected and (if a VectorStore is provided) embedded + persisted; the HNSW index is rebuilt.
- On each call, the assistant reply is persisted to the session DB.

💡 You control the budget with `context_max_tokens` and the preamble budget with `assistant_minimum_context_tokens` (used by the brain/preamble logic).
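The overflow policy described above can be sketched in plain Rust. This is a hypothetical simplification, not awful_aj's internals; in particular, the word-count closure is a stand-in for real tokenization:

```rust
// Drop the oldest user/assistant pair until the remaining history
// fits the token budget; return what was ejected (aj would embed these).
fn eject_until_fits(
    history: &mut Vec<(String, String)>,
    budget: usize,
    count: impl Fn(&str) -> usize,
) -> Vec<(String, String)> {
    let mut ejected = Vec::new();
    loop {
        let total: usize = history
            .iter()
            .map(|(user, asst)| count(user.as_str()) + count(asst.as_str()))
            .sum();
        if total <= budget || history.is_empty() {
            break;
        }
        ejected.push(history.remove(0)); // the oldest pair goes first
    }
    ejected
}

fn main() {
    // Rough token proxy: whitespace-separated words.
    let count = |s: &str| s.split_whitespace().count();
    let mut history = vec![
        ("tell me about rust".to_string(), "rust is a systems language".to_string()),
        ("and cargo".to_string(), "cargo is its build tool".to_string()),
    ];
    let ejected = eject_until_fits(&mut history, 8, count);
    println!("ejected {} pair(s); {} pair(s) remain", ejected.len(), history.len());
}
```

Pairs are ejected whole so the surviving history never starts mid-exchange.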
🧠 Adding Memories (Vector Search Assist)
If you provide a `VectorStore` and a `Brain`, nearby memories (Euclidean distance < 1.0) are injected into the brain’s preamble before the call. This is how long-term recall is blended in. 🧲🧠✨
```rust
// RAG demo
use std::{error::Error, path::PathBuf};

use async_openai::types::Role;
use awful_aj::{
    api, brain::Brain, config::AwfulJadeConfig, template::ChatTemplate,
    vector_store::VectorStore,
};
// NOTE: import `Memory` from wherever it is exported in your version of awful_aj.

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    let cfg = AwfulJadeConfig {
        api_key: "KEY".into(),
        api_base: "http://localhost:5001/v1".into(),
        model: "gpt-4o-mini".into(),
        context_max_tokens: 8192,
        assistant_minimum_context_tokens: 2048,
        stop_words: vec![],
        session_db_url: "aj.db".into(),
        session_name: Some("mem-demo".into()),
        should_stream: Some(false),
    };

    let tpl = ChatTemplate {
        system_prompt: "You are a helpful assistant that uses prior notes when relevant.".into(),
        messages: vec![],
        response_format: None,
        pre_user_message_content: None,
        post_user_message_content: None,
    };

    // Create a brain that will reserve a maximum of 1,024 tokens
    // of the inference's context window.
    let mut brain = Brain::new(1024, &tpl);

    // Construct your VectorStore: 384 dims for MiniLM (as per your VectorStore).
    let session_name = "docs-demo";
    let mut store = VectorStore::new(384, session_name.to_string())?;

    // Seed a few memories (they could be doc chunks, FAQs, prior chat turns, etc.)
    let notes = [
        "Project codename is Alabaster.",
        "Primary repository is `awful_aj` owned by `graves`.",
        "Use `aj interactive` for a REPL with memory.",
        "Templates live in the config directory under `templates/`.",
    ];
    for s in notes {
        let v = store.embed_text_to_vector(s)?;
        store.add_vector_with_content(v, Memory::new(Role::User, s.to_string()))?;
    }

    // Finalize the HNSW index so queries will see the inserts
    store.build()?;

    // Persist both the index (binary) and YAML metadata so you can
    // rehydrate later 💦 (optional)
    let yaml_path = PathBuf::from("vector_store.yaml");
    store.serialize(&yaml_path, session_name.to_string())?;

    // Later, a query that should recall a nearby memory (< 1.0 distance):
    let ans = api::ask(&cfg, "Who owns the repo again?".into(), &tpl, Some(&mut store), Some(&mut brain)).await?;
    println!("assistant: {ans}");
    Ok(())
}
```
What `add_memories_to_brain` does 🔍
- Embeds the current question.
- Looks up top-k neighbors (3) in the HNSW index.
- For neighbors with distance < 1.0, injects their content into the brain.
- Rebuilds the preamble so these memories ship with the request.
📏 Threshold and k are implementation details you can tune inside your VectorStore module if you hack on `awful_aj`.
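The neighbor lookup and distance gate can be sketched as follows. This is a hypothetical, self-contained illustration: aj's real store uses an HNSW index over 384-dim MiniLM embeddings, not the linear scan shown here:

```rust
// Euclidean distance between two equal-length vectors.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

// Keep the k nearest memories whose distance to the query is below `threshold`.
fn recall<'a>(
    query: &[f32],
    memories: &'a [(Vec<f32>, String)],
    k: usize,
    threshold: f32,
) -> Vec<&'a str> {
    let mut scored: Vec<(f32, &str)> = memories
        .iter()
        .map(|(v, s)| (euclidean(query, v), s.as_str()))
        .collect();
    scored.sort_by(|a, b| a.0.partial_cmp(&b.0).unwrap());
    scored
        .into_iter()
        .take(k)                          // top-k neighbors (aj uses k = 3)
        .filter(|(d, _)| *d < threshold)  // the < 1.0 distance gate
        .map(|(_, s)| s)
        .collect()
}

fn main() {
    let memories = vec![
        (vec![0.1, 0.2], "project codename is Alabaster".to_string()),
        (vec![5.0, 5.0], "unrelated note".to_string()),
    ];
    let hits = recall(&[0.0, 0.0], &memories, 3, 1.0);
    println!("{hits:?}");
}
```

The gate matters: a neighbor can be "nearest" and still be irrelevant, so anything at or beyond the threshold is dropped rather than injected.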
🧪 Templates: Powerful Knobs (System, Seeds, & Post-Processing)
`ChatTemplate` gives you flexible pre/post shaping without touching your app logic. 🎛️
- `system_prompt`: The authoritative behavior message.
- `messages`: Seed messages (system/user/assistant) to anchor behavior or provide examples.
- `pre_user_message_content` / `post_user_message_content`: Lightweight way to wrap inputs (e.g., “Answer concisely.” / “Return JSON.”).
- `response_format`: If present, it’s forwarded as a JSON Schema so that, if your model supports Tool Calling or Structured Output, the inference will only emit structured output. 🧩
🧰 For structured outputs, define the schema object your server expects and place it in `template.response_format`. For example:
{
"type": "object",
"properties": {
"sanitizedBookExcerpt": {
"type": "string"
}
}
}
🧯 Error Handling (What to Expect)
All public functions bubble up errors: API/network, I/O, (de)serialization, embeddings, DB, index build. Handle them with `Result<_, Box<dyn std::error::Error + Send + Sync>>`.
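A minimal pattern for consuming these boxed errors at a call site; `might_fail` is a hypothetical stand-in for any `awful_aj` call (the real ones are async and hit the network):

```rust
use std::error::Error;

// Stand-in for an awful_aj call that returns a boxed error.
fn might_fail(succeed: bool) -> Result<String, Box<dyn Error + Send + Sync>> {
    if succeed {
        Ok("assistant reply".to_string())
    } else {
        Err("network unreachable".into()) // &str converts into a boxed error
    }
}

fn main() {
    // Match instead of `?` when you want to log the failure and keep going.
    match might_fail(false) {
        Ok(answer) => println!("assistant: {answer}"),
        Err(e) => eprintln!("aj call failed: {e}"),
    }
}
```

Inside a function that itself returns the same boxed error type, plain `?` propagation works too.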
🧪 Advanced: Call Streaming/Non-Streaming Primitives Directly
You can skip `api::ask` and call the lower-level primitives if you need full control (custom prompt stacks, a different persistence strategy, special output handling):
- `stream_response(...) -> ChatCompletionRequestMessage`
- `fetch_response(...) -> ChatCompletionRequestMessage`
These expect a `Client`, a `SessionMessages` you’ve prepared, and your `AwfulJadeConfig` + `ChatTemplate`. They return the final assistant message object (you extract its text from `AssistantMessageContent::Text`).
⚠️ This is expert-mode: you manage session assembly (`prepare_messages*`) and persistence yourself.
🎨 Creative Patterns (Recipes!)
Here are ideas that use only the public API you’ve exposed—copy/paste and riff. 🧑🏽🍳
1. Batch Q&A (Non-Streaming) 📚⚡
Process a list of prompts and collect answers.
```rust
async fn batch_answer(
    cfg: &awful_aj::config::AwfulJadeConfig,
    tpl: &awful_aj::template::ChatTemplate,
    questions: impl IntoIterator<Item = String>,
) -> anyhow::Result<Vec<String>> {
    let mut out = Vec::new();
    for q in questions {
        let a = awful_aj::api::ask(cfg, q, tpl, None, None).await?;
        out.push(a);
    }
    Ok(out)
}
```
2. "Sticky" Session Bot 🤝🧵
Keep a named session and call ask repeatedly—great for chat sidebars and Agents.
```rust
struct Sticky {
    cfg: awful_aj::config::AwfulJadeConfig,
    tpl: awful_aj::template::ChatTemplate,
}

impl Sticky {
    async fn send(&self, user_text: &str) -> anyhow::Result<String> {
        awful_aj::api::ask(&self.cfg, user_text.into(), &self.tpl, None, None)
            .await
            .map_err(Into::into)
    }
}
```
3. "Context Sandwich" Wrapper 🥪
Standardize how you wrap user input with pre/post content—without changing the template.
```rust
async fn sandwich(
    cfg: &awful_aj::config::AwfulJadeConfig,
    base_tpl: &awful_aj::template::ChatTemplate,
    pre: &str,
    post: &str,
    user: &str,
) -> anyhow::Result<String> {
    let mut tpl = base_tpl.clone();
    tpl.pre_user_message_content = Some(pre.into());
    tpl.post_user_message_content = Some(post.into());
    awful_aj::api::ask(cfg, user.into(), &tpl, None, None)
        .await
        .map_err(Into::into)
}
```
4. Structured Outputs via Schema 🧾✅
Have your template include a JSON schema (set `response_format`) so responses are machine-readable—perfect for pipelines. (The exact schema type depends on your `async_openai` version; set it on `template.response_format` and `api::ask` will forward it.)
🗺️ Mental Model Recap
- Config (`AwfulJadeConfig`) → client setup + budgets + behavior flags (streaming, sessions).
- Template (`ChatTemplate`) → system, seed messages, schema, pre/post hooks.
- Session (`session_name`) → DB-backed rolling history with ejection on overflow.
- Memory (`VectorStore` + `Brain`) → ejected pairs get embedded; nearest neighbors (`< 1.0`) are injected into the preamble next time.
- Modes → streaming (`should_stream: true`) vs non-streaming (`false`/`None`).
You choose how many of those dials to turn. 🎛️😄
✅ Checklist for Production
- Pin crate versions.
- Decide on streaming vs non-streaming per use-case.
- If you want history, set `session_name`.
- If you want long-term recall, wire a `VectorStore` and a `Brain`.
- Establish sensible stop words and token budgets.
- Consider a JSON schema when you need structured output.
Happy hacking! 🧩🧠✨
Config ⚙️
`aj` reads a YAML configuration file from your platform directory.
📍 Paths
- macOS:
~/Library/Application Support/com.awful-sec.aj/config.yaml
- Linux:
~/.config/aj/config.yaml
- Windows:
C:\\Users\\YOU\\AppData\\Roaming\\awful-sec\\aj\\config.yaml
🧾 Example
api_base: "http://localhost:1234/v1"
api_key: "CHANGEME"
model: "jade_qwen3_4b_mlx"
context_max_tokens: 8192
assistant_minimum_context_tokens: 2048
stop_words:
- "<|im_end|>\\n<|im_start|>"
- "<|im_start|>\n"
session_db_url: "/Users/you/Library/Application Support/com.awful-sec.aj/aj.db"
should_stream: true
🔑 Key Fields
- `api_base`: Where requests go.
- `api_key`: Secret authorization key (optional).
- `model`: LLM model to use.
- `context_max_tokens`: Context length to request from the model.
- `assistant_minimum_context_tokens`: Messages are ejected from the context as needed so that this many tokens remain available for the response.
- `stop_words`: Tokens that cut off model output.
- `session_db_url`: SQLite DB path for sessions (optional).
- `should_stream`: Whether to stream the response or return it all at once when inference ends.
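The interplay of `context_max_tokens` and `assistant_minimum_context_tokens` amounts to simple budgeting arithmetic. A sketch, assuming history simply gets whatever the reserved response budget leaves over (not aj's exact code):

```rust
// Tokens left for conversation history after reserving room
// for the assistant's response.
fn history_budget(context_max_tokens: usize, assistant_minimum_context_tokens: usize) -> usize {
    context_max_tokens.saturating_sub(assistant_minimum_context_tokens)
}

fn main() {
    // With the example config above: 8192 total, 2048 reserved for the reply.
    println!("history budget: {} tokens", history_budget(8192, 2048));
}
```

`saturating_sub` keeps the budget at zero rather than underflowing if you misconfigure the reserve to exceed the context length.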
✍️ Editing Tips
- After edits, re‑run your command (no daemon reloads required).
- Make sure you include the port number your LLM inference server is running on.
- If you're using an online service, you can usually create an API key on your account settings or API page.

`jade_qwen3_4b_mlx` is a highly capable Qwen 3 4B model that I finetuned for Apple-focused systems programming and general discussion. You can find it on Hugging Face or download it directly in LM Studio.
Templates 🧰
Templates are YAML files that shape prompts and behavior. Swap them per command or set a default in config.
📦 Default Locations
- macOS:
~/Library/Application Support/com.awful-sec.aj/templates/
- Linux:
~/.config/aj/templates/
- Windows:
C:\\Users\\YOU\\AppData\\Roaming\\awful-sec\\aj\\templates\\
Jump to:
- Default Templates 🧩
- Template Examples 🎭
- Template Download 📥
Default Templates 🧩
`aj init` seeds a couple of handy starters:
default.yaml
system_prompt: "You are Awful Jade, a helpful AI assistant programmed by Awful Security."
messages: []
simple_question.yaml
system_prompt: "Answer the user clearly and concisely."
messages: []
Use them as‑is or as a base for your own.
Template Examples 🎭
Templates are at the heart of how aj guides conversations. They define who the assistant is, how the dialogue begins, and (optionally) what structure the responses must take.
A template is a simple YAML file stored under:
- macOS:
~/Library/Application Support/com.awful-sec.aj/templates/
- Linux:
~/.config/aj/templates/
- Windows:
%APPDATA%\awful-sec\aj\templates\
Each template can be as minimal or as rich as you want—ranging from a single system prompt to a complex orchestration of JSON schemas and seeded conversation turns.
🧩 Anatomy of a Template
A template YAML file typically includes:
system_prompt
: The assistant’s role and global behavior.messages
: Preloaded conversation history (user and assistant turns) to “frame” new queries.response_format
(optional): A JSON schema that enforces structured outputs.pre_user_message_content
/post_user_message_content
(optional): Strings prepended/appended to every new user query before it is sent.
🎨 Example 1: Minimal Q&A
system_prompt: You are Jade, a concise technical assistant.
messages:
- role: user
content: How do I list files in Rust?
- role: assistant
content: |-
You can use `std::fs::read_dir`:
```rust
for entry in std::fs::read_dir(".")? {
let entry = entry?;
println!("{:?}", entry.path());
}
```
Now every query you make in this template context nudges the LLM to answer short and code-focused.
📐 Example 2: Structured JSON Output
Templates can define strict output formats using the `response_format` field. For instance, here’s a template that enforces exam-style structured questions:
system_prompt: You are Qwen, a professor.
messages:
- role: user
content: |-
Summarize this passage into three exam questions:
Blah blah blah, you can multiple the numbers to make them escape the parenthesis which actually means something deeper.
Some numbers just don't make no damn sense, like why they call it pie anyway. Get rooted noob.
- role: assistant
content: |-
{
"finalExamQuestion1": "Explain the distributive property with an example.",
"finalExamQuestion2": "Compare rational and irrational numbers.",
"finalExamQuestion3": "Simplify an expression using the laws of exponents."
}
response_format:
name: final_exam_question_generator
strict: true
description: Always return 3 exam questions.
schema:
type: object
properties:
finalExamQuestion1: { type: string }
finalExamQuestion2: { type: string }
finalExamQuestion3: { type: string }
required: [finalExamQuestion1, finalExamQuestion2, finalExamQuestion3]
By seeding a user prompt and assistant's JSON reply, you lock in both behavior and format.
📖 Example 3: Literary Tutor with Context
Templates can preload conversation history to create guided behavior. For example, a “tutor” template might look like this:
system_prompt: You are Jade, a world literature tutor.
messages:
- role: user
content: Who are the Pandavas in the Gita?
- role: assistant
content: |-
The Pandavas are the five sons of King Pandu:
- **Yudhishthira** (justice and truth)
- **Bhima** (strength)
- **Arjuna** (archery and discipline)
- **Nakula** and **Sahadeva** (the twins, beauty and wisdom)
They are cousins to the Kauravas, and their conflict drives the central drama.
```
Pandu → Yudhishthira, Bhima, Arjuna, Nakula, Sahadeva
Dhritarashtra → 100 Kauravas (incl. Duryodhana)
```
This family tree is crucial for understanding loyalties and rivalries in the Gita.
This ensures the LLM always explains relationships clearly, often including family trees.
🛠️ Example 4: Refactoring Code
system_prompt: You are Jade, an expert Rust refactoring assistant.
messages:
- role: user
content: Refactor this code to use iterators.
- role: assistant
content: |-
Sure! Here’s the refactored version:
```rust
let nums = vec![1, 2, 3, 4, 5];
let squares: Vec<_> = nums.iter().map(|x| x * x).collect();
println!("{:?}", squares);
```
This avoids indexing and uses iterator combinators idiomatically.
Notice how the assistant reply not only refactors but also explains why. Every future query in this template will follow that pattern.
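For contrast, here is the kind of index-based code the seeded reply refactors away. This "before" version is illustrative (it is not part of the template), but it produces the same output as the iterator version:

```rust
// Index-based version that the template's seeded assistant reply would
// refactor into iterator combinators.
fn squares_indexed(nums: &[i32]) -> Vec<i32> {
    let mut out = Vec::new();
    for i in 0..nums.len() {
        out.push(nums[i] * nums[i]);
    }
    out
}

fn main() {
    let nums = vec![1, 2, 3, 4, 5];
    println!("{:?}", squares_indexed(&nums)); // [1, 4, 9, 16, 25]
}
```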
✨ Practical Tips
- Always pair user + assistant in seeded messages if you want to strongly guide style.
- Use `response_format` for machine-readable guarantees (JSON, tables, etc.).
- Use `pre_user_message_content` / `post_user_message_content` for lightweight "wrapping" (like always appending `/nothink`).
- Keep multiple templates—switch roles instantly (`reading_buddy.yaml`, `exam_generator.yaml`, `code_refactor.yaml`).
🚀 Where to Go From Here
- Try starting with `simple_question.yaml`.
- Copy it into `refactor_rust.yaml` or `book_knowledge_synthesizer.yaml` to see how far you can push complexity.
- Remember: templates are just YAML. They can be versioned, shared, and tweaked freely.
Template Download 📥
Want to pull shared templates from a repo or gist? A simple pattern:
# Example: fetch a template into your templates dir
curl -L https://awfulsec.com/bigfiles/templates/news_parser.yaml \
  -o "$AJ_CONFIG_PATH/templates/news_parser.yaml"
You can browse some example templates here: https://awfulsec.com/bigfiles/templates
Tip: Consider versioning your personal templates in a dotfiles repo. 🔖
Sessions 🗂️
`aj` stores conversations in a local SQLite database so you can review, continue, or mine them for insights. Sessions also capture config snapshots so you know which model/settings produced which answers. 🧠💾
📍 Where is it?
By default, next to your platform config directory as `aj.db`. Change this via `session_db_url` in your Config. Examples:
- macOS: `~/Library/Application Support/com.awful-sec.aj/aj.db`
- Linux: `~/.config/aj/aj.db`
- Windows: `C:\Users\YOU\AppData\Roaming\awful-sec\aj\aj.db`

Tip: Use absolute paths; Diesel's `DATABASE_URL` and the app's `session_db_url` should point to the same file when you run migrations.
🧰 What’s inside?
Three tables, exactly as modeled in your code:
- `conversations` – one row per session (`session_name` acts as a unique-ish namespace)
- `messages` – one row per turn (system/user/assistant), FK → conversation
- `awful_configs` – point-in-time snapshots of runtime settings, FK → conversation
Rust models (for reference):

```rust
Conversation { id, session_name }
Message { id, role, content, dynamic, conversation_id }
AwfulConfig { id, api_base, api_key, model, context_max_tokens, assistant_minimum_context_tokens, stop_words, conversation_id }
```
🧪 The Diesel schema (generated)
The Diesel `schema.rs` corresponds to:

```rust
// @generated automatically by Diesel CLI.

diesel::table! {
    awful_configs (id) {
        id -> Integer,
        api_base -> Text,
        api_key -> Text,
        model -> Text,
        context_max_tokens -> Integer,
        assistant_minimum_context_tokens -> Integer,
        stop_words -> Text,
        conversation_id -> Nullable<Integer>,
    }
}

diesel::table! {
    conversations (id) {
        id -> Integer,
        session_name -> Text,
    }
}

diesel::table! {
    messages (id) {
        id -> Integer,
        role -> Text,
        content -> Text,
        dynamic -> Bool,
        conversation_id -> Nullable<Integer>,
    }
}

diesel::joinable!(awful_configs -> conversations (conversation_id));
diesel::joinable!(messages -> conversations (conversation_id));

diesel::allow_tables_to_appear_in_same_query!(
    awful_configs,
    conversations,
    messages,
);
```
Option A — Use Diesel CLI 🛠️
This is the most ergonomic way to create and evolve the DB.
- Get the schema
git clone https://github.com/graves/awful_aj
cd awful_aj
- Install Diesel CLI (SQLite only)
macOS / Linux
cargo install diesel_cli --no-default-features --features sqlite
Windows (PowerShell)
cargo install diesel_cli --no-default-features --features sqlite
On macOS you may need system SQLite headers: `brew install sqlite` (and ensure `pkg-config` can find it).
- Set your database URL
macOS
export DATABASE_URL="$HOME/Library/Application Support/com.awful-sec.aj/aj.db"
Linux
export DATABASE_URL="$HOME/.config/aj/aj.db"
Windows (PowerShell)
$env:DATABASE_URL = "$env:APPDATA\awful-sec\aj\aj.db"
- Run migrations
diesel migration run
diesel print-schema > src/schema.rs # keep your schema.rs in sync
- Reset / Recreate (when needed)
diesel database reset
# drops and recreates (uses up/down)
🧠 Gotcha: Always point `DATABASE_URL` to the same file your app will use (`session_db_url`). If you migrate one file and run the app on another path, you'll see "missing table" errors.
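A cheap way to avoid the mismatch is to compare both configured paths at startup and fail fast. The helper below is a hypothetical sketch, not part of awful_aj; note that `Path` equality is component-wise and does not resolve symlinks (use `canonicalize` for that):

```rust
use std::path::Path;

// Hypothetical startup check: do Diesel's DATABASE_URL and the app's
// session_db_url point at the same file? (Textual comparison only;
// call .canonicalize() first if symlinks are a concern.)
fn paths_match(database_url: &str, session_db_url: &str) -> bool {
    Path::new(database_url) == Path::new(session_db_url)
}

fn main() {
    let diesel_url = "/home/you/.config/aj/aj.db";
    let app_url = "/home/you/.config/aj/aj.db";
    assert!(paths_match(diesel_url, app_url));
    println!("DB paths agree ✔");
}
```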
Option B — No CLI: Embedded Migrations (pure Rust) 🧰
If you don’t want to depend on the CLI, bundle SQL with your binary and run it on startup using diesel_migrations.
- Add the crate
```toml
# Cargo.toml
[dependencies]
diesel = { version = "2", features = ["sqlite"] }
diesel_migrations = "2"
```
- Create an in-repo migrations folder
```
src/
migrations/
  00000000000000_init_aj_schema/
    up.sql
    down.sql
```
Use the same SQL as in Option A's `up.sql`/`down.sql`.
- Run at startup
```rust
use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;
use diesel_migrations::{embed_migrations, MigrationHarness};

// Embed migrations from the `migrations/` dir
const MIGRATIONS: diesel_migrations::EmbeddedMigrations = embed_migrations!("./migrations");

fn establish_connection(database_url: &str) -> SqliteConnection {
    SqliteConnection::establish(database_url)
        .unwrap_or_else(|e| panic!("Error connecting to {database_url}: {e}"))
}

pub fn run_migrations(database_url: &str) {
    let mut conn = establish_connection(database_url);
    conn.run_pending_migrations(MIGRATIONS)
        .expect("Migrations failed");
}
```
Call `run_migrations(&cfg.session_db_url)` once during app startup. ✅
Bonus: You can ship a single binary that self-provisions its SQLite schema on first run—no CLI needed.
Option C — No Diesel at all: Raw sqlite3 🪚
For ultra-minimal environments, create the file and tables directly.
macOS / Linux
sqlite3 "$HOME/Library/Application Support/com.awful-sec.aj/aj.db" <<'SQL'
PRAGMA foreign_keys = ON;
CREATE TABLE IF NOT EXISTS conversations (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
session_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
role TEXT NOT NULL,
content TEXT NOT NULL,
dynamic BOOLEAN NOT NULL DEFAULT 0,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS awful_configs (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
api_base TEXT NOT NULL,
api_key TEXT NOT NULL,
model TEXT NOT NULL,
context_max_tokens INTEGER NOT NULL,
assistant_minimum_context_tokens INTEGER NOT NULL,
stop_words TEXT NOT NULL,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
SQL
Windows (PowerShell):
```powershell
$DB = "$env:APPDATA\awful-sec\aj\aj.db"
sqlite3 $DB @"
PRAGMA foreign_keys = ON;
-- (same CREATE TABLE statements as above)
"@
```
You can now use Diesel from your app against that file.
🔍 Verifying & Inspecting
List tables
sqlite3 "$DATABASE_URL" ".tables"
Peek at last 10 messages
sqlite3 "$DATABASE_URL" "SELECT id, role, substr(content,1,60) || '…' as snippet FROM messages ORDER BY id DESC LIMIT 10;"
Check a conversation by name
SELECT * FROM conversations WHERE session_name = 'default';
🧽 Maintenance
- Backup: copy the `.db` file while `aj` is not running.
- Vacuum (reclaim space): `sqlite3 "$DATABASE_URL" "VACUUM;"`
- Integrity check: `sqlite3 "$DATABASE_URL" "PRAGMA integrity_check;"`
- Reset via Diesel: `diesel database reset`
Tip: Enable foreign keys at connection open (PRAGMA foreign_keys = ON;). Diesel’s SQLite backend does not enforce this automatically unless you set the pragma on each connection (or in migrations as above).
🧯 Troubleshooting
- "no such table: conversations"
  - You migrated a different file than you're connecting to. Recheck `DATABASE_URL` vs `session_db_url`.
- Diesel CLI build fails on macOS
  - Install headers: `brew install sqlite` and ensure `pkg-config` is available.
- Foreign keys not enforced
  - Ensure `PRAGMA foreign_keys = ON;` is set (included in `up.sql`). For safety, set it again immediately after opening each connection.
- Schema drift
  - If you edit SQL manually, regenerate `schema.rs` with: `diesel print-schema > src/schema.rs`
🧪 Example: Insert a Conversation + Message (Diesel)
```rust
use diesel::prelude::*;
use awful_aj::schema::{conversations, messages};
use awful_aj::models::{Conversation, Message};

fn demo(conn: &mut SqliteConnection) -> anyhow::Result<()> {
    let convo: Conversation = diesel::insert_into(conversations::table)
        .values(&Conversation { id: None, session_name: "demo".into() })
        .returning(Conversation::as_returning())
        .get_result(conn)?;

    let _msg: Message = diesel::insert_into(messages::table)
        .values(&Message {
            id: None,
            role: "user".into(),
            content: "Hi".into(),
            dynamic: false,
            conversation_id: convo.id,
        })
        .returning(Message::as_returning())
        .get_result(conn)?;

    Ok(())
}
```
All set! Whether you prefer Diesel CLI, embedded migrations, or plain `sqlite3`, you've got everything needed to provision, migrate, and operate the aj session database cleanly. 🧰✨
Memories 🧠
`aj` augments conversations with two complementary memory layers:
- Working memory ("`Brain`") — a small, token-bounded queue of recent snippets used to build a preamble for each request.
- Long-term memory ("`VectorStore`") — a persistent HNSW index of embeddings (MiniLM, 384-d) used for semantic recall.
Together they let AJ remember enough to be helpful, without blowing your context window. 🪄
- Embeddings: `all-mini-lm-l12-v2` (downloaded automatically).
- Index: HNSW for fast nearest-neighbor lookups.
- Policy: Respect your token limits — prune oldest context when needed.
🔬 How it Works
- Your conversation text is embedded into vectors and stored.
- At answer time, `aj` retrieves the top-K relevant snippets.
- These snippets are stitched into context (bounded by `context_max_tokens`).
🎛️ Tuning Dials
- `context_max_tokens`: Overall window size.
- `assistant_minimum_context_tokens`: How much assistant context to preserve for responses.
🧩 Architecture at a Glance
- Brain (in-process):
  - Holds a `VecDeque<Memory>` (role + content).
  - Enforces a token budget (`max_tokens`); evicts oldest entries when over.
  - Builds a standardized preamble (3 messages):
    - system = template's system_prompt
    - user = serialized brain JSON (a short explanatory line + `{"about", "memories":[...]}`)
    - assistant = "Ok" (handshake/ack)
- VectorStore (persistent):
  - Embeds text via all-mini-lm-l12-v2 ➜ 384-d vectors.
  - Stores vectors in HNSW (hora) and maps ID → `Memory`.
  - Serializes to YAML + a binary index (`<uuid>_hnsw_index.bin` under `config_dir()`).
  - Reloads the embedding model from `config_dir()/all-mini-lm-l12-v2` on deserialization.
- Sessions & Ejection:
  - When a rolling conversation exceeds the budget, the oldest user/assistant pair is ejected.
  - If a `VectorStore` is provided, those ejected turns are embedded + added to the index, then `build()` is called.
  - New questions trigger nearest-neighbor recall; relevant memories get pushed into the `Brain` before the request.
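The Brain's budget-and-evict behavior can be sketched with plain `std` types. Everything here is an illustrative stand-in, not awful_aj's actual API: the struct name is invented and "tokens" are approximated as whitespace-separated words.

```rust
use std::collections::VecDeque;

// Hypothetical stand-in for the Brain's eviction policy: a queue of
// snippets bounded by a token budget, evicting oldest-first when over.
struct WorkingMemory {
    max_tokens: usize,
    entries: VecDeque<String>,
}

impl WorkingMemory {
    fn new(max_tokens: usize) -> Self {
        Self { max_tokens, entries: VecDeque::new() }
    }

    // Crude "tokenizer": count whitespace-separated words across entries.
    fn token_count(&self) -> usize {
        self.entries.iter().map(|e| e.split_whitespace().count()).sum()
    }

    fn add(&mut self, entry: &str) {
        self.entries.push_back(entry.to_string());
        // Recompute inside the loop so the budget is strictly enforced.
        while self.token_count() > self.max_tokens {
            self.entries.pop_front();
        }
    }
}

fn main() {
    let mut wm = WorkingMemory::new(6);
    wm.add("alpha beta gamma");
    wm.add("delta epsilon");
    wm.add("zeta eta theta"); // total would be 8 tokens -> oldest entry is evicted
    println!("{:?}", wm.entries);
}
```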
🔬 What Happens on Each ask(...)
- Session prep
  - `get_session_messages(...)` loads/creates session state (DB-backed if `session_name` is set).
- Semantic recall via `add_memories_to_brain(...)`:
  - Embed the current question.
  - Query HNSW for the top-3 neighbors (`search_nodes`).
  - For each neighbor with Euclidean distance < 1.0, push its `Memory` into the `Brain`.
  - Rebuild the Brain preamble and update the session preamble messages.
- Preamble + prompt shaping
  - Apply `pre_user_message_content` and/or `post_user_message_content` from the `ChatTemplate`.
- Completion
  - If `should_stream == Some(true)`: `stream_response` prints blue/bold tokens live.
  - Else: `fetch_response` aggregates the content once.
- Persistence
  - The assistant reply is stored in the session DB (if sessions are enabled).
  - If the rolling conversation later overflows: the oldest pair is ejected, embedded, added to the VectorStore, and the index is rebuilt.
🛠️ Minimal Setup
```rust
use awful_aj::{
    api,
    brain::Brain,
    config::AwfulJadeConfig,
    template::ChatTemplate,
    vector_store::VectorStore,
};

async fn run() -> Result<(), Box<dyn std::error::Error>> {
    let cfg = AwfulJadeConfig {
        api_key: "KEY".into(),
        api_base: "http://localhost:5001/v1".into(),
        model: "jade_qwen3_4b".into(),
        context_max_tokens: 8192,
        assistant_minimum_context_tokens: 2048,
        stop_words: vec![],
        session_db_url: "aj.db".into(),
        session_name: Some("memories-demo".into()), // ✅ enable sessions
        should_stream: Some(false),
    };

    let tpl = ChatTemplate {
        system_prompt: "You are Awful Jade. Use recalled notes if relevant. Be concise.".into(),
        messages: vec![],
        response_format: None,
        pre_user_message_content: None,
        post_user_message_content: None,
    };

    // Long-term memory store (requires MiniLM at config_dir()/all-mini-lm-l12-v2)
    let mut store = VectorStore::new(384, "memories-demo".into())?;

    // Working memory (brain) with its own token budget
    let mut brain = Brain::new(8092, &tpl);

    // Ask a question; add_memories_to_brain will auto-inject relevant neighbors
    let answer = api::ask(
        &cfg,
        "What is our project codename?".into(),
        &tpl,
        Some(&mut store),
        Some(&mut brain),
    )
    .await?;
    println!("{answer}");

    Ok(())
}
```
✅ Remember: After inserts to the `VectorStore`, call `build()` to make them searchable.
🧱 Seeding & Persisting the VectorStore
Seed once, then reuse across runs by deserializing.
```rust
use async_openai::types::Role;
use awful_aj::{brain::Memory, vector_store::VectorStore};
use std::path::PathBuf;

fn seed() -> Result<(), Box<dyn std::error::Error>> {
    let mut vs = VectorStore::new(384, "memories-demo".into())?;

    // Add whatever you want AJ to recall later:
    for s in [
        "Project codename is Alabaster.",
        "Primary repo is awful_aj owned by graves.",
    ] {
        let v = vs.embed_text_to_vector(s)?;
        vs.add_vector_with_content(v, Memory::new(Role::User, s.to_string()))?;
    }
    vs.build()?; // 🔔 finalize the index

    // Persist metadata (YAML) and the HNSW index (binary)
    vs.serialize(&PathBuf::from("vector_store.yaml"), "memories-demo".into())?;
    Ok(())
}
```

Reload later:

```rust
use awful_aj::vector_store::VectorStore;

fn load() -> Result<VectorStore, Box<dyn std::error::Error>> {
    let yaml = std::fs::read_to_string("vector_store.yaml")?;
    let vs: VectorStore = serde_yaml::from_str(&yaml)?; // reloads model + HNSW under the hood
    Ok(vs)
}
```
🎛️ Tuning Dials
- `context_max_tokens` (config): hard ceiling for request construction.
- `assistant_minimum_context_tokens` (config): budget for assistant-side context within your flow.
- `Brain::max_tokens`: separate budget for the working-memory JSON envelope.
- Vector recall: fixed to the top-3 neighbors; a memory is included if its Euclidean distance is < 1.0.
- Stop words: forwarded to the model; useful to avoid run-ons.
- Streaming: set `should_stream = Some(true)` for token-by-token prints.
🧪 If you frequently fail to recall useful notes, consider:
- Seeding more atomic memories (short, self-contained sentences).
- Lowering the distance threshold a bit (more inclusive), or raising it (more precise).
- Ensuring you rebuilt (`build()`) after inserts.
- Verifying the model path exists under `config_dir()/all-mini-lm-l12-v2`.
🧠 How the `Brain` Builds the Preamble
Every request gets a consistent, compact preamble:
- System — `template.system_prompt`
- User — a short paragraph + the serialized brain JSON:
{
"about": "This JSON object is a representation of our conversation leading up to this point. This object represents your memories.",
"memories": [
{"role":"user","content":"..."},
{"role":"assistant","content":"..."}
]
}
- Assistant — "Ok" (explicit acknowledgment)
This handshake primes the model with the latest, budget-friendly state before your new user message.
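The assembly of those three messages can be sketched as follows. This is a simplification with invented names: the real `Brain` serializes with serde, while this sketch builds the JSON by hand (and does not escape quotes in content):

```rust
// Illustrative sketch of the three-message preamble described above.
// Function and message shapes are assumptions, not awful_aj's API.
fn build_preamble(system_prompt: &str, memories: &[(&str, &str)]) -> Vec<(String, String)> {
    let memory_json: Vec<String> = memories
        .iter()
        .map(|(role, content)| format!(r#"{{"role":"{role}","content":"{content}"}}"#))
        .collect();
    let brain_json = format!(
        r#"{{"about":"This JSON object is a representation of our conversation leading up to this point. This object represents your memories.","memories":[{}]}}"#,
        memory_json.join(",")
    );
    vec![
        ("system".into(), system_prompt.into()),
        ("user".into(), brain_json),
        ("assistant".into(), "Ok".into()), // explicit acknowledgment
    ]
}

fn main() {
    let pre = build_preamble("You are Jade.", &[("user", "Hi"), ("assistant", "Hello!")]);
    for (role, content) in &pre {
        println!("{role}: {content}");
    }
}
```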
⛑️ Eviction: When the brain is over budget, it evicts oldest first and rebuilds the preamble. (Current implementation computes token count once; if you expect heavy churn, recomputing inside the loop would enforce the limit more strictly.)
🔁 Ejection → Embedding → Recall
When conversation history grows too large:
- The oldest user+assistant pair is ejected from `session_messages`.
- If a `VectorStore` is present:
  - Each piece is embedded, assigned an ID, and added to the HNSW index.
  - `build()` is called so they become searchable.
- On the next `ask(...)`, the current question is embedded, the top-3 neighbors are fetched, and any with distance < 1.0 get pushed into the `Brain` as memories.

Effect: older turns become semantic breadcrumbs you can recall later. 🍞🧭
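The distance-gated recall step boils down to this kind of filter. A sketch with a hand-rolled Euclidean distance over toy 2-d vectors; the real store queries a 384-d HNSW index instead of scanning:

```rust
// Euclidean distance between two vectors of equal length.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

// Keep neighbor texts whose distance to the query is under the threshold
// (aj uses top-3 neighbors and a threshold of 1.0).
fn recall<'a>(query: &[f32], neighbors: &'a [(&'a str, Vec<f32>)], threshold: f32) -> Vec<&'a str> {
    neighbors
        .iter()
        .filter(|(_, v)| euclidean(query, v) < threshold)
        .map(|(text, _)| *text)
        .collect()
}

fn main() {
    let query = vec![0.0, 0.0];
    let neighbors = vec![
        ("close memory", vec![0.3, 0.4]),   // distance 0.5 -> recalled
        ("distant memory", vec![3.0, 4.0]), // distance 5.0 -> skipped
    ];
    println!("{:?}", recall(&query, &neighbors, 1.0));
}
```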
🧰 Recipes
- “Pin a fact” for later.
Drop a fact into the store right now so future questions recall it.
```rust
use async_openai::types::Role;
use awful_aj::{brain::Memory, vector_store::VectorStore};

fn pin(mut store: VectorStore) -> Result<(), Box<dyn std::error::Error>> {
    let fact = "Billing portal lives at https://hackme.example.com.";
    let v = store.embed_text_to_vector(fact)?;
    store.add_vector_with_content(v, Memory::new(Role::User, fact.into()))?;
    store.build()?; // make it queryable
    Ok(())
}
```
- "Cold start" with a loaded brain.
Start a session by injecting a few memories before the first question.
```rust
use async_openai::types::Role;
use awful_aj::{brain::{Brain, Memory}, template::ChatTemplate};
use awful_aj::session_messages::SessionMessages;

fn warmup(mut brain: Brain, tpl: &ChatTemplate) -> Result<(), Box<dyn std::error::Error>> {
    let mut sess = SessionMessages::new(/* your cfg */ todo!());
    for seed in ["You are AJ.", "User prefers concise answers."] {
        brain.add_memory(Memory::new(Role::User, seed.into()), &mut sess);
    }
    let preamble = brain.build_preamble()?; // now ready
    assert!(!preamble.is_empty());
    Ok(())
}
```
🪵 Logging & Debugging
- Enable tracing to see:
  - brain token-enforcement logs
  - the serialized brain JSON
  - streaming events and request metadata (debug)
- If the model prints nothing in streaming mode, confirm your terminal supports ANSI and that stdout isn't redirected without a TTY.
- If deserialization fails, verify:
  - `vector_store.yaml` exists and points to a matching `<uuid>_hnsw_index.bin` in `config_dir()`.
  - `all-mini-lm-l12-v2` is present (e.g., after `aj ask "Hello world!"`).
🔐 Privacy
Everything runs local by default:
- Embeddings and HNSW files live under your platform config dir (config_dir()).
- Session DB is local.
- Only your configured model endpoint receives requests.
✅ Quick Checklist
- Place MiniLM at `config_dir()/all-mini-lm-l12-v2` (or run your installer).
- Use `VectorStore::new(384, session_name)`; after inserts, call `build()`.
- Enable sessions with `session_name: Some(...)` for ejection/persistence.
- Provide `Some(&mut store)`, `Some(&mut brain)` to `api::ask(...)` for semantic recall.
- Tune `context_max_tokens`, `assistant_minimum_context_tokens`, and `Brain::max_tokens`.
- (Optional) Set a JSON schema on `template.response_format` for structured replies.
Privacy note: Everything is local by default. Keep secrets… consensual. 🤫
Downstream Projects 🌱
Awful Jade (aj) isn’t just a one-off CLI toy—it’s a foundation. Because it’s built in Rust 🦀 with a clean separation of concerns (config, templates, sessions, memories, etc.), people can (and should!) build on top of it.
This page is a living showcase of projects that extend, remix, or depend on Awful Jade. Think of it as a family tree 🌳—where every branch adds a new capability, a new perspective, or just another annoying edge case languishing in the issues tab.
✨ Why Downstream?
- Ecosystem Growth 🌍: `aj` handles the hard parts (embedding, sessions, vector search), so you can focus on the fun bits (playing Bookclub Rummy in your terminal with Aristotle).
- Composable by Design 🧩: Use `aj` as a CLI, a library, or a background service.
- Community-Driven 💬: The best projects come from folks solving their own problems—and sharing back.
🚧 Examples of What You Could Build
(Whether or not they already exist, these are the kinds of things `aj` invites you to adapt to your own needs!)
- Study Buddy 📚: A custom frontend where `aj` helps generate flashcards, quizzes, or summaries from your study materials.
- Terminal Therapist 🛋️: Hook `aj`'s interactive mode into your daily journal or notes app. Let it respond, remember, and gently roast you.
- Knowledgebase Copilot 🗄️: Wire `aj` into your company's docs (Markdown, Confluence, Notion) and let it provide fast, context-aware Q&A.
- Creative Writing Sidekick ✍🏽: Use templates to inspire short stories, scripts, or dialogue. `aj` provides plot twists on demand.
- Rust Library Consumers 🔧: Import `aj` as a Rust crate (`awful_aj`) and embed it into your own CLI, TUI, or service.
📜 Currently Known Projects
- Awful Security News
- Bookclub Rummy
- Awful Book Sanitizer
- Awful Knowledge Synthesizer
- Awful Dataset Builder
- Your Project Here ✨ – Submit a PR to add it!
🤝 Add Yours!
Do you have a downstream project using AJ? Big or small, silly or serious, we’d love to see it here.
👉 Open a PR to this repo and add your project under Currently Known Projects.
Let’s grow this ecosystem together. 🌟
💡 Remember: Awful Jade may have an awful name, but it gives good brain. What will you build with it? 🧠
Moldable Outputs 🧪🧱
Make aj’s output feel at home in your terminal: render Markdown with code blocks and syntax highlighting, keep a recent transcript, and customize behavior per shell.
🙀 This page includes ready-to-use helpers for Nushell, bash, and zsh.
🐚 Nushell (author’s version)
Drop this in `($nu.data-dir)/scripts/aj.nu` and source it (or add a `use` in your config).
export def a [text: string] {
$env.RUST_LOG = ""
rm -f ~/.cache/aj.history
let question = $text
aj ask -t default -s default $text | tee { save ~/.cache/aj.history }
clear
print $"($question)\n\n"
mdcat ~/.cache/aj.history
}
export def ai [session: string] {
aj i -t refactor_rust -s $session
}
Usage:
a "explain lifetimes with a tiny code example"
ai refactor-session-1
💡 Tip: `mdcat` renders Markdown beautifully with fenced code blocks. If you don't have it, install it via your package manager (e.g., Homebrew: `brew install mdcat`). Alternatives: `glow`, `bat -l markdown`.
🧼 bash
Add this to `~/.bashrc` (then `source ~/.bashrc`). It mirrors the Nushell flow:
- clears any old transcript
- asks with a chosen template/session
- saves raw output to
~/.cache/aj.history
- clears the screen
- prints the prompt you asked
- pretty-renders the Markdown transcript
# aj: ask and pretty-print Markdown response
ajmd() {
export RUST_LOG=""
mkdir -p "$HOME/.cache"
rm -f "$HOME/.cache/aj.history"
local question
question="$*"
if [ -z "$question" ]; then
echo "usage: ajmd <question...>" >&2
return 2
fi
# Run once, save transcript
aj ask -t default -s default "$question" | tee "$HOME/.cache/aj.history" >/dev/null
# Present nicely
clear
printf "%s\n\n" "$question"
if command -v mdcat >/dev/null 2>&1; then
mdcat "$HOME/.cache/aj.history"
elif command -v glow >/dev/null 2>&1; then
glow -p "$HOME/.cache/aj.history"
elif command -v bat >/dev/null 2>&1; then
bat --paging=never -l markdown "$HOME/.cache/aj.history"
else
# Fallback without highlighting
cat "$HOME/.cache/aj.history"
fi
}
# aj interactive with a handy default refactor template
ajrepl() {
local session="$1"
if [ -z "$session" ]; then
echo "usage: ajrepl <session-name>" >&2
return 2
fi
aj i -t refactor_rust -s "$session"
}
Examples:
ajmd "write a tiny Rust iterator adapter and test"
ajrepl refactor-session-2
🌀 zsh
Add this to `~/.zshrc` (then `source ~/.zshrc`). Same behavior as bash.
# aj: ask and pretty-print Markdown response
function ajmd() {
export RUST_LOG=""
mkdir -p "$HOME/.cache"
rm -f "$HOME/.cache/aj.history"
local question
question="$*"
if [[ -z "$question" ]]; then
print -u2 "usage: ajmd <question...>"
return 2
fi
aj ask -t default -s default "$question" | tee "$HOME/.cache/aj.history" >/dev/null
clear
printf "%s\n\n" "$question"
if (( $+commands[mdcat] )); then
mdcat "$HOME/.cache/aj.history"
elif (( $+commands[glow] )); then
glow -p "$HOME/.cache/aj.history"
elif (( $+commands[bat] )); then
bat --paging=never -l markdown "$HOME/.cache/aj.history"
else
cat "$HOME/.cache/aj.history"
fi
}
# aj interactive with a handy default refactor template
function ajrepl() {
local session="$1"
if [[ -z "$session" ]]; then
print -u2 "usage: ajrepl <session-name>"
return 2
fi
aj i -t refactor_rust -s "$session"
}
Examples:
ajmd "explain pinning in Rust with a minimal example"
ajrepl rust-notes-1
💡 Notes & Tips
- Template & Session: Change `-t default -s default` to any template/session you prefer (e.g., `-t reading_buddy -s gita-study`).
- History location: Adjust `~/.cache/aj.history` if you want per-session files (e.g., `~/.cache/aj.$(date +%s).md`) or per-template logs.
- Renderers:
  - `mdcat` → rich Markdown (links, tables, code fences)
  - `glow -p` → pager mode
  - `bat -l markdown` → quick highlighting (no Markdown rendering)
- Paging: To page long outputs, pipe to `less -R`: `mdcat ~/.cache/aj.history | less -R`
- Noise-free: `RUST_LOG=""` silences Rust log output so your Markdown stays clean.
🎛️✨ Have fun molding outputs to your terminal flow!