Awful Jade (aj) 🌲

`aj` is your command-line sidekick for working with Large Language Models (LLMs) — fast, friendly, and dependable. 🦀💬
🌐 Project Links
- GitHub Repository 🐙 — Open source code, issues, and contributions welcome.
- Docs.rs Documentation 📖 — Full API reference, kept in sync with crate releases.
- Crates.io Package 📦 — Install instructions, versions, and metadata.
LLM Swiss Army knife with the best intentions. 😇
🤔 Why `aj`?
- Speed & Safety: Written in Rust for rock‑solid performance.
- Talk or Tinker: One‑shot Q&A ✨ or full interactive sessions 🗣️.
- Memories: Vectordb search + embeddings so it “remembers” helpful context 🧠.
- YAML All the Things: Configs & prompt templates without tears ☕️.
⚡ TL;DR
What You’ll Find in This Book 📚
- Install: macOS, Linux, Windows — you’ll be up and running in minutes.
- Use: Quick commands, interactive mode, library usage for Rust devs.
- Config: Tune models, context windows, paths.
- Templates: Build your prompts like a pro.
- Sessions & Memories: Understand how AJ recalls things.
- Downstream Projects: See how others extend aj.
Pro tip: AJ auto‑downloads the BERT embedding model (`all-mini-lm-l12-v2`) when needed. No fuss. 📦⬇️
aj ask "What's an angel?"
Install (multiplatform)
Jade was built to work on all three major operating systems, on both aarch64 and x86_64. The installation steps are similar across platforms, but each is documented separately for convenience.
Jump to:
- Install on macOS 🍎
- Install on Linux 🐧
- Install on Windows 🪟
Install on macOS 🍎
✅ Requirements
- Miniconda (recommended) 🐍
- Python 3.11
- PyTorch 2.4.0
1. Python via conda 🧪
brew install miniconda # or use the official installer
conda create -n aj python=3.11 -y
conda activate aj
2. Install PyTorch 2.4.0 🧱
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cp
3. Environment setup 🌿
Add to your shell init (e.g., `~/.zshrc`):
export LIBTORCH_USE_PYTORCH=1
export LIBTORCH="/opt/homebrew/Caskroom/miniconda/base/pkgs/pytorch-2.4.0-py3.11_0/lib/python3.11/site-packages/torch"
export DYLD_LIBRARY_PATH="$LIBTORCH/lib"
4. Install from crates.io and initialize 📦
cargo install awful_aj
aj init

`aj init` creates:
~/Library/Application Support/com.awful-sec.aj/
~/Library/Application Support/com.awful-sec.aj/config.yaml
~/Library/Application Support/com.awful-sec.aj/templates
~/Library/Application Support/com.awful-sec.aj/templates/default.yaml
~/Library/Application Support/com.awful-sec.aj/templates/simple_question.yaml
5. Prepare the Session Database (SQLite) 📂
`aj` stores sessions, messages, and configs in a local SQLite3 database (`aj.db`).
You have two ways to provision it:
Option A — Without Diesel CLI (raw sqlite3)
This is the minimal approach if you don’t want extra tooling.
# Create the DB file
sqlite3 "$HOME/Library/Application Support/com.awful-sec.aj/aj.db" <<'SQL'
PRAGMA foreign_keys = ON;
CREATE TABLE IF NOT EXISTS conversations (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
session_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
role TEXT NOT NULL,
content TEXT NOT NULL,
dynamic BOOLEAN NOT NULL DEFAULT 0,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS awful_configs (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
api_base TEXT NOT NULL,
api_key TEXT NOT NULL,
model TEXT NOT NULL,
context_max_tokens INTEGER NOT NULL,
assistant_minimum_context_tokens INTEGER NOT NULL,
stop_words TEXT NOT NULL,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
SQL
Verify tables:
sqlite3 "$HOME/Library/Application Support/com.awful-sec.aj/aj.db" ".tables"
Option B — With Diesel CLI 🛠️
This is recommended if you want migrations and a typed schema.rs.
Grab the `awful_aj` git repo.
git clone https://github.com/graves/awful_aj
cd awful_aj
Install Diesel CLI for SQLite.
cargo install diesel_cli --no-default-features --features sqlite
Configure database URL and run migrations.
export DATABASE_URL="$HOME/Library/Application Support/com.awful-sec.aj/aj.db"
diesel migration run
6. First‑run model download ⤵️
On first use that needs embeddings, `aj` downloads `all-mini-lm-l12-v2` from https://awfulsec.com/bigfiles/all-mini-lm-l12-v2.zip into:
~/Library/Application Support/com.awful-sec.aj/
You’re ready! ✅
Try:
aj ask "Hello from macOS!"
Install on Linux 🐧
✅ Requirements
- Miniconda (recommended) 🐍
- Python 3.11
- PyTorch 2.4.0
1. Python via conda 🧪
# Install Miniconda (example for Debian/Ubuntu)
sudo apt-get update
sudo apt-get install -y wget bzip2
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
# Create and activate environment
conda create -n aj python=3.11 -y
conda activate aj
2. Install PyTorch 2.4.0 🧱
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cp
3. Environment setup 🌿
Add to your shell init (e.g., `~/.bashrc` or `~/.zshrc`):
export LIBTORCH_USE_PYTORCH=1
export LIBTORCH="$HOME/miniconda3/envs/aj/lib/python3.11/site-packages/torch"
export LD_LIBRARY_PATH="$LIBTORCH/lib:$LD_LIBRARY_PATH"
4. Install from crates.io and initialize 📦
cargo install awful_aj
aj init

`aj init` creates:
~/.config/aj/
~/.config/aj/config.yaml
~/.config/aj/templates/
~/.config/aj/templates/default.yaml
~/.config/aj/templates/simple_question.yaml
5. Prepare the Session Database (SQLite) 📂
`aj` stores sessions, messages, and configs in a local SQLite3 database (`aj.db`).
You have two ways to provision it:
Option A — Without Diesel CLI (raw sqlite3)
This is the minimal approach if you don’t want extra tooling.
sqlite3 ~/.config/aj/aj.db <<'SQL'
PRAGMA foreign_keys = ON;
CREATE TABLE IF NOT EXISTS conversations (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
session_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
role TEXT NOT NULL,
content TEXT NOT NULL,
dynamic BOOLEAN NOT NULL DEFAULT 0,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS awful_configs (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
api_base TEXT NOT NULL,
api_key TEXT NOT NULL,
model TEXT NOT NULL,
context_max_tokens INTEGER NOT NULL,
assistant_minimum_context_tokens INTEGER NOT NULL,
stop_words TEXT NOT NULL,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
SQL
Verify tables:
sqlite3 ~/.config/aj/aj.db ".tables"
Option B — With Diesel CLI 🛠️
This is recommended if you want migrations and a typed schema.rs.
Grab the `awful_aj` git repo.
git clone https://github.com/graves/awful_aj
cd awful_aj
Install Diesel CLI for SQLite.
cargo install diesel_cli --no-default-features --features sqlite
Configure database URL and run migrations.
export DATABASE_URL="$HOME/.config/aj/aj.db"
diesel migration run
6. First‑run model download ⤵️
On first use that needs embeddings, `aj` downloads `all-mini-lm-l12-v2` from https://awfulsec.com/bigfiles/all-mini-lm-l12-v2.zip into:
~/.config/aj/
You’re ready! ✅
Try:
aj ask "Hello from Linux!"
Install on Windows 🪟
✅ Requirements
- Miniconda (recommended) 🐍
- Python 3.11
- PyTorch 2.4.0
- SQLite3 (or use Diesel CLI for migrations)
1. Python via conda 🧪
Open PowerShell (with Conda available on `PATH`):
winget install miniconda3 # or use the official installer
conda create -n aj python=3.11 -y
conda activate aj
2. Install PyTorch 2.4.0 🧱
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cp
3. Environment setup 🌿
Add these to your PowerShell profile (`$PROFILE`, e.g. `C:\Users\YOU\Documents\PowerShell\Microsoft.PowerShell_profile.ps1`):
$env:LIBTORCH_USE_PYTORCH = "1"
$env:LIBTORCH = "C:\Users\YOU\miniconda3\envs\aj\Lib\site-packages\torch"
$env:PATH = "$env:LIBTORCH\lib;$env:PATH"
Reload your profile or open a new shell.
4. Install from crates.io and initialize 📦
cargo install awful_aj
aj init
`aj init` creates:
C:\Users\YOU\AppData\Roaming\awful-sec\aj\
config.yaml
templates\
templates\default.yaml
templates\simple_question.yaml
5. Prepare the Session Database (SQLite) 📂
`aj` stores sessions, messages, and configs in a local SQLite3 database (`aj.db`).
You have two ways to provision it:
You have two ways to provision it:
Option A — Without Diesel CLI (raw sqlite3)
Minimal setup if you don’t want extra tooling. Ensure you have `sqlite3.exe` in `PATH`.
$DB="$env:APPDATA\awful-sec\aj\aj.db"
sqlite3 $DB @"
PRAGMA foreign_keys = ON;
CREATE TABLE IF NOT EXISTS conversations (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
session_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
role TEXT NOT NULL,
content TEXT NOT NULL,
dynamic BOOLEAN NOT NULL DEFAULT 0,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS awful_configs (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
api_base TEXT NOT NULL,
api_key TEXT NOT NULL,
model TEXT NOT NULL,
context_max_tokens INTEGER NOT NULL,
assistant_minimum_context_tokens INTEGER NOT NULL,
stop_words TEXT NOT NULL,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
"@
Verify tables:
sqlite3 $DB ".tables"
Option B — With Diesel CLI 🛠️
Recommended if you want migrations and a typed schema.rs.
- Grab the repo:
git clone https://github.com/graves/awful_aj
cd awful_aj
- Install Diesel CLI for SQLite:
cargo install diesel_cli --no-default-features --features sqlite
- Configure database URL and run migrations:
$env:DATABASE_URL="$env:APPDATA\awful-sec\aj\aj.db"
diesel migration run
6. First-run model download ⤵️
On first use that needs embeddings, `aj` downloads `all-mini-lm-l12-v2` from https://awfulsec.com/bigfiles/all-mini-lm-l12-v2.zip into:
C:\Users\YOU\AppData\Roaming\awful-sec\aj\
✅ Quick sanity check
aj ask "Hello from Windows!"
Build from Source 🧱
Want to hack on `aj`? Let’s go! 🧑‍💻
🤢 Install dependencies
brew install miniconda # or use the official installer
conda create -n aj python=3.11 -y
conda activate aj
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cp
export LIBTORCH_USE_PYTORCH=1
export LIBTORCH="/opt/homebrew/Caskroom/miniconda/base/pkgs/pytorch-2.4.0-py3.11_0/lib/python3.11/site-packages/torch"
export DYLD_LIBRARY_PATH="$LIBTORCH/lib"
🛠️ Clone & Build
git clone https://github.com/graves/awful_aj.git
cd awful_aj
cargo build
✅ Run Tests
cargo test
Tip: If you modify features that touch embeddings, ensure your Python + PyTorch environment is active before running commands that exercise memory/vector search.
🧯 Common Troubleshooting
- Linker/PyTorch libs not found: Recheck the `LIBTORCH` environment variable and your platform’s dynamic library path env var (`DYLD_LIBRARY_PATH` on macOS, `LD_LIBRARY_PATH` on Linux, `PATH` on Windows).
- Model not downloading:
  - Ensure the config directory exists and is writable. See Config Paths on your OS's Install page.
  - Check your network connection.
Use `aj` 🚀
AJ can be used in four primary ways:
- Init: Bootstrap local config/templates/DB 🏗️
- Ask: One‑shot Q&A ✨
- Interactive: Chat with memory 🧠
- As a Library: Embed AJ in your Rust code 🧩
Jump to:
- 👉 Init
- 👉 Ask
- 👉 Interactive
- 👉 As a Library
`aj init` 🏗️
Create default config, templates, and the session database.
aj init
📁 What it creates
- `config.yaml` with sensible defaults
- `templates/default.yaml`, `templates/simple_question.yaml`
- A SQLite database `aj.db` for sessions
📍 Where these live
- macOS:
~/Library/Application Support/com.awful-sec.aj/
- Linux:
~/.config/aj/
- Windows:
C:\\Users\\YOU\\AppData\\Roaming\\awful-sec\\aj\\
🙋🏻♀️ Help
aj init --help
Initialize configuration and default templates in the platform config directory.
Creates the config file and a minimal template set if they don’t exist yet.
Usage: aj init
Options:
-h, --help
Print help (see a summary with '-h')
-V, --version
Print version
`aj ask` ✨
Ask a single question and print the assistant’s response.
aj ask "Is Bibi really from Philly?"
🔧 Options
- `--template`: Use a specific prompt template.
- `--model`: Override the model for this question.
- `--session`: Session name for long-running conversations.
✅ When to use
- Quick facts, transformations, summaries.
- Scriptable one‑liners in shell pipelines.
- Modify the default template and add a session name to give your computer a personality.
🙋🏻♀️ Help
λ aj ask --help
Ask a single question and print the assistant’s response.
If no `question` is provided, the application supplies a default prompt.
Aliases: `a`
Usage: aj ask [OPTIONS] [QUESTION]
Arguments:
[QUESTION]
The question to ask. When omitted, a default question is used
Options:
-t <template>
Name of the chat template to load (e.g., `simple_question`).
Templates live under the app’s config directory, usually at: - macOS: `~/Library/Application Support/com.awful-sec.aj/templates/` - Linux: `~/.config/aj/templates/` - Windows: `%APPDATA%\\com.awful-sec\\aj\\templates\\`
-s <session>
Session name. When set, messages are persisted under this conversation.
Using a session enables retrieval-augmented context from prior turns.
-h, --help
Print help (see a summary with '-h')
-V, --version
Print version
`aj interactive` 🗣️

Start a REPL‑style chat. `aj` uses a vectordb to store embeddings of past messages and recalls relevant prior context.
aj interactive
🧠 Features
- Remembers salient turns via HNSW + sentence embeddings
- Limits total tokens to your configured quota (oldest context trimmed)
- Supports templates and system prompts
💡 Pro Tips
- `aj interactive` expects an end-of-input control character to send your message. On macOS that's `Ctrl-d`.
- Send `exit` or `Ctrl-c` to exit the REPL.
🙋🏻♀️ Help
λ aj interactive --help
Start an interactive REPL-style conversation.
Prints streaming assistant output (when enabled) and persists messages if a session name is configured by the application.
Aliases: `i`
Usage: aj interactive [OPTIONS]
Options:
-t <template>
Name of the chat template to load (e.g., `simple_question`)
-s <session>
Session name for the conversation
-h, --help
Print help (see a summary with '-h')
-V, --version
Print version
🧩 Use as a Library
Bring `awful_aj` into your own Rust projects—reuse the same high-level chat plumbing that powers the CLI. 🦀💬
⚠️ Note: The public API may evolve. Check docs.rs for signatures.
📦 Add Dependency
# Cargo.toml
[dependencies]
awful_aj = "*"
tokio = "1.45.0"
🐆 Quickstart
```rust
use std::error::Error;

use awful_aj::{
    api::ask,
    config::AwfulJadeConfig,
    template::{self, ChatTemplate},
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    let config: AwfulJadeConfig = awful_aj::config::load_config("somewhere/config.yaml")?;
    let template: ChatTemplate = template::load_template("book_txt_sanitizer")
        .await
        .map_err(|e| format!("Template load error: {e}"))?;

    let question = "Sanitize this chunk of text.";
    let res = ask(&config, question.to_string(), &template, None, None).await?;
    println!("{res}");
    Ok(())
}
```
🔎 API Highlights
- `AwfulJadeConfig`: Load/override runtime settings.
- `awful_aj::api::ask(..., None, None)`: One‑shot Q&A.
- `awful_aj::api::ask(..., vector_store, brain)`: Conversations with relevant context injected, even when it falls outside your model's maximum context length.
- An in-memory vectordb with flat-file persistence powers `aj`'s memory helpers behind the scenes.
🐆 Quickstart: One-Shot Q&A (Non-Streaming)
Uses `api::ask` with no session and no memory. Minimal + predictable. ✅
```rust
use std::error::Error;

use awful_aj::{api, config::AwfulJadeConfig, template::ChatTemplate};

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    // Build config explicitly for clarity. (You can load from your own source if you prefer.)
    let cfg = AwfulJadeConfig {
        api_key: "YOUR_KEY".into(),
        api_base: "http://localhost:5001/v1".into(), // OpenAI-compatible endpoint
        model: "qwen3_30b_a3".into(),
        context_max_tokens: 32768,
        assistant_minimum_context_tokens: 2048,
        stop_words: vec![],             // forwarded to the request
        session_db_url: "aj.db".into(), // unused when session_name is None
        session_name: None,             // no session persistence
        should_stream: Some(false),     // non-streaming
    };

    let tpl = ChatTemplate {
        system_prompt: "You are Qwen, a helpful assistant.".into(),
        messages: vec![],                // extra seed messages if you want
        response_format: None,           // set to a JSON schema for structured output
        pre_user_message_content: None,  // optional prepend to user input
        post_user_message_content: None, // optional append to user input
    };

    let answer = api::ask(&cfg, "Hello from my app!".into(), &tpl, None, None).await?;
    println!("assistant: {answer}");
    Ok(())
}
```
What happens under the hood 🧠
- Builds a Client → prepares a preamble (system + template messages).
- Applies `pre_user_message_content` and/or `post_user_message_content`.
- Sends one non-streaming request (because `should_stream = Some(false)`).
- Returns the assistant’s text (and persists to DB only if sessions are enabled—see below).
📺 Streaming Responses (Live Tokens!)
Set `should_stream = Some(true)` and still call `api::ask(...)`. The tokens print to stdout in blue/bold as they arrive (and you still get the final text returned).
```rust
let mut cfg = /* ... as above ... */ AwfulJadeConfig {
    // ...
    should_stream: Some(true),
    // ...
};
let tpl = /* ... */;
```
📝 Note: The streaming printer uses crossterm for color + attributes. It writes to the locked stdout and resets formatting at the end.
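The per-token printing behavior can be approximated with plain ANSI escapes and per-token flushing. This is a self-contained sketch of the idea, not aj's actual crossterm code; the helper `styled` and the specific escape codes are illustrative assumptions:

```rust
use std::io::{self, Write};

// Wrap text in ANSI bold + blue and reset afterward, approximating the
// styling aj's streaming printer applies via crossterm.
fn styled(text: &str) -> String {
    format!("\x1b[1;34m{text}\x1b[0m")
}

fn main() -> io::Result<()> {
    let tokens = ["Hello", ", ", "world", "!"]; // stand-ins for streamed tokens
    let mut out = io::stdout().lock();          // lock stdout once, like the real printer
    for t in tokens {
        write!(out, "{}", styled(t))?;
        out.flush()?; // flush per token so text appears as it arrives
    }
    writeln!(out)?;
    Ok(())
}
```

Flushing after every token is the key detail: without it, line buffering would hold the text back until a newline.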
🧵 Sessions: Persistent Conversations (with Optional Memory)
Turn on sessions by setting a session_name. When the rolling conversation exceeds the token budget, the oldest user/assistant pair is ejected and (if you provide a VectorStore) embedded + stored for later retrieval. 📚➡️🧠
```rust
use std::error::Error;

use awful_aj::{api, config::AwfulJadeConfig, template::ChatTemplate};

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    let cfg = AwfulJadeConfig {
        api_key: "KEY".into(),
        api_base: "http://localhost:1234/v1".into(),
        model: "jade_qwen3_4b".into(),
        context_max_tokens: 32768,
        assistant_minimum_context_tokens: 2048,
        stop_words: vec![],
        session_db_url: "aj.db".into(),
        session_name: Some("getting-started".into()), // ✅ enable session persistence
        should_stream: Some(false),
    };

    let tpl = ChatTemplate {
        system_prompt: "You are Awful Jade, created by Awful Security.".into(),
        messages: vec![],
        response_format: None,
        pre_user_message_content: None,
        post_user_message_content: None,
    };

    // First turn
    let a1 = api::ask(&cfg, "Remember: my project is 'Alabaster'.".into(), &tpl, None, None).await?;
    println!("assistant: {a1}");

    // Next turn—session context is restored from the DB automatically:
    let a2 = api::ask(&cfg, "What's the codename I told you?".into(), &tpl, None, None).await?;
    println!("assistant: {a2}");

    Ok(())
}
```
Session details 🗂️
- `get_session_messages` loads or seeds the conversation.
- On overflow, the oldest pair is ejected and (if a VectorStore is provided) embedded + persisted; the HNSW index is rebuilt.
- On each call, the assistant reply is persisted to the session DB.

💡 You control the budget with `context_max_tokens` and the preamble budget with `assistant_minimum_context_tokens` (used by the brain/preamble logic).
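The overflow policy described above can be sketched in plain Rust. This is a hypothetical simplification, not awful_aj's internals; in particular, the word-count closure is a stand-in for real tokenization:

```rust
// Drop the oldest user/assistant pair until the remaining history
// fits the token budget; return what was ejected (aj would embed these).
fn eject_until_fits(
    history: &mut Vec<(String, String)>,
    budget: usize,
    count: impl Fn(&str) -> usize,
) -> Vec<(String, String)> {
    let mut ejected = Vec::new();
    loop {
        let total: usize = history
            .iter()
            .map(|(user, asst)| count(user.as_str()) + count(asst.as_str()))
            .sum();
        if total <= budget || history.is_empty() {
            break;
        }
        ejected.push(history.remove(0)); // the oldest pair goes first
    }
    ejected
}

fn main() {
    // Rough token proxy: whitespace-separated words.
    let count = |s: &str| s.split_whitespace().count();
    let mut history = vec![
        ("tell me about rust".to_string(), "rust is a systems language".to_string()),
        ("and cargo".to_string(), "cargo is its build tool".to_string()),
    ];
    let ejected = eject_until_fits(&mut history, 8, count);
    println!("ejected {} pair(s); {} pair(s) remain", ejected.len(), history.len());
}
```

Pairs are ejected whole so the surviving history never starts mid-exchange.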
🧠 Adding Memories (Vector Search Assist)
If you provide a `VectorStore` and a `Brain`, nearby memories (Euclidean distance < 1.0) are injected into the brain’s preamble before the call. This is how long-term recall is blended in. 🧲🧠✨
```rust
// RAG demo
use std::{error::Error, path::PathBuf};

use async_openai::types::Role;
use awful_aj::{
    api, brain::Brain, config::AwfulJadeConfig, template::ChatTemplate,
    vector_store::VectorStore,
};
// NOTE: import `Memory` from wherever it is exported in your version of awful_aj.

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error + Send + Sync>> {
    let cfg = AwfulJadeConfig {
        api_key: "KEY".into(),
        api_base: "http://localhost:5001/v1".into(),
        model: "gpt-4o-mini".into(),
        context_max_tokens: 8192,
        assistant_minimum_context_tokens: 2048,
        stop_words: vec![],
        session_db_url: "aj.db".into(),
        session_name: Some("mem-demo".into()),
        should_stream: Some(false),
    };

    let tpl = ChatTemplate {
        system_prompt: "You are a helpful assistant that uses prior notes when relevant.".into(),
        messages: vec![],
        response_format: None,
        pre_user_message_content: None,
        post_user_message_content: None,
    };

    // Create a brain that will reserve a maximum of 1,024 tokens
    // of the inference's context window.
    let mut brain = Brain::new(1024, &tpl);

    // Construct your VectorStore: 384 dims for MiniLM (as per your VectorStore).
    let session_name = "docs-demo";
    let mut store = VectorStore::new(384, session_name.to_string())?;

    // Seed a few memories (they could be doc chunks, FAQs, prior chat turns, etc.)
    let notes = [
        "Project codename is Alabaster.",
        "Primary repository is `awful_aj` owned by `graves`.",
        "Use `aj interactive` for a REPL with memory.",
        "Templates live in the config directory under `templates/`.",
    ];
    for s in notes {
        let v = store.embed_text_to_vector(s)?;
        store.add_vector_with_content(v, Memory::new(Role::User, s.to_string()))?;
    }

    // Finalize the HNSW index so queries will see the inserts
    store.build()?;

    // Persist both the index (binary) and YAML metadata so you can
    // rehydrate later 💦 (optional)
    let yaml_path = PathBuf::from("vector_store.yaml");
    store.serialize(&yaml_path, session_name.to_string())?;

    // Later, a query that should recall a nearby memory (< 1.0 distance):
    let ans = api::ask(&cfg, "Who owns the repo again?".into(), &tpl, Some(&mut store), Some(&mut brain)).await?;
    println!("assistant: {ans}");
    Ok(())
}
```
What `add_memories_to_brain` does 🔍
- Embeds the current question.
- Looks up top-k neighbors (3) in the HNSW index.
- For neighbors with distance < 1.0, injects their content into the brain.
- Rebuilds the preamble so these memories ship with the request.
📏 Threshold and k are implementation details you can tune inside your VectorStore module if you hack on `awful_aj`.
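The neighbor lookup and distance gate can be sketched as follows. This is a hypothetical, self-contained illustration: aj's real store uses an HNSW index over 384-dim MiniLM embeddings, not the linear scan shown here:

```rust
// Euclidean distance between two equal-length vectors.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

// Keep the k nearest memories whose distance to the query is below `threshold`.
fn recall<'a>(
    query: &[f32],
    memories: &'a [(Vec<f32>, String)],
    k: usize,
    threshold: f32,
) -> Vec<&'a str> {
    let mut scored: Vec<(f32, &str)> = memories
        .iter()
        .map(|(v, s)| (euclidean(query, v), s.as_str()))
        .collect();
    scored.sort_by(|a, b| a.0.partial_cmp(&b.0).unwrap());
    scored
        .into_iter()
        .take(k)                          // top-k neighbors (aj uses k = 3)
        .filter(|(d, _)| *d < threshold)  // the < 1.0 distance gate
        .map(|(_, s)| s)
        .collect()
}

fn main() {
    let memories = vec![
        (vec![0.1, 0.2], "project codename is Alabaster".to_string()),
        (vec![5.0, 5.0], "unrelated note".to_string()),
    ];
    let hits = recall(&[0.0, 0.0], &memories, 3, 1.0);
    println!("{hits:?}");
}
```

The gate matters: a neighbor can be "nearest" and still be irrelevant, so anything at or beyond the threshold is dropped rather than injected.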
🧪 Templates: Powerful Knobs (System, Seeds, & Post-Processing)
`ChatTemplate` gives you flexible pre/post shaping without touching your app logic. 🎛️
- `system_prompt`: The authoritative behavior message.
- `messages`: Seed messages (system/user/assistant) to anchor behavior or provide examples.
- `pre_user_message_content` / `post_user_message_content`: Lightweight way to wrap inputs (e.g., “Answer concisely.” / “Return JSON.”).
- `response_format`: If present, it’s forwarded as a JSON Schema so that, if your model supports Tool Calling or Structured Output, the inference will only emit structured output. 🧩
🧰 For structured outputs, define the schema object your server expects and place it in `template.response_format`. For example:
{
"type": "object",
"properties": {
"sanitizedBookExcerpt": {
"type": "string"
}
}
}
🧯 Error Handling (What to Expect)
All public functions bubble up errors: API/network, I/O, (de)serialization, embeddings, DB, index build. Handle them with `Result<_, Box<dyn std::error::Error + Send + Sync>>`.
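A minimal pattern for consuming these boxed errors at a call site; `might_fail` is a hypothetical stand-in for any `awful_aj` call (the real ones are async and hit the network):

```rust
use std::error::Error;

// Stand-in for an awful_aj call that returns a boxed error.
fn might_fail(succeed: bool) -> Result<String, Box<dyn Error + Send + Sync>> {
    if succeed {
        Ok("assistant reply".to_string())
    } else {
        Err("network unreachable".into()) // &str converts into a boxed error
    }
}

fn main() {
    // Match instead of `?` when you want to log the failure and keep going.
    match might_fail(false) {
        Ok(answer) => println!("assistant: {answer}"),
        Err(e) => eprintln!("aj call failed: {e}"),
    }
}
```

Inside a function that itself returns the same boxed error type, plain `?` propagation works too.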
🧪 Advanced: Call Streaming/Non-Streaming Primitives Directly
You can skip `api::ask` and call the lower-level primitives if you need full control (custom prompt stacks, a different persistence strategy, special output handling):
- `stream_response(...) -> ChatCompletionRequestMessage`
- `fetch_response(...) -> ChatCompletionRequestMessage`
These expect a `Client`, a `SessionMessages` you’ve prepared, and your `AwfulJadeConfig` + `ChatTemplate`. They return the final assistant message object (you extract its text from `AssistantMessageContent::Text`).
⚠️ This is expert-mode: you manage session assembly (`prepare_messages*`) and persistence yourself.
🎨 Creative Patterns (Recipes!)
Here are ideas that use only the public API you’ve exposed—copy/paste and riff. 🧑🏽🍳
1. Batch Q&A (Non-Streaming) 📚⚡
Process a list of prompts and collect answers.
```rust
async fn batch_answer(
    cfg: &awful_aj::config::AwfulJadeConfig,
    tpl: &awful_aj::template::ChatTemplate,
    questions: impl IntoIterator<Item = String>,
) -> anyhow::Result<Vec<String>> {
    let mut out = Vec::new();
    for q in questions {
        let a = awful_aj::api::ask(cfg, q, tpl, None, None).await?;
        out.push(a);
    }
    Ok(out)
}
```
2. "Sticky" Session Bot 🤝🧵
Keep a named session and call ask repeatedly—great for chat sidebars and Agents.
```rust
struct Sticky {
    cfg: awful_aj::config::AwfulJadeConfig,
    tpl: awful_aj::template::ChatTemplate,
}

impl Sticky {
    async fn send(&self, user_text: &str) -> anyhow::Result<String> {
        awful_aj::api::ask(&self.cfg, user_text.into(), &self.tpl, None, None)
            .await
            .map_err(Into::into)
    }
}
```
3. "Context Sandwich" Wrapper 🥪
Standardize how you wrap user input with pre/post content—without changing the template.
```rust
async fn sandwich(
    cfg: &awful_aj::config::AwfulJadeConfig,
    base_tpl: &awful_aj::template::ChatTemplate,
    pre: &str,
    post: &str,
    user: &str,
) -> anyhow::Result<String> {
    let mut tpl = base_tpl.clone();
    tpl.pre_user_message_content = Some(pre.into());
    tpl.post_user_message_content = Some(post.into());
    awful_aj::api::ask(cfg, user.into(), &tpl, None, None)
        .await
        .map_err(Into::into)
}
```
4. Structured Outputs via Schema 🧾✅
Have your template include a JSON schema (set `response_format`) so responses are machine-readable—perfect for pipelines. (The exact schema type depends on your `async_openai` version; set it on `template.response_format` and `api::ask` will forward it.)
🗺️ Mental Model Recap
- Config (`AwfulJadeConfig`) → client setup + budgets + behavior flags (streaming, sessions).
- Template (`ChatTemplate`) → system, seed messages, schema, pre/post hooks.
- Session (`session_name`) → DB-backed rolling history with ejection on overflow.
- Memory (`VectorStore` + `Brain`) → ejected pairs get embedded; nearest neighbors (`< 1.0`) are injected into the preamble next time.
- Modes → streaming (`should_stream: true`) vs non-streaming (`false`/`None`).
You choose how many of those dials to turn. 🎛️😄
✅ Checklist for Production
- Pin crate versions.
- Decide on streaming vs non-streaming per use-case.
- If you want history, set `session_name`.
- If you want long-term recall, wire a `VectorStore` and a `Brain`.
- Establish sensible stop words and token budgets.
- Consider a JSON schema when you need structured output.
Happy hacking! 🧩🧠✨
Config ⚙️
`aj` reads a YAML configuration file from your platform directory.
📍 Paths
- macOS:
~/Library/Application Support/com.awful-sec.aj/config.yaml
- Linux:
~/.config/aj/config.yaml
- Windows:
C:\\Users\\YOU\\AppData\\Roaming\\awful-sec\\aj\\config.yaml
🧾 Example
api_base: "http://localhost:1234/v1"
api_key: "CHANGEME"
model: "jade_qwen3_4b_mlx"
context_max_tokens: 8192
assistant_minimum_context_tokens: 2048
stop_words:
- "<|im_end|>\\n<|im_start|>"
- "<|im_start|>\n"
session_db_url: "/Users/you/Library/Application Support/com.awful-sec.aj/aj.db"
should_stream: true
🔑 Key Fields
- `api_base`: Where requests go.
- `api_key`: Secret authorization key (optional).
- `model`: LLM model to use.
- `context_max_tokens`: Context length to request from the model.
- `assistant_minimum_context_tokens`: Messages are ejected from the context as needed so that this many tokens remain available for the response.
- `stop_words`: Tokens that cut off model output.
- `session_db_url`: SQLite DB path for sessions (optional).
- `should_stream`: Whether to stream the response or return it all at once when inference ends.
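The interplay of `context_max_tokens` and `assistant_minimum_context_tokens` amounts to simple budgeting arithmetic. A sketch, assuming history simply gets whatever the reserved response budget leaves over (not aj's exact code):

```rust
// Tokens left for conversation history after reserving room
// for the assistant's response.
fn history_budget(context_max_tokens: usize, assistant_minimum_context_tokens: usize) -> usize {
    context_max_tokens.saturating_sub(assistant_minimum_context_tokens)
}

fn main() {
    // With the example config above: 8192 total, 2048 reserved for the reply.
    println!("history budget: {} tokens", history_budget(8192, 2048));
}
```

`saturating_sub` keeps the budget at zero rather than underflowing if you misconfigure the reserve to exceed the context length.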
✍️ Editing Tips
- After edits, re‑run your command (no daemon reloads required).
- Make sure you include the port number your LLM inference server is running on.
- If you're using an online service, you can usually create an API key on your account settings or API page.

`jade_qwen3_4b_mlx` is a highly capable Qwen 3 4B model that I finetuned for Apple-focused systems programming and general discussion. You can find it on Hugging Face or download it directly in LM Studio.
Templates 🧰
Templates are YAML files that shape prompts and behavior. Swap them per command or set a default in config.
📦 Default Locations
- macOS:
~/Library/Application Support/com.awful-sec.aj/templates/
- Linux:
~/.config/aj/templates/
- Windows:
C:\\Users\\YOU\\AppData\\Roaming\\awful-sec\\aj\\templates\\
Jump to:
- Default Templates 🧩
- Template Examples 🎭
- Template Download 📥
Default Templates 🧩
`aj init` seeds a couple of handy starters:
default.yaml
system_prompt: "You are Awful Jade, a helpful AI assistant programmed by Awful Security."
messages: []
simple_question.yaml
system_prompt: "Answer the user clearly and concisely."
messages: []
Use them as‑is or as a base for your own.
Template Examples 🎭
Templates are at the heart of how aj guides conversations. They define who the assistant is, how the dialogue begins, and (optionally) what structure the responses must take.
A template is a simple YAML file stored under:
- macOS:
~/Library/Application Support/com.awful-sec.aj/templates/
- Linux:
~/.config/aj/templates/
- Windows:
%APPDATA%\awful-sec\aj\templates\
Each template can be as minimal or as rich as you want—ranging from a single system prompt to a complex orchestration of JSON schemas and seeded conversation turns.
🧩 Anatomy of a Template
A template YAML file typically includes:
system_prompt
: The assistant’s role and global behavior.messages
: Preloaded conversation history (user and assistant turns) to “frame” new queries.response_format
(optional): A JSON schema that enforces structured outputs.pre_user_message_content
/post_user_message_content
(optional): Strings prepended/appended to every new user query before it is sent.
🎨 Example 1: Minimal Q&A
system_prompt: You are Jade, a concise technical assistant.
messages:
- role: user
content: How do I list files in Rust?
- role: assistant
content: |-
You can use `std::fs::read_dir`:
```rust
for entry in std::fs::read_dir(".")? {
let entry = entry?;
println!("{:?}", entry.path());
}
```
Now every query you make in this template context nudges the LLM to answer short and code-focused.
📐 Example 2: Structured JSON Output
Templates can define strict output formats using the `response_format` field. For instance, here’s a template that enforces exam-style structured questions:
system_prompt: You are Qwen, a professor.
messages:
- role: user
content: |-
Summarize this passage into three exam questions:
Blah blah blah, you can multiple the numbers to make them escape the parenthesis which actually means something deeper.
Some numbers just don't make no damn sense, like why they call it pie anyway. Get rooted noob.
- role: assistant
content: |-
{
"finalExamQuestion1": "Explain the distributive property with an example.",
"finalExamQuestion2": "Compare rational and irrational numbers.",
"finalExamQuestion3": "Simplify an expression using the laws of exponents."
}
response_format:
name: final_exam_question_generator
strict: true
description: Always return 3 exam questions.
schema:
type: object
properties:
finalExamQuestion1: { type: string }
finalExamQuestion2: { type: string }
finalExamQuestion3: { type: string }
required: [finalExamQuestion1, finalExamQuestion2, finalExamQuestion3]
By seeding a user prompt and assistant's JSON reply, you lock in both behavior and format.
📖 Example 3: Literary Tutor with Context
Templates can preload conversation history to create guided behavior. For example, a “tutor” template might look like this:
system_prompt: You are Jade, a world literature tutor.
messages:
- role: user
content: Who are the Pandavas in the Gita?
- role: assistant
content: |-
The Pandavas are the five sons of King Pandu:
- **Yudhishthira** (justice and truth)
- **Bhima** (strength)
- **Arjuna** (archery and discipline)
- **Nakula** and **Sahadeva** (the twins, beauty and wisdom)
They are cousins to the Kauravas, and their conflict drives the central drama.
```
Pandu → Yudhishthira, Bhima, Arjuna, Nakula, Sahadeva
Dhritarashtra → 100 Kauravas (incl. Duryodhana)
```
This family tree is crucial for understanding loyalties and rivalries in the Gita.
This ensures the LLM always explains relationships clearly, often including family trees.
🛠️ Example 4: Refactoring Code
system_prompt: You are Jade, an expert Rust refactoring assistant.
messages:
- role: user
content: Refactor this code to use iterators.
- role: assistant
content: |-
Sure! Here’s the refactored version:
```rust
let nums = vec![1, 2, 3, 4, 5];
let squares: Vec<_> = nums.iter().map(|x| x * x).collect();
println!("{:?}", squares);
```
This avoids indexing and uses iterator combinators idiomatically.
Notice how the assistant reply not only refactors but also explains why. Every future query in this template will follow that pattern.
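For contrast, here is the kind of index-based code the seeded reply refactors away. This "before" version is illustrative (it is not part of the template), but it produces the same output as the iterator version:

```rust
// Index-based version that the template's seeded assistant reply would
// refactor into iterator combinators.
fn squares_indexed(nums: &[i32]) -> Vec<i32> {
    let mut out = Vec::new();
    for i in 0..nums.len() {
        out.push(nums[i] * nums[i]);
    }
    out
}

fn main() {
    let nums = vec![1, 2, 3, 4, 5];
    println!("{:?}", squares_indexed(&nums)); // [1, 4, 9, 16, 25]
}
```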
✨ Practical Tips
- Always pair user + assistant in seeded messages if you want to strongly guide style.
- Use `response_format` for machine-readable guarantees (JSON, tables, etc.).
- Use `pre_user_message_content` / `post_user_message_content` for lightweight "wrapping" (like always appending `/nothink`).
- Keep multiple templates—switch roles instantly (`reading_buddy.yaml`, `exam_generator.yaml`, `code_refactor.yaml`).
🚀 Where to Go From Here
- Try starting with `simple_question.yaml`.
- Copy it into `refactor_rust.yaml` or `book_knowledge_synthesizer.yaml` to see how far you can push complexity.
- Remember: templates are just YAML. They can be versioned, shared, and tweaked freely.
Template Download 📥
Want to pull shared templates from a repo or gist? A simple pattern:
# Example: fetch a template into your templates dir
curl -L https://awfulsec.com/bigfiles/templates/news_parser.yaml \
  -o "$AJ_CONFIG_PATH/templates/news_parser.yaml"
You can browse some example templates here: https://awfulsec.com/bigfiles/templates
Tip: Consider versioning your personal templates in a dotfiles repo. 🔖
Sessions 🗂️
`aj` stores conversations in a local SQLite database so you can review, continue, or mine them for insights. Sessions also capture config snapshots so you know which model/settings produced which answers. 🧠💾
📍 Where is it?
By default, next to your platform config directory as `aj.db`. Change this via `session_db_url` in your Config. Examples:
- macOS: `~/Library/Application Support/com.awful-sec.aj/aj.db`
- Linux: `~/.config/aj/aj.db`
- Windows: `C:\Users\YOU\AppData\Roaming\awful-sec\aj\aj.db`

Tip: Use absolute paths; Diesel's `DATABASE_URL` and the app's `session_db_url` should point to the same file when you run migrations.
🧰 What’s inside?
Three tables, exactly as modeled in your code:
- `conversations` – one row per session (`session_name` acts as a unique-ish namespace)
- `messages` – one row per turn (system/user/assistant), FK → conversation
- `awful_configs` – point-in-time snapshots of runtime settings, FK → conversation
Rust models (for reference):

```rust
Conversation { id, session_name }
Message { id, role, content, dynamic, conversation_id }
AwfulConfig { id, api_base, api_key, model, context_max_tokens, assistant_minimum_context_tokens, stop_words, conversation_id }
```
🧪 The Diesel schema (generated)
The Diesel `schema.rs` corresponds to:

```rust
// @generated automatically by Diesel CLI.

diesel::table! {
    awful_configs (id) {
        id -> Integer,
        api_base -> Text,
        api_key -> Text,
        model -> Text,
        context_max_tokens -> Integer,
        assistant_minimum_context_tokens -> Integer,
        stop_words -> Text,
        conversation_id -> Nullable<Integer>,
    }
}

diesel::table! {
    conversations (id) {
        id -> Integer,
        session_name -> Text,
    }
}

diesel::table! {
    messages (id) {
        id -> Integer,
        role -> Text,
        content -> Text,
        dynamic -> Bool,
        conversation_id -> Nullable<Integer>,
    }
}

diesel::joinable!(awful_configs -> conversations (conversation_id));
diesel::joinable!(messages -> conversations (conversation_id));

diesel::allow_tables_to_appear_in_same_query!(
    awful_configs,
    conversations,
    messages,
);
```
Option A — Use Diesel CLI 🛠️
This is the most ergonomic way to create and evolve the DB.
- Get the schema
git clone https://github.com/graves/awful_aj
cd awful_aj
- Install Diesel CLI (SQLite only)
macOS / Linux
cargo install diesel_cli --no-default-features --features sqlite
Windows (PowerShell)
cargo install diesel_cli --no-default-features --features sqlite
On macOS you may need system SQLite headers: `brew install sqlite` (and ensure `pkg-config` can find it).
- Set your database URL
macOS
export DATABASE_URL="$HOME/Library/Application Support/com.awful-sec.aj/aj.db"
Linux
export DATABASE_URL="$HOME/.config/aj/aj.db"
Windows (PowerShell)
$env:DATABASE_URL = "$env:APPDATA\awful-sec\aj\aj.db"
- Run migrations
diesel migration run
diesel print-schema > src/schema.rs # keep your schema.rs in sync
- Reset / Recreate (when needed)
diesel database reset
# drops and recreates (uses up/down)
🧠 Gotcha: Always point `DATABASE_URL` to the same file your app will use (`session_db_url`). If you migrate one file and run the app on another path, you'll see "missing table" errors.
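A cheap way to avoid the mismatch is to compare both configured paths at startup and fail fast. The helper below is a hypothetical sketch, not part of awful_aj; note that `Path` equality is component-wise and does not resolve symlinks (use `canonicalize` for that):

```rust
use std::path::Path;

// Hypothetical startup check: do Diesel's DATABASE_URL and the app's
// session_db_url point at the same file? (Textual comparison only;
// call .canonicalize() first if symlinks are a concern.)
fn paths_match(database_url: &str, session_db_url: &str) -> bool {
    Path::new(database_url) == Path::new(session_db_url)
}

fn main() {
    let diesel_url = "/home/you/.config/aj/aj.db";
    let app_url = "/home/you/.config/aj/aj.db";
    assert!(paths_match(diesel_url, app_url));
    println!("DB paths agree ✔");
}
```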
Option B — No CLI: Embedded Migrations (pure Rust) 🧰
If you don’t want to depend on the CLI, bundle SQL with your binary and run it on startup using diesel_migrations.
- Add the crate
```toml
# Cargo.toml
[dependencies]
diesel = { version = "2", features = ["sqlite"] }
diesel_migrations = "2"
```
- Create an in-repo migrations folder
```
src/
migrations/
  00000000000000_init_aj_schema/
    up.sql
    down.sql
```
Use the same SQL as in Option A's `up.sql`/`down.sql`.
- Run at startup
```rust
use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;
use diesel_migrations::{embed_migrations, MigrationHarness};

// Embed migrations from the `migrations/` dir
const MIGRATIONS: diesel_migrations::EmbeddedMigrations = embed_migrations!("./migrations");

fn establish_connection(database_url: &str) -> SqliteConnection {
    SqliteConnection::establish(database_url)
        .unwrap_or_else(|e| panic!("Error connecting to {database_url}: {e}"))
}

pub fn run_migrations(database_url: &str) {
    let mut conn = establish_connection(database_url);
    conn.run_pending_migrations(MIGRATIONS)
        .expect("Migrations failed");
}
```
Call `run_migrations(&cfg.session_db_url)` once during app startup. ✅
Bonus: You can ship a single binary that self-provisions its SQLite schema on first run—no CLI needed.
Option C — No Diesel at all: Raw sqlite3 🪚
For ultra-minimal environments, create the file and tables directly.
macOS / Linux
sqlite3 "$HOME/Library/Application Support/com.awful-sec.aj/aj.db" <<'SQL'
PRAGMA foreign_keys = ON;
CREATE TABLE IF NOT EXISTS conversations (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
session_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS messages (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
role TEXT NOT NULL,
content TEXT NOT NULL,
dynamic BOOLEAN NOT NULL DEFAULT 0,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS awful_configs (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
api_base TEXT NOT NULL,
api_key TEXT NOT NULL,
model TEXT NOT NULL,
context_max_tokens INTEGER NOT NULL,
assistant_minimum_context_tokens INTEGER NOT NULL,
stop_words TEXT NOT NULL,
conversation_id INTEGER,
FOREIGN KEY(conversation_id) REFERENCES conversations(id) ON DELETE CASCADE
);
SQL
Windows (PowerShell):
```powershell
$DB = "$env:APPDATA\awful-sec\aj\aj.db"
sqlite3 $DB @"
PRAGMA foreign_keys = ON;
-- (same CREATE TABLE statements as above)
"@
```
You can now use Diesel from your app against that file.
🔍 Verifying & Inspecting
List tables
sqlite3 "$DATABASE_URL" ".tables"
Peek at last 10 messages
sqlite3 "$DATABASE_URL" "SELECT id, role, substr(content,1,60) || '…' as snippet FROM messages ORDER BY id DESC LIMIT 10;"
Check a conversation by name
SELECT * FROM conversations WHERE session_name = 'default';
🧽 Maintenance
- Backup: copy the `.db` file while `aj` is not running.
- Vacuum (reclaim space): `sqlite3 "$DATABASE_URL" "VACUUM;"`
- Integrity check: `sqlite3 "$DATABASE_URL" "PRAGMA integrity_check;"`
- Reset via Diesel: `diesel database reset`
Tip: Enable foreign keys at connection open (PRAGMA foreign_keys = ON;). Diesel’s SQLite backend does not enforce this automatically unless you set the pragma on each connection (or in migrations as above).
🧯 Troubleshooting
- "no such table: conversations"
  - You migrated a different file than you're connecting to. Recheck `DATABASE_URL` vs `session_db_url`.
- Diesel CLI build fails on macOS
  - Install headers: `brew install sqlite` and ensure `pkg-config` is available.
- Foreign keys not enforced
  - Ensure `PRAGMA foreign_keys = ON;` is set (included in `up.sql`). For safety, set it again immediately after opening each connection.
- Schema drift
  - If you edit SQL manually, regenerate `schema.rs` with: `diesel print-schema > src/schema.rs`
🧪 Example: Insert a Conversation + Message (Diesel)
```rust
use diesel::prelude::*;
use awful_aj::schema::{conversations, messages};
use awful_aj::models::{Conversation, Message};

fn demo(conn: &mut SqliteConnection) -> anyhow::Result<()> {
    let convo: Conversation = diesel::insert_into(conversations::table)
        .values(&Conversation { id: None, session_name: "demo".into() })
        .returning(Conversation::as_returning())
        .get_result(conn)?;

    let _msg: Message = diesel::insert_into(messages::table)
        .values(&Message {
            id: None,
            role: "user".into(),
            content: "Hi".into(),
            dynamic: false,
            conversation_id: convo.id,
        })
        .returning(Message::as_returning())
        .get_result(conn)?;

    Ok(())
}
```
All set! Whether you prefer Diesel CLI, embedded migrations, or plain `sqlite3`, you've got everything needed to provision, migrate, and operate the aj session database cleanly. 🧰✨
Memories 🧠
`aj` augments conversations with two complementary memory layers:
- Working memory ("`Brain`") — a small, token-bounded queue of recent snippets used to build a preamble for each request.
- Long-term memory ("`VectorStore`") — a persistent HNSW index of embeddings (MiniLM, 384-d) used for semantic recall.
Together they let AJ remember enough to be helpful, without blowing your context window. 🪄
- Embeddings: `all-mini-lm-l12-v2` (downloaded automatically).
- Index: HNSW for fast nearest-neighbor lookups.
- Policy: Respect your token limits — prune oldest context when needed.
🔬 How it Works
- Your conversation text is embedded into vectors and stored.
- At answer time, `aj` retrieves the top-K relevant snippets.
- These snippets are stitched into context (bounded by `context_max_tokens`).
🎛️ Tuning Dials
- `context_max_tokens`: Overall window size.
- `assistant_minimum_context_tokens`: How much assistant context to preserve for responses.
🧩 Architecture at a Glance
- Brain (in-process):
  - Holds a `VecDeque<Memory>` (role + content).
  - Enforces a token budget (`max_tokens`); evicts oldest entries when over.
  - Builds a standardized preamble (3 messages):
    - system = template's system_prompt
    - user = serialized brain JSON (a short explanatory line + `{"about", "memories":[...]}`)
    - assistant = "Ok" (handshake/ack)
- VectorStore (persistent):
  - Embeds text via all-mini-lm-l12-v2 ➜ 384-d vectors.
  - Stores vectors in HNSW (hora) and maps ID → `Memory`.
  - Serializes to YAML + a binary index (`<uuid>_hnsw_index.bin` under `config_dir()`).
  - Reloads the embedding model from `config_dir()/all-mini-lm-l12-v2` on deserialization.
- Sessions & Ejection:
  - When a rolling conversation exceeds the budget, the oldest user/assistant pair is ejected.
  - If a `VectorStore` is provided, those ejected turns are embedded + added to the index, then `build()` is called.
  - New questions trigger nearest-neighbor recall; relevant memories get pushed into the `Brain` before the request.
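The Brain's budget-and-evict behavior can be sketched with plain `std` types. Everything here is an illustrative stand-in, not awful_aj's actual API: the struct name is invented and "tokens" are approximated as whitespace-separated words.

```rust
use std::collections::VecDeque;

// Hypothetical stand-in for the Brain's eviction policy: a queue of
// snippets bounded by a token budget, evicting oldest-first when over.
struct WorkingMemory {
    max_tokens: usize,
    entries: VecDeque<String>,
}

impl WorkingMemory {
    fn new(max_tokens: usize) -> Self {
        Self { max_tokens, entries: VecDeque::new() }
    }

    // Crude "tokenizer": count whitespace-separated words across entries.
    fn token_count(&self) -> usize {
        self.entries.iter().map(|e| e.split_whitespace().count()).sum()
    }

    fn add(&mut self, entry: &str) {
        self.entries.push_back(entry.to_string());
        // Recompute inside the loop so the budget is strictly enforced.
        while self.token_count() > self.max_tokens {
            self.entries.pop_front();
        }
    }
}

fn main() {
    let mut wm = WorkingMemory::new(6);
    wm.add("alpha beta gamma");
    wm.add("delta epsilon");
    wm.add("zeta eta theta"); // total would be 8 tokens -> oldest entry is evicted
    println!("{:?}", wm.entries);
}
```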
🔬 What Happens on Each ask(...)
- Session prep
  - `get_session_messages(...)` loads/creates session state (DB-backed if `session_name` is set).
- Semantic recall via `add_memories_to_brain(...)`:
  - Embed the current question.
  - Query HNSW for the top-3 neighbors (`search_nodes`).
  - For each neighbor with Euclidean distance < 1.0, push its `Memory` into the `Brain`.
  - Rebuild the Brain preamble and update the session preamble messages.
- Preamble + prompt shaping
  - Apply `pre_user_message_content` and/or `post_user_message_content` from the `ChatTemplate`.
- Completion
  - If `should_stream == Some(true)`: `stream_response` prints blue/bold tokens live.
  - Else: `fetch_response` aggregates the content once.
- Persistence
  - The assistant reply is stored in the session DB (if sessions are enabled).
  - If the rolling conversation later overflows: the oldest pair is ejected, embedded, added to the VectorStore, and the index is rebuilt.
🛠️ Minimal Setup
```rust
use awful_aj::{
    api,
    brain::Brain,
    config::AwfulJadeConfig,
    template::ChatTemplate,
    vector_store::VectorStore,
};

async fn run() -> Result<(), Box<dyn std::error::Error>> {
    let cfg = AwfulJadeConfig {
        api_key: "KEY".into(),
        api_base: "http://localhost:5001/v1".into(),
        model: "jade_qwen3_4b".into(),
        context_max_tokens: 8192,
        assistant_minimum_context_tokens: 2048,
        stop_words: vec![],
        session_db_url: "aj.db".into(),
        session_name: Some("memories-demo".into()), // ✅ enable sessions
        should_stream: Some(false),
    };

    let tpl = ChatTemplate {
        system_prompt: "You are Awful Jade. Use recalled notes if relevant. Be concise.".into(),
        messages: vec![],
        response_format: None,
        pre_user_message_content: None,
        post_user_message_content: None,
    };

    // Long-term memory store (requires MiniLM at config_dir()/all-mini-lm-l12-v2)
    let mut store = VectorStore::new(384, "memories-demo".into())?;

    // Working memory (brain) with its own token budget
    let mut brain = Brain::new(8092, &tpl);

    // Ask a question; add_memories_to_brain will auto-inject relevant neighbors
    let answer = api::ask(
        &cfg,
        "What is our project codename?".into(),
        &tpl,
        Some(&mut store),
        Some(&mut brain),
    )
    .await?;
    println!("{answer}");

    Ok(())
}
```
✅ Remember: After inserts to the `VectorStore`, call `build()` to make them searchable.
🧱 Seeding & Persisting the VectorStore
Seed once, then reuse across runs by deserializing.
```rust
use async_openai::types::Role;
use awful_aj::{brain::Memory, vector_store::VectorStore};
use std::path::PathBuf;

fn seed() -> Result<(), Box<dyn std::error::Error>> {
    let mut vs = VectorStore::new(384, "memories-demo".into())?;

    // Add whatever you want AJ to recall later:
    for s in [
        "Project codename is Alabaster.",
        "Primary repo is awful_aj owned by graves.",
    ] {
        let v = vs.embed_text_to_vector(s)?;
        vs.add_vector_with_content(v, Memory::new(Role::User, s.to_string()))?;
    }
    vs.build()?; // 🔔 finalize the index

    // Persist metadata (YAML) and the HNSW index (binary)
    vs.serialize(&PathBuf::from("vector_store.yaml"), "memories-demo".into())?;
    Ok(())
}
```

Reload later:

```rust
use awful_aj::vector_store::VectorStore;

fn load() -> Result<VectorStore, Box<dyn std::error::Error>> {
    let yaml = std::fs::read_to_string("vector_store.yaml")?;
    let vs: VectorStore = serde_yaml::from_str(&yaml)?; // reloads model + HNSW under the hood
    Ok(vs)
}
```
🎛️ Tuning Dials
- `context_max_tokens` (config): hard ceiling for request construction.
- `assistant_minimum_context_tokens` (config): budget for assistant-side context within your flow.
- `Brain::max_tokens`: separate budget for the working-memory JSON envelope.
- Vector recall: fixed to the top-3 neighbors; a memory is included if its Euclidean distance is < 1.0.
- Stop words: forwarded to the model; useful to avoid run-ons.
- Streaming: set `should_stream = Some(true)` for token-by-token prints.
🧪 If you frequently fail to recall useful notes, consider:
- Seeding more atomic memories (short, self-contained sentences).
- Lowering the distance threshold a bit (more inclusive), or raising it (more precise).
- Ensuring you rebuilt (`build()`) after inserts.
- Verifying the model path exists under `config_dir()/all-mini-lm-l12-v2`.
🧠 How the `Brain` Builds the Preamble
Every request gets a consistent, compact preamble:
- System — `template.system_prompt`
- User — a short paragraph + the serialized brain JSON:
{
"about": "This JSON object is a representation of our conversation leading up to this point. This object represents your memories.",
"memories": [
{"role":"user","content":"..."},
{"role":"assistant","content":"..."}
]
}
- Assistant — "Ok" (explicit acknowledgment)
This handshake primes the model with the latest, budget-friendly state before your new user message.
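The assembly of those three messages can be sketched as follows. This is a simplification with invented names: the real `Brain` serializes with serde, while this sketch builds the JSON by hand (and does not escape quotes in content):

```rust
// Illustrative sketch of the three-message preamble described above.
// Function and message shapes are assumptions, not awful_aj's API.
fn build_preamble(system_prompt: &str, memories: &[(&str, &str)]) -> Vec<(String, String)> {
    let memory_json: Vec<String> = memories
        .iter()
        .map(|(role, content)| format!(r#"{{"role":"{role}","content":"{content}"}}"#))
        .collect();
    let brain_json = format!(
        r#"{{"about":"This JSON object is a representation of our conversation leading up to this point. This object represents your memories.","memories":[{}]}}"#,
        memory_json.join(",")
    );
    vec![
        ("system".into(), system_prompt.into()),
        ("user".into(), brain_json),
        ("assistant".into(), "Ok".into()), // explicit acknowledgment
    ]
}

fn main() {
    let pre = build_preamble("You are Jade.", &[("user", "Hi"), ("assistant", "Hello!")]);
    for (role, content) in &pre {
        println!("{role}: {content}");
    }
}
```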
⛑️ Eviction: When the brain is over budget, it evicts oldest first and rebuilds the preamble. (Current implementation computes token count once; if you expect heavy churn, recomputing inside the loop would enforce the limit more strictly.)
🔁 Ejection → Embedding → Recall
When conversation history grows too large:
- The oldest user+assistant pair is ejected from `session_messages`.
- If a `VectorStore` is present:
  - Each piece is embedded, assigned an ID, and added to the HNSW index.
  - `build()` is called so they become searchable.
- On the next `ask(...)`, the current question is embedded, the top-3 neighbors are fetched, and any with distance < 1.0 get pushed into the `Brain` as memories.

Effect: older turns become semantic breadcrumbs you can recall later. 🍞🧭
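The distance-gated recall step boils down to this kind of filter. A sketch with a hand-rolled Euclidean distance over toy 2-d vectors; the real store queries a 384-d HNSW index instead of scanning:

```rust
// Euclidean distance between two vectors of equal length.
fn euclidean(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f32>().sqrt()
}

// Keep neighbor texts whose distance to the query is under the threshold
// (aj uses top-3 neighbors and a threshold of 1.0).
fn recall<'a>(query: &[f32], neighbors: &'a [(&'a str, Vec<f32>)], threshold: f32) -> Vec<&'a str> {
    neighbors
        .iter()
        .filter(|(_, v)| euclidean(query, v) < threshold)
        .map(|(text, _)| *text)
        .collect()
}

fn main() {
    let query = vec![0.0, 0.0];
    let neighbors = vec![
        ("close memory", vec![0.3, 0.4]),   // distance 0.5 -> recalled
        ("distant memory", vec![3.0, 4.0]), // distance 5.0 -> skipped
    ];
    println!("{:?}", recall(&query, &neighbors, 1.0));
}
```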
🧰 Recipes
- “Pin a fact” for later.
Drop a fact into the store right now so future questions recall it.
```rust
use async_openai::types::Role;
use awful_aj::{brain::Memory, vector_store::VectorStore};

fn pin(mut store: VectorStore) -> Result<(), Box<dyn std::error::Error>> {
    let fact = "Billing portal lives at https://hackme.example.com.";
    let v = store.embed_text_to_vector(fact)?;
    store.add_vector_with_content(v, Memory::new(Role::User, fact.into()))?;
    store.build()?; // make it queryable
    Ok(())
}
```
- "Cold start" with a loaded brain.
Start a session by injecting a few memories before the first question.
```rust
use async_openai::types::Role;
use awful_aj::{brain::{Brain, Memory}, template::ChatTemplate};
use awful_aj::session_messages::SessionMessages;

fn warmup(mut brain: Brain, tpl: &ChatTemplate) -> Result<(), Box<dyn std::error::Error>> {
    let mut sess = SessionMessages::new(/* your cfg */ todo!());
    for seed in ["You are AJ.", "User prefers concise answers."] {
        brain.add_memory(Memory::new(Role::User, seed.into()), &mut sess);
    }
    let preamble = brain.build_preamble()?; // now ready
    assert!(!preamble.is_empty());
    Ok(())
}
```
🪵 Logging & Debugging
- Enable tracing to see:
  - brain token-enforcement logs
  - the serialized brain JSON
  - streaming events and request metadata (debug)
- If the model prints nothing in streaming mode, confirm your terminal supports ANSI and that stdout isn't redirected without a TTY.
- If deserialization fails, verify:
  - `vector_store.yaml` exists and points to a matching `<uuid>_hnsw_index.bin` in `config_dir()`.
  - `all-mini-lm-l12-v2` is present (e.g., after `aj ask "Hello world!"`).
🔐 Privacy
Everything runs local by default:
- Embeddings and HNSW files live under your platform config dir (config_dir()).
- Session DB is local.
- Only your configured model endpoint receives requests.
✅ Quick Checklist
- Place MiniLM at `config_dir()/all-mini-lm-l12-v2` (or run your installer).
- Use `VectorStore::new(384, session_name)`; after inserts, call `build()`.
- Enable sessions with `session_name: Some(...)` for ejection/persistence.
- Provide `Some(&mut store)`, `Some(&mut brain)` to `api::ask(...)` for semantic recall.
- Tune `context_max_tokens`, `assistant_minimum_context_tokens`, and `Brain::max_tokens`.
- (Optional) Set a JSON schema on `template.response_format` for structured replies.
Privacy note: Everything is local by default. Keep secrets… consensual. 🤫
Downstream Projects 🌱
Awful Jade (aj) isn’t just a one-off CLI toy—it’s a foundation. Because it’s built in Rust 🦀 with a clean separation of concerns (config, templates, sessions, memories, etc.), people can (and should!) build on top of it.
This page is a living showcase of projects that extend, remix, or depend on Awful Jade. Think of it as a family tree 🌳—where every branch adds a new capability, a new perspective, or just another annoying edge case languishing in the issues tab.
✨ Why Downstream?
- Ecosystem Growth 🌍: `aj` handles the hard parts (embedding, sessions, vector search), so you can focus on the fun bits (playing Bookclub Rummy in your terminal with Aristotle).
- Composable by Design 🧩: Use `aj` as a CLI, a library, or a background service.
- Community-Driven 💬: The best projects come from folks solving their own problems—and sharing back.
🚧 Examples of What You Could Build
(Whether or not they already exist, these are the kinds of things `aj` invites you to adapt to your own needs!)
- Study Buddy 📚: A custom frontend where `aj` helps generate flashcards, quizzes, or summaries from your study materials.
- Terminal Therapist 🛋️: Hook `aj`'s interactive mode into your daily journal or notes app. Let it respond, remember, and gently roast you.
- Knowledgebase Copilot 🗄️: Wire `aj` into your company's docs (Markdown, Confluence, Notion) and let it provide fast, context-aware Q&A.
- Creative Writing Sidekick ✍🏽: Use templates to inspire short stories, scripts, or dialogue. `aj` provides plot twists on demand.
- Rust Library Consumers 🔧: Import `aj` as a Rust crate (`awful_aj`) and embed it into your own CLI, TUI, or service.
📜 Currently Known Projects
- Awful Security News
- Bookclub Rummy
- Awful Book Sanitizer
- Awful Knowledge Synthesizer
- Awful Dataset Builder
- Your Project Here ✨ – Submit a PR to add it!
🤝 Add Yours!
Do you have a downstream project using AJ? Big or small, silly or serious, we’d love to see it here.
👉 Open a PR to this repo and add your project under Currently Known Projects.
Let’s grow this ecosystem together. 🌟
💡 Remember: Awful Jade may have an awful name, but it gives good brain. What will you build with it? 🧠
Moldable Outputs 🧪🧱
Make aj’s output feel at home in your terminal: render Markdown with code blocks and syntax highlighting, keep a recent transcript, and customize behavior per shell.
🙀 This page includes ready-to-use helpers for Nushell, bash, and zsh.
🐚 Nushell (author’s version)
Drop this in `($nu.data-dir)/scripts/aj.nu` and source it (or add a `use` in your config).
export def a [text: string] {
$env.RUST_LOG = ""
rm -f ~/.cache/aj.history
let question = $text
aj ask -t default -s default $text | tee { save ~/.cache/aj.history }
clear
print $"($question)\n\n"
mdcat ~/.cache/aj.history
}
export def ai [session: string] {
aj i -t refactor_rust -s $session
}
Usage:
a "explain lifetimes with a tiny code example"
ai refactor-session-1
💡 Tip: `mdcat` renders Markdown beautifully with fenced code blocks. If you don't have it, install it via your package manager (e.g., Homebrew: `brew install mdcat`). Alternatives: `glow`, `bat -l markdown`.
🧼 bash
Add this to `~/.bashrc` (then `source ~/.bashrc`). It mirrors the Nushell flow:
- clears any old transcript
- asks with a chosen template/session
- saves raw output to
~/.cache/aj.history
- clears the screen
- prints the prompt you asked
- pretty-renders the Markdown transcript
# aj: ask and pretty-print Markdown response
ajmd() {
export RUST_LOG=""
mkdir -p "$HOME/.cache"
rm -f "$HOME/.cache/aj.history"
local question
question="$*"
if [ -z "$question" ]; then
echo "usage: ajmd <question...>" >&2
return 2
fi
# Run once, save transcript
aj ask -t default -s default "$question" | tee "$HOME/.cache/aj.history" >/dev/null
# Present nicely
clear
printf "%s\n\n" "$question"
if command -v mdcat >/dev/null 2>&1; then
mdcat "$HOME/.cache/aj.history"
elif command -v glow >/dev/null 2>&1; then
glow -p "$HOME/.cache/aj.history"
elif command -v bat >/dev/null 2>&1; then
bat --paging=never -l markdown "$HOME/.cache/aj.history"
else
# Fallback without highlighting
cat "$HOME/.cache/aj.history"
fi
}
# aj interactive with a handy default refactor template
ajrepl() {
local session="$1"
if [ -z "$session" ]; then
echo "usage: ajrepl <session-name>" >&2
return 2
fi
aj i -t refactor_rust -s "$session"
}
Examples:
ajmd "write a tiny Rust iterator adapter and test"
ajrepl refactor-session-2
🌀 zsh
Add this to `~/.zshrc` (then `source ~/.zshrc`). Same behavior as bash.
# aj: ask and pretty-print Markdown response
function ajmd() {
export RUST_LOG=""
mkdir -p "$HOME/.cache"
rm -f "$HOME/.cache/aj.history"
local question
question="$*"
if [[ -z "$question" ]]; then
print -u2 "usage: ajmd <question...>"
return 2
fi
aj ask -t default -s default "$question" | tee "$HOME/.cache/aj.history" >/dev/null
clear
printf "%s\n\n" "$question"
if (( $+commands[mdcat] )); then
mdcat "$HOME/.cache/aj.history"
elif (( $+commands[glow] )); then
glow -p "$HOME/.cache/aj.history"
elif (( $+commands[bat] )); then
bat --paging=never -l markdown "$HOME/.cache/aj.history"
else
cat "$HOME/.cache/aj.history"
fi
}
# aj interactive with a handy default refactor template
function ajrepl() {
local session="$1"
if [[ -z "$session" ]]; then
print -u2 "usage: ajrepl <session-name>"
return 2
fi
aj i -t refactor_rust -s "$session"
}
Examples:
ajmd "explain pinning in Rust with a minimal example"
ajrepl rust-notes-1
💡 Notes & Tips
- Template & Session: Change `-t default -s default` to any template/session you prefer (e.g., `-t reading_buddy -s gita-study`).
- History location: Adjust `~/.cache/aj.history` if you want per-session files (e.g., `~/.cache/aj.$(date +%s).md`) or per-template logs.
- Renderers:
  - `mdcat` → rich Markdown (links, tables, code fences)
  - `glow -p` → pager mode
  - `bat -l markdown` → quick highlighting (no Markdown rendering)
- Paging: To page long outputs, pipe to `less -R`: `mdcat ~/.cache/aj.history | less -R`
- Noise-free: `RUST_LOG=""` silences Rust log output so your Markdown stays clean.
🎛️✨ Have fun molding outputs to your terminal flow!