OLYMPUS: Building an Autonomous Career Intelligence System with 4 AI Agents

The Problem

A junior from my university reached out one evening. He was graduating in a few months, had been applying to new grad roles for weeks, and was drowning, not in rejections but in the sheer mechanical overhead. "Senior, I spend 3 hours a day just filling out the same form on different portals. Name, email, phone, LinkedIn, upload resume, upload cover letter. By the time I get to interview prep I am too exhausted to think."

I helped him for a couple of days: rewrote his resume, sent him a list of companies. Standard stuff. But his observation stuck with me. I started doing the math: if you apply to 5 companies a day, and each application takes 30-40 minutes of mechanical work (find the posting, read the JD, tailor the resume, fill the form, write a cover letter), that is roughly 3 hours a day, or about 20 hours a week if you apply every day, of labor that requires almost zero judgment. It is expensive in time but cheap in thought.

The question was not whether AI could handle individual pieces. Claude can write a resume. Claude can fill a form. The real engineering question was: could I build a system of AI agents that coordinate the entire pipeline — job discovery, resume tailoring, application submission, interview preparation — end to end, with a human only stepping in to approve?

That question became OLYMPUS.

System Overview

OLYMPUS is a multi-agent AI system built on OpenClaw that automates the full career pipeline. Four independent agents — each running as a TypeScript plugin with its own codebase, Discord bot, and Obsidian-backed database partition — coordinate via Discord to handle the workflow from job discovery through interview preparation.

Absolute (16 MCP tools) — the orchestrator. Implements a structured 7-phase protocol: Plan, Checkpoint, Consult, Delegate, Monitor, Review, Synthesize. It decomposes complex requests into tasks, assigns them to specialist agents, quality-scores the results (1-5, configurable threshold), and retries if work falls below standard. All inter-agent coordination flows through Absolute.
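
The Review step's quality gate can be pictured as a score-and-retry loop. The sketch below is an illustration under assumptions, not Absolute's actual code: the function name, the Worker and Scorer types, the default threshold of 4, and the retry limit of 3 are all hypothetical. Only the 1-5 scale and the idea of a configurable threshold come from the description above.

```typescript
// Hypothetical sketch of a score-and-retry quality gate.
// The 1-5 scale is from the text; every name and default here is an assumption.
type Worker = (task: string, attempt: number) => string;
type Scorer = (output: string) => number; // expected to return 1..5

interface GateResult {
  output: string;
  score: number;
  attempts: number;
  passed: boolean;
}

function delegateWithQualityGate(
  task: string,
  worker: Worker,
  scorer: Scorer,
  threshold = 4, // assumed default; the text only says "configurable"
  maxAttempts = 3, // assumed retry limit
): GateResult {
  let bestOutput = "";
  let bestScore = 0;
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts += 1;
    const output = worker(task, attempts);
    // Clamp to the 1-5 scale so an out-of-range score cannot bypass the gate.
    const score = Math.min(5, Math.max(1, Math.round(scorer(output))));
    if (score > bestScore) {
      bestOutput = output;
      bestScore = score;
    }
    if (bestScore >= threshold) break; // good enough: stop retrying
  }
  return {
    output: bestOutput,
    score: bestScore,
    attempts,
    passed: bestScore >= threshold,
  };
}
```

Keeping the best attempt rather than the last one means a failed run still returns the most usable draft, which the orchestrator can surface to the human instead of discarding.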

Athena (20 MCP tools) — the career strategist. Manages a structured achievement bank, project lifecycles (explore, build, harvest), multi-version resume intake (supports PDF, Markdown, plaintext), JD-targeted resume tailoring with ATS optimization (targets 80%+ keyword match via regex extraction + Claude semantic analysis), and cover letter generation. Maintains a codified resume knowledge base — impact verbs to use, words to avoid, formatting rules.

Hermes (14 MCP tools) — the interview coach. Conducts multi-round mock interviews (experience screen, technical, behavioral, culture fit, hiring manager), with each round containing 4-6 questions. Evaluates responses across 7 dimensions: content relevance, STAR structure, communication clarity, specificity/metrics, depth, confidence indicators, and growth mindset. Generates targeted drills for weak dimensions. Supports voice transcription with leniency for filler words.

Artemis (24 MCP tools) — the job hunter. The most complex agent. Runs daily automated scans of 18 companies via direct API calls (Greenhouse, Lever, Ashby, Amazon, Uber, AMD) and Playwright browser automation (Google, Apple, Microsoft, IBM). Filters by location, title, and seniority. Scores jobs on four weighted dimensions: skills match (40%), level match (25%), domain relevance (20%), experience years (15%). Handles application form filling across 5 ATS platforms (Greenhouse, Lever, Workday, Ashby, generic). Monitors email inbox via IMAP for application responses.

74 MCP tools across 4 agents. 163 tests. One human touchpoint: approval before submission.

Architecture Decisions

Discord as Message Bus

When Artemis needs a tailored resume, she does not call an internal API or write to a shared queue. She mentions Athena in Discord: <@ATHENA_BOT_ID> I need a tailored resume for this JD. Athena picks it up, generates the resume, and replies in the same channel.

This is a deliberate architectural choice with three properties:

Observable — every agent interaction is a Discord message. You can scroll back and read the conversation between Artemis and Athena about why a resume emphasized backend over frontend. The system's logs are a Discord server.

Asynchronous — agents do not block on each other. Artemis posts a request and continues scanning. Athena picks it up when ready. This mirrors how human teams actually coordinate.

Extensible — adding a fifth agent means adding a bot to the Discord server. No API contracts to negotiate. No shared state to coordinate. No deployment dependencies.
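
To make the mention-based handoff concrete, here is a minimal sketch of how a message could be routed to the agents it mentions. It relies only on Discord's documented mention syntax, <@ID> (with <@!ID> as an older nickname form). The AGENT_IDS table, the placeholder IDs, and the routeMentions function are hypothetical, and a real bot would receive the message through a Discord client library rather than as a bare string.

```typescript
// Hypothetical sketch: route a Discord message to the agents it mentions.
// The IDs below are placeholders, not real bot IDs.
const AGENT_IDS: Record<string, string> = {
  "111111111111111111": "athena",
  "222222222222222222": "hermes",
};

interface AgentRequest {
  agent: string;
  body: string;
}

function routeMentions(content: string): AgentRequest[] {
  // Strip every mention to get the plain request text.
  const body = content.replace(/<@!?\d+>/g, "").trim();
  const requests: AgentRequest[] = [];
  const seen = new Set<string>();
  const mentionPattern = /<@!?(\d+)>/g;
  let match: RegExpExecArray | null = mentionPattern.exec(content);
  while (match !== null) {
    const agent = AGENT_IDS[match[1]];
    // Ignore mentions of unknown users and duplicate mentions of one agent.
    if (agent !== undefined && !seen.has(agent)) {
      seen.add(agent);
      requests.push({ agent, body });
    }
    match = mentionPattern.exec(content);
  }
  return requests;
}
```

Because the request is just message text, the same string a human reads in the channel is the payload the receiving agent acts on.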

Obsidian as Database

Every piece of data in OLYMPUS — job postings, applications, resumes, interview scores, project decisions, achievement banks — lives as a Markdown file with YAML frontmatter in an Obsidian vault.

The rationale: data should be readable by both machines and humans without a translation layer. Every "record" is a file you can open, inspect, and edit in Obsidian. Interview scores are browsable. Job postings are searchable. The database is human-readable by default.

The storage layer is not trivial. A shared obsidian-adapter library handles atomic writes (tmp file + rename to prevent corruption), YAML parsing via gray-matter, and an in-memory index with a 5-second TTL for fast lookups. Each agent operates on its own vault partition — no cross-agent data access except through Discord messages.
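
Two of those storage techniques are easy to show in isolation. The sketch below is not the obsidian-adapter source. It assumes Node.js, leaves out gray-matter parsing entirely, and uses hypothetical names (atomicWrite, TtlIndex). It shows the tmp-file-plus-rename pattern and a small cache whose entries expire after 5 seconds.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write to a temp file next to the target, then rename over it.
// On POSIX filesystems rename() replaces the target atomically, so a reader
// sees either the old file or the new one, never a half-written file.
function atomicWrite(path: string, content: string): void {
  const tmpPath = `${path}.${process.pid}.tmp`;
  writeFileSync(tmpPath, content, "utf8");
  renameSync(tmpPath, path);
}

// In-memory index whose entries expire after ttlMs (5 seconds by default).
class TtlIndex<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs = 5000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, handy for tests
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: the caller should re-read from disk
      return undefined;
    }
    return entry.value;
  }
}

// Usage: write a note into a throwaway directory and read it back.
const vaultDir = mkdtempSync(join(tmpdir(), "vault-"));
const notePath = join(vaultDir, "job-123.md");
atomicWrite(notePath, "---\nstatus: new\n---\nBackend Engineer\n");
const roundTrip = readFileSync(notePath, "utf8");
```

A short TTL is a reasonable fit for a vault that a human may edit by hand in Obsidian at any moment: stale entries age out within seconds without the adapter having to watch the filesystem.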

Agent Isolation

Each agent is a fully independent TypeScript plugin with its own:

SOUL.md — personality and behavioral guidelines
AGENTS.md — knowledge of other agents and when to involve them
IDENTITY.md — role definition
USER.md — user profile and preferences
Database partition in the Obsidian vault
Discord bot with its own token

No shared state. No shared code beyond the obsidian-adapter library. An agent can be updated, restarted, or replaced without affecting the others.

The Pipeline

Here is a concrete walkthrough of the system handling a full cycle:

9:00 AM — macOS launchd triggers Artemis's daily scan.

For companies with public JSON APIs, she makes direct HTTP requests. Greenhouse: boards-api.greenhouse.io/v1/boards/{token}/jobs. Lever: api.lever.co/v0/postings/{slug}. Amazon: amazon.jobs/en/search.json. Fast, structured, reliable.
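
As a rough sketch of the direct-API path, the helper below builds the three endpoints named above and fetches one of them. The https scheme, the JobSource type, and the function names are assumptions. Response shapes differ per provider and are deliberately left as unknown, and any query parameters the real scanner sends are not shown.

```typescript
// Hypothetical sketch: endpoint construction for the public job APIs named above.
type JobSource =
  | { kind: "greenhouse"; boardToken: string }
  | { kind: "lever"; slug: string }
  | { kind: "amazon" };

function jobsUrl(source: JobSource): string {
  switch (source.kind) {
    case "greenhouse":
      return `https://boards-api.greenhouse.io/v1/boards/${encodeURIComponent(source.boardToken)}/jobs`;
    case "lever":
      return `https://api.lever.co/v0/postings/${encodeURIComponent(source.slug)}`;
    case "amazon":
      return "https://amazon.jobs/en/search.json";
  }
}

// Fetch the raw JSON. Parsing into a common posting shape is left to the caller,
// because each provider returns a different structure.
async function fetchJobs(source: JobSource): Promise<unknown> {
  const response = await fetch(jobsUrl(source), {
    headers: { Accept: "application/json" },
  });
  if (!response.ok) {
    throw new Error(`Job API request failed: ${response.status} for ${source.kind}`);
  }
  return response.json();
}
```

For example, jobsUrl({ kind: "greenhouse", boardToken: "acme" }) yields the Greenhouse board URL for a made-up company token "acme".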

For companies behind JavaScript-rendered career pages, she launches Playwright in headless mode and scrapes. She diffs every posting against her known inventory using SHA-256 hashes of normalized job text — detecting new, changed, and removed postings.
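
The hash-based diff can be sketched in a few lines. The normalization rules here (lowercasing and collapsing whitespace) are an assumption, as are the Posting shape and the function names. The text above only says that job text is normalized before hashing with SHA-256.

```typescript
import { createHash } from "node:crypto";

interface Posting {
  id: string;
  text: string;
}

// Assumed normalization: lowercase and collapse runs of whitespace, so purely
// cosmetic edits do not register as changes.
function fingerprint(text: string): string {
  const normalized = text.toLowerCase().replace(/\s+/g, " ").trim();
  return createHash("sha256").update(normalized, "utf8").digest("hex");
}

interface InventoryDiff {
  added: string[];
  changed: string[];
  removed: string[];
}

// `known` maps posting id to the fingerprint recorded on the previous scan.
function diffPostings(known: Map<string, string>, scanned: Posting[]): InventoryDiff {
  const added: string[] = [];
  const changed: string[] = [];
  const seen = new Set<string>();
  for (const posting of scanned) {
    seen.add(posting.id);
    const previous = known.get(posting.id);
    if (previous === undefined) {
      added.push(posting.id);
    } else if (previous !== fingerprint(posting.text)) {
      changed.push(posting.id);
    }
  }
  // Anything in the old inventory that the scan did not return is gone.
  const removed = Array.from(known.keys()).filter((id) => !seen.has(id));
  return { added, changed, removed };
}
```

Storing only the hash keeps the comparison cheap: the previous scan's full text never has to be loaded to decide whether a posting moved.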

She filters: Ontario/Canada locations only. Engineering titles only (regex match for engineer, developer, swe, sde). Negative filter for manager, director, designer, analyst. Maximum seniority: senior.
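
Those filters translate almost directly into regular expressions. In this sketch the include and exclude title terms come from the list above, while the location pattern and the set of above-senior titles (staff, principal, and so on) are assumptions about how "Ontario/Canada" and "maximum seniority: senior" might be encoded.

```typescript
interface JobListing {
  title: string;
  location: string;
}

const INCLUDE_TITLE = /\b(engineer|developer|swe|sde)\b/i;
const EXCLUDE_TITLE = /\b(manager|director|designer|analyst)\b/i;
// Assumed list of levels above "senior"; the real list is not given in the text.
const ABOVE_SENIOR = /\b(staff|principal|distinguished|fellow)\b/i;
// Assumed location match; a real filter likely also handles abbreviations.
const ALLOWED_LOCATION = /\b(ontario|canada)\b/i;

function passesFilters(job: JobListing): boolean {
  return (
    ALLOWED_LOCATION.test(job.location) &&
    INCLUDE_TITLE.test(job.title) &&
    !EXCLUDE_TITLE.test(job.title) &&
    !ABOVE_SENIOR.test(job.title)
  );
}
```

With these patterns, "Software Engineer II" in "Toronto, Ontario, Canada" passes, while "Engineering Manager" fails on the negative filter and "Principal Software Engineer" fails on seniority.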

She scores each surviving posting on four weighted dimensions against the user profile. Strong match: 80+. Skip: below 40.
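
The weighted score is simple arithmetic once each dimension has a sub-score. The sketch below assumes each dimension is rated 0-100 and uses the weights stated earlier (skills 40%, level 25%, domain 20%, experience 15%). How the sub-scores themselves are computed is not covered, and the middle bucket name "consider" is hypothetical.

```typescript
// Weights in percent, as stated in the agent description.
const WEIGHTS = { skills: 40, level: 25, domain: 20, experience: 15 } as const;
type ScoreDimension = keyof typeof WEIGHTS;
type DimensionScores = Record<ScoreDimension, number>; // each assumed to be 0..100

function scoreJob(scores: DimensionScores): number {
  let weightedSum = 0;
  for (const dimension of Object.keys(WEIGHTS) as ScoreDimension[]) {
    const clamped = Math.min(100, Math.max(0, scores[dimension]));
    weightedSum += WEIGHTS[dimension] * clamped;
  }
  return Math.round(weightedSum / 100); // weights sum to 100, so the result is 0..100
}

// Thresholds from the text: 80+ is a strong match, below 40 is skipped.
function categorize(score: number): "strong" | "consider" | "skip" {
  if (score >= 80) return "strong";
  if (score < 40) return "skip";
  return "consider";
}
```

As a worked example, sub-scores of 90 (skills), 80 (level), 70 (domain), and 60 (experience) give (40*90 + 25*80 + 20*70 + 15*60) / 100 = 79, which lands just under the strong-match line.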

9:15 AM — Artemis posts a daily report to #daily-job-report:

Screenshot: Artemis Daily Job Report. This is a real report from 9:02 AM: 12 companies scanned, 4 new postings found, scored and categorized by match strength.

User approves a job. Artemis mentions Athena in Discord with the JD.

Athena pulls from the achievement bank — a structured collection of quantifiable wins maintained across all projects. She matches achievements to JD keywords, runs an ATS keyword check (regex extraction across 8 categories: languages, frameworks, cloud, databases, tools, data/ML, concepts, soft skills), targets 80%+ match rate, and generates a tailored one-page resume. She also generates a cover letter integrating soft skills from her knowledge base.
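
The regex half of the ATS check might look like the following. This is a sketch under assumptions: the vocabulary is a tiny made-up sample covering four of the eight categories, the function names are hypothetical, and the Claude semantic pass is not represented at all. It extracts known keywords from the JD and reports what fraction of them the resume also contains.

```typescript
// Tiny illustrative vocabulary; the real keyword lists are not given in the text.
const VOCAB: Record<string, string[]> = {
  languages: ["typescript", "python", "java"],
  frameworks: ["react", "node.js"],
  cloud: ["aws", "kubernetes"],
  databases: ["postgresql", "redis"],
};

function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function extractKeywords(text: string): Set<string> {
  const found = new Set<string>();
  for (const terms of Object.values(VOCAB)) {
    for (const term of terms) {
      // Lookarounds instead of \b so "java" does not match inside "javascript".
      const pattern = new RegExp(`(?<!\\w)${escapeRegExp(term)}(?!\\w)`, "i");
      if (pattern.test(text)) found.add(term);
    }
  }
  return found;
}

interface AtsCheck {
  rate: number; // fraction of JD keywords present in the resume
  missing: string[];
  meetsTarget: boolean;
}

function atsCheck(jd: string, resume: string, target = 0.8): AtsCheck {
  const wanted = Array.from(extractKeywords(jd));
  const have = extractKeywords(resume);
  const missing = wanted.filter((term) => !have.has(term));
  // Assumption: a JD with no recognized keywords counts as fully matched.
  const rate = wanted.length === 0 ? 1 : (wanted.length - missing.length) / wanted.length;
  return { rate, missing, meetsTarget: rate >= target };
}
```

The missing list is the useful output in practice: it tells the tailoring step which achievements to look for in the bank before regenerating the resume.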

Artemis launches Playwright in headed mode. She detects the ATS platform (Greenhouse, Lever, Workday, Ashby, or generic) and loads the corresponding form handler — each with hardcoded CSS selectors for that platform's fields. She fills everything: name, email, phone, LinkedIn, resume upload, cover letter. She takes a screenshot and sends it to Discord.
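
How the platform is detected is not spelled out above. One plausible approach, shown here purely as an assumption, is to look at the hostname of the application URL, since hosted ATS pages commonly live on greenhouse.io, lever.co, myworkdayjobs.com, and ashbyhq.com domains. The function and type names are hypothetical, and company-branded career pages would need additional signals.

```typescript
type AtsPlatform = "greenhouse" | "lever" | "workday" | "ashby" | "generic";

// Hostname suffix rules; an assumption about detection, not the actual logic.
const HOST_RULES: Array<[RegExp, AtsPlatform]> = [
  [/(^|\.)greenhouse\.io$/, "greenhouse"],
  [/(^|\.)lever\.co$/, "lever"],
  [/(^|\.)myworkdayjobs\.com$/, "workday"],
  [/(^|\.)ashbyhq\.com$/, "ashby"],
];

function detectAts(applyUrl: string): AtsPlatform {
  let host: string;
  try {
    host = new URL(applyUrl).hostname.toLowerCase();
  } catch {
    return "generic"; // not a parseable URL: fall back to the generic handler
  }
  for (const [pattern, platform] of HOST_RULES) {
    if (pattern.test(host)) return platform;
  }
  return "generic";
}
```

The returned platform name would then select the matching form handler, with "generic" covering everything the rules do not recognize.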

User reviews the screenshot and approves. Artemis submits. She records the timestamp, sets the application status to submitted, and begins monitoring the email inbox via IMAP (imapflow library) with provider presets for Gmail, Outlook, Yahoo, and iCloud.

An interview invitation arrives. Artemis's email classifier detects the pattern via regex matching. She DMs the user immediately and posts to the channel.
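
A regex-based classifier of this kind can be sketched as an ordered rule list. The specific patterns and the extra categories (rejection, confirmation) are assumptions for illustration; only the regex-matching approach and the interview category come from the text.

```typescript
type EmailCategory = "interview" | "rejection" | "confirmation" | "other";

// Rules are checked in order. Rejection comes first because rejection emails
// often also contain words like "interview" or "thank you for applying".
const EMAIL_RULES: Array<{ category: EmailCategory; pattern: RegExp }> = [
  {
    category: "rejection",
    pattern: /\b(unfortunately|not (be )?moving forward|other candidates|regret to inform)\b/i,
  },
  {
    category: "interview",
    pattern: /\b(interview|phone screen|schedule (a|your) (call|chat)|your availability)\b/i,
  },
  {
    category: "confirmation",
    pattern: /\b(application (has been |was )?received|thank you for applying|thanks for applying)\b/i,
  },
];

function classifyEmail(subject: string, body: string): EmailCategory {
  const text = `${subject}\n${body}`;
  for (const rule of EMAIL_RULES) {
    if (rule.pattern.test(text)) return rule.category;
  }
  return "other";
}
```

Rule order is the main design decision here: a first-match-wins list stays easy to audit, at the cost of needing care whenever a new pattern overlaps an existing one.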

User requests interview prep. Absolute creates an orchestration plan and delegates to Hermes.

Hermes ingests the JD, designs a multi-round mock interview (3-5 rounds with 4-6 questions each), and conducts it. After each round, he scores across 7 dimensions with evidence-backed ratings. He identifies weak dimensions (scoring 3 or below) and generates targeted drills. The Conductor prompt enforces interviewing rigor: calibrated probing, time-boxing, no leading questions, silence tolerance, seniority-aware difficulty.
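
Picking out the weak dimensions is a small filtering step. The sketch below assumes per-round scores on a 1-5 scale (the scale is an inference from the "3 or below" cutoff), uses hypothetical identifier names for the seven dimensions, and returns the weakest first so drills can be prioritized.

```typescript
// The seven dimensions from the text, as assumed identifier names.
const INTERVIEW_DIMENSIONS = [
  "contentRelevance",
  "starStructure",
  "communicationClarity",
  "specificityMetrics",
  "depth",
  "confidenceIndicators",
  "growthMindset",
] as const;

type InterviewDimension = (typeof INTERVIEW_DIMENSIONS)[number];
type RoundScores = Record<InterviewDimension, number>; // assumed 1..5 per dimension

// Dimensions scoring at or below the cutoff, weakest first.
function weakDimensions(scores: RoundScores, cutoff = 3): InterviewDimension[] {
  return INTERVIEW_DIMENSIONS
    .filter((dimension) => scores[dimension] <= cutoff)
    .sort((a, b) => scores[a] - scores[b]);
}
```

For a round scored 4, 2, 5, 3, 4, 3, 5 in the order listed, the result is starStructure, specificityMetrics, confidenceIndicators, and those three would each get a targeted drill.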

Absolute synthesizes cross-agent insights: Hermes's scoring data combined with Athena's resume analysis, surfacing connections that no single agent would identify alone.

Security

Artemis handles sensitive data: email credentials and the personal information used for form filling. Credentials are encrypted with AES-256-GCM. The encryption key is derived from the first available of three sources, in priority order: (1) the ARTEMIS_ENCRYPTION_KEY environment variable, (2) the macOS machine UUID read via ioreg, (3) a hostname+username hash as the final fallback. Stored format: base64-encoded IV (12 bytes) + auth tag (16 bytes) + ciphertext.
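
The scheme can be sketched with Node's built-in crypto module. Several details below are assumptions rather than facts about Artemis: the function names, the use of a SHA-256 digest to turn key material into a 32-byte key (the text does not name the derivation function), reading the stored format as one base64 string over the concatenated bytes, and passing the machine UUID in as a parameter instead of calling ioreg.

```typescript
import { createCipheriv, createDecipheriv, createHash, randomBytes } from "node:crypto";
import { hostname, userInfo } from "node:os";

// Pick key material from the first available tier.
function resolveKeyMaterial(
  env: Record<string, string | undefined> = process.env,
  machineUuid?: string, // tier 2 would come from ioreg on macOS; passed in here
): string {
  if (env.ARTEMIS_ENCRYPTION_KEY) return env.ARTEMIS_ENCRYPTION_KEY;
  if (machineUuid) return machineUuid;
  return `${hostname()}:${userInfo().username}`;
}

// Assumption: SHA-256 of the material yields the 32-byte AES-256 key.
function deriveKey(material: string): Buffer {
  return createHash("sha256").update(material, "utf8").digest();
}

// Output layout: base64( IV[12] | authTag[16] | ciphertext ).
function encryptSecret(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh random IV for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const authTag = cipher.getAuthTag(); // 16 bytes by default
  return Buffer.concat([iv, authTag, ciphertext]).toString("base64");
}

function decryptSecret(stored: string, key: Buffer): string {
  const raw = Buffer.from(stored, "base64");
  const iv = raw.subarray(0, 12);
  const authTag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(authTag);
  // final() throws if the key is wrong or the data was tampered with.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Because GCM is authenticated, decrypting with the wrong key fails loudly instead of returning garbage, which is the behavior you want when the key source silently changes between tiers.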

The human-in-the-loop approval gate is enforced at the architecture level. Artemis fills the form and screenshots it but never submits without explicit user confirmation in Discord.
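
One way to enforce such a gate structurally, rather than by convention, is to make submission impossible to call successfully until an approval has been recorded. The class below is a hypothetical sketch of that idea. The name ApplicationGate, its states, and the injected submit function are all assumptions, and the real system receives its approval as a Discord message.

```typescript
type GateStatus = "awaiting_approval" | "approved" | "submitted";

class ApplicationGate {
  private status: GateStatus = "awaiting_approval";
  private readonly submitFn: () => void;

  constructor(submitFn: () => void) {
    this.submitFn = submitFn; // the only path to the real submit action
  }

  // Called when the user explicitly confirms after reviewing the screenshot.
  approve(): void {
    if (this.status !== "awaiting_approval") {
      throw new Error(`Cannot approve from status "${this.status}"`);
    }
    this.status = "approved";
  }

  submit(): void {
    if (this.status !== "approved") {
      throw new Error(`Refusing to submit from status "${this.status}"`);
    }
    this.submitFn();
    this.status = "submitted"; // a second submit() will now be refused
  }

  getStatus(): GateStatus {
    return this.status;
  }
}
```

Since the submit action is reachable only through the gate, a bug elsewhere in the form-filling code cannot send an application the user never saw.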

Reflections on AI-Native Systems

Building OLYMPUS taught me something I did not expect: the hard part of multi-agent systems is not the AI. It is the architecture.

Getting Claude to write a good resume is straightforward. Getting four instances of Claude, running in four separate processes with four separate databases, to coordinate on a shared goal without producing conflicting or redundant work — that is a systems design problem. How do agents share context without shared state? How do you make the system observable when the "code" is a conversation between bots? How do you maintain quality when the output is non-deterministic?

The answer, I believe, is to stop thinking in terms of "AI applications" and start thinking in terms of AI-native systems. An AI-native system is not an app with an AI feature bolted on. It is a system designed from the ground up around the assumption that AI agents are first-class participants — with their own state, their own tools, their own communication channels, and their own failure modes.

In OLYMPUS:

Discord is not a UI choice. It is a message bus that makes agent communication observable, asynchronous, and extensible.
Obsidian is not a quirky storage choice. It is a design principle: data should be readable by both machines and humans without a translation layer.
The 7-phase orchestration protocol is not over-engineering. It is the minimum structure needed to keep four autonomous agents producing reliable output. Without quality gates, the system degrades. Without checkpoints, you lose control. Without synthesis, cross-agent insights never surface.

On AI Automation

I want to be precise about what OLYMPUS does and does not do, because the discourse around AI automation tends toward extremes.

OLYMPUS automates the mechanical layer of the career pipeline — the scrolling, copying, form-filling, keyword-matching, scheduling. These are tasks that require time but not judgment. They are expensive in hours but cheap in thought.

What it does not automate is decision-making. It cannot determine whether a role's "fast-paced environment" means exciting innovation or chronic understaffing. It cannot read interpersonal dynamics in an interview. It cannot weigh compensation against team quality against career trajectory.

The single human approval step is not merely a safety mechanism. It is the design philosophy. The system is built so that a human spends time exclusively on the decisions that require judgment, and zero time on the mechanical labor that surrounds those decisions.

After building this system, applications per day increased significantly while each individual application improved in quality — because the resume is actually tailored to each JD, the ATS keywords are actually matched, and the human has energy left for the parts that matter.