What working at Tenor is like

Tenor builds configurable AI employees: named owners of real work with bounded roles, tools, memory, permissions, and company context.

You make AI employees do real work, reliably.

The agent system is the product. Your job is to strengthen the harness: execution, tools, memory, context, orchestration, evaluation, reliability, and the loop where the system checks its own work.

If you have never built or seriously modified an agent system, this will feel unfamiliar. If AI-assisted development is already your default, the pace will make sense.

A typical week is systems infrastructure, not feature work.

You will spend most of your time building the shared layer: isolated execution, reproducible environments, reliable state, semantic memory, retrieval, tool orchestration, evaluation harnesses, and permission boundaries.

The day-to-day is not traditional software engineering. Hand-writing every line is too slow for the surface area here. Much of the work is directing coding agents, reviewing their output, and deciding what evidence is good enough to trust.

Some days are product architecture. Some days are live-system reliability. Some days are making employee environments faster to build, inspect, and upgrade without losing learned state. The common thread is continuity: the system has to show up tomorrow with the context it learned today.

The work is full-stack, but mostly at the systems layer.

This is not a UI over a model. Underneath is a distributed systems problem: scoped workers, isolated execution, durable state, retrieval-heavy context, tool orchestration, observability, recovery, and safe human oversight.

The architecture has to work across the control plane that schedules work, the data plane that runs isolated and persistent runtime environments, and the network ingress that manages live connections. Bursting from zero to very large sandbox counts without relying on warm pools is one kind of systems problem; keeping long-lived Docker-style runtimes fast, inspectable, and recoverable is another.

Much of the work is shared infrastructure: faster containers, deterministic syncs, automated quality checks, safer upgrades, and live failures turned into repeatable tests.

The hard question is rarely “can the model call this API?” It is whether the system can move from intent to action while preserving context, isolation, observability, and a clean human handoff.

Context is a core system, not a prompt trick.

A real AI employee needs structured understanding of people, projects, preferences, files, conversations, tools, and decisions. That means memory architectures, semantic indexes, knowledge graphs, retrieval pipelines, and evaluation loops that keep context useful without making it noisy or unsafe.

This work sits between research and infrastructure. You need enough taste to know what the model should see, enough systems judgment to make retrieval reliable, and enough product judgment to preserve trust when memory is incomplete.

We expect high agency and careful speed.

We are early, so the work changes shape quickly. You should be comfortable moving from a customer request to a live-system diagnosis to a production patch without needing every boundary prewritten.

We care about speed, but not theater. Good work survives real users, real permissions, production load, container rebuilds, unreliable dependencies, and the operational mess of live AI employees.

The backlog is mostly hardening the shared layer.

The recurring problems are systems problems: faster employee environments, reliable tool access, local learning separated from managed behavior, privacy-preserving review artifacts, and layered checks for memory, retrieval, and tool use.

Some work is pure infrastructure: shaving container build times, making QA automatic enough to trust, improving cold-start paths, and coordinating schedulers, sandboxes, persistent runtimes, ingress, and observability as one system.

We turn one-off operational pain into shared infrastructure: better provisioning, safer upgrades, stronger isolation, fault-tolerant intake, reproducible QA, and clear paths for one employee’s improvement to become available to many.

The bar is ownership, not ticket completion.

You should be able to own a surface end to end: understand how it works, improve it, measure whether it got better, and fix it when it breaks.

We value people who can move quickly without lowering the standard, and who use agents to compress implementation time without compressing judgment. The scarce skill is not typing speed; it is directing agents well and evaluating their output with taste, rigor, and evidence.

Good judgment matters more than cleverness.

There are many ways to make an agent do something impressive once. There are fewer ways to make it useful every day inside a team that depends on it.

  • You should notice when a workflow only works in a demo and not in a live customer account.
  • You should care about the exact moment an AI employee should ask for confirmation, retry, defer work, hand off, or stop.
  • You should understand how memory can help an agent reason, and how it can quietly make the system worse when retrieval is wrong.
  • You should be able to trace a failure from the user-visible symptom through state, routes, logs, generated files, environment setup, and dependency behavior.
  • You should care about infrastructure that compounds: faster builds, better fixtures, safer syncs, and quality checks that catch regressions before users do.
  • You should be able to explain a system to a founder, a user, and another engineer without changing the truth.

Who tends to thrive here.

People do well here when they are comfortable with ambiguity, allergic to fake progress, and interested in the messy boundary between software, infrastructure, language, memory, and organizations.

If you like owning problems end to end, already use AI systems in your own work, and want to define how intelligence joins a company as configurable employees, you will probably find the work energizing.