Walma

AI Cost Management. Get the answer to what your AI tools actually cost.

Real-time data on spend, forecasts per model, budgets per team. All in one place, synced from APIM automatically.

You pay three invoices in three different currencies to three different vendors, and the sales reps call you once a month to ask if you want to upgrade. AI Hub gathers all AI usage in one place. You see exactly where every euro goes, who is spending it, which model drives the cost, and how the month is projected to land. When a budget approaches its ceiling, warnings fire automatically. When someone changes tier, the new limits take effect on the next request.

Some clients & partners

SKB, Victoriahem, One More, Insera, Junglemap, Alice Labs, Public Partner, OMIF, AWS Partner, Microsoft

Three AI vendors, five subscriptions, zero overview.

This is what the typical AI spend picture looks like in enterprise IT today.

See how AI Hub pulls it together

Per-token pricing you cannot budget against

Models charge per input and output token, at different rates per model. A single user can burn the monthly budget in an afternoon without anyone noticing until the invoice arrives.
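To make the per-token arithmetic concrete, here is a minimal sketch. The model names and prices are hypothetical placeholders, not real vendor rates; the only point is that input and output tokens are billed at separate rates that differ per model.

```python
from decimal import Decimal

# Hypothetical prices in EUR per one million tokens. Real vendor rates differ.
PRICES_PER_MILLION = {
    "model-large": {"input": Decimal("3.00"), "output": Decimal("15.00")},
    "model-small": {"input": Decimal("0.25"), "output": Decimal("1.25")},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost of one request: input and output tokens are billed at separate rates."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / Decimal(1_000_000)

# The same workload priced on two models.
large = request_cost("model-large", 200_000, 50_000)
small = request_cost("model-small", 200_000, 50_000)
print(large, small)
```

With these placeholder prices the identical workload costs twelve times more on the larger model, which is why spend is hard to budget without per-request data.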

No team-level view

Anthropic and OpenAI report spend per organisation, not per team or individual. When the CFO wants to allocate cost to a cost center, the honest answer is 'we are guessing'.

Usage grows faster than visibility

Every week someone adds a new model or a new tool. The tracking sheet lags behind. Nobody knows the real total cost right now.

Spend MTD, monthly forecast and daily burn rate. In real time.

Straight from APIM into your dashboard.

AI Hub routes every AI call through the same gateway. Each request is logged with model, tokens, tool and developer. The dashboard calculates month-to-date (MTD) spend, remaining budget, a forecast based on the daily burn rate, and the deviation against budget pace. At the same time you see the model mix and where the money actually goes.
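As a rough illustration of those dashboard figures, the sketch below derives MTD spend, remaining budget, daily burn rate and pace deviation from a list of daily totals. The function name, the input shape and the flat (linear) budget pace are assumptions made for the example, not a description of the product's internals.

```python
import calendar
from datetime import date

def budget_status(daily_spend: list[float], monthly_budget: float, today: date) -> dict:
    """Summarise the month so far. daily_spend holds one total per elapsed day."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    days_elapsed = len(daily_spend)
    mtd = sum(daily_spend)
    # Assumption: the budget is meant to be consumed evenly across the month.
    budget_pace = monthly_budget * days_elapsed / days_in_month
    return {
        "mtd": mtd,
        "remaining": monthly_budget - mtd,
        "daily_burn_rate": mtd / days_elapsed,
        "pace_deviation": mtd - budget_pace,  # positive means over pace
    }

status = budget_status([120.0] * 10, monthly_budget=3000.0, today=date(2025, 6, 10))
print(status)
```

In this made-up case the team has spent 1200 after ten days of a thirty-day month, which is 200 ahead of an even pace.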

[Image: AI Cost Management dashboard with MTD, forecast and model breakdown]

Capabilities

Everything you need to run AI cost professionally

Not a report tool layered over CSV exports. Live governance where policy and cost stay tied together.

Real-time budgets

  • Monthly budgets per organisation, team and individual
  • Hard cap that suspends traffic on overrun
  • Warning levels (75%, 90%, 100%) with notification
  • Daily budget rate for comparison against the 30-day rolling average
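
The warning levels and the hard cap in the list above could be evaluated along these lines. This is a sketch under assumptions: the function names are hypothetical, and comparing against the previous reading is one possible way to make each level notify only once.

```python
WARNING_LEVELS = (75, 90, 100)  # percent of budget, as listed above

def newly_crossed(previous_spend: float, current_spend: float, budget: float) -> list[int]:
    """Return the warning levels crossed between two spend readings.

    Comparing against the previous reading means each level triggers once,
    not on every request after the threshold has been passed.
    """
    return [
        level
        for level in WARNING_LEVELS
        if previous_spend * 100 < level * budget <= current_spend * 100
    ]

def hard_cap_reached(spend: float, budget: float) -> bool:
    """True once spend has reached the budget, the point where traffic would be suspended."""
    return spend >= budget

print(newly_crossed(700, 920, 1000))
print(hard_cap_reached(1000, 1000))
```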
Forecasts and scenario planning

  • Month-end forecast based on burn rate
  • Per-model forecast and stacked area chart
  • Identify over-pace vs under-pace against budget
  • Quantify the impact of moving teams between tiers

Cache hit and savings effects

  • Cache hit rate per organisation
  • Money saved through caching
  • Tokens per currency unit as an efficiency metric
  • Throttling statistics to inform capacity planning
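
The caching metrics above can be computed from request-level rows roughly as follows. The row fields are hypothetical, and the sketch makes a simplifying assumption: a cache hit costs nothing, and its `cost` field records what the call would have cost without the cache.

```python
def cache_metrics(requests: list[dict]) -> dict:
    """Hit rate, money saved and tokens per currency unit for a set of requests."""
    hits = [r for r in requests if r["cache_hit"]]
    misses = [r for r in requests if not r["cache_hit"]]
    spend = sum(r["cost"] for r in misses)   # only uncached calls are paid for
    tokens = sum(r["tokens"] for r in requests)
    return {
        "hit_rate": len(hits) / len(requests) if requests else 0.0,
        "saved": sum(r["cost"] for r in hits),  # avoided cost (assumption, see lead-in)
        "spend": spend,
        "tokens_per_currency_unit": tokens / spend if spend else None,
    }

sample = [
    {"tokens": 1000, "cost": 0.5, "cache_hit": False},
    {"tokens": 1000, "cost": 0.5, "cache_hit": True},
    {"tokens": 1000, "cost": 0.5, "cache_hit": False},
    {"tokens": 1000, "cost": 0.5, "cache_hit": True},
]
metrics = cache_metrics(sample)
print(metrics)
```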
Cost center and chargeback

  • Tag teams with cost centers for internal chargeback
  • Export in CSV and JSON for finance systems
  • Markup transparency (what you pay Walma vs the vendor)
  • Monthly report per cost center, ready for the finance meeting
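
A chargeback export of the kind listed above might look like this in its simplest form. The team names, cost center codes and column names are invented for illustration; the real export format is not specified here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tagging of teams with cost centers.
TEAM_COST_CENTER = {"platform": "CC-100", "data": "CC-200", "search": "CC-100"}

def chargeback_csv(rows: list[dict]) -> str:
    """Sum spend per cost center and render it as CSV text."""
    totals = defaultdict(float)
    for row in rows:
        totals[TEAM_COST_CENTER[row["team"]]] += row["cost"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["cost_center", "total_cost"])
    for cost_center in sorted(totals):
        writer.writerow([cost_center, f"{totals[cost_center]:.2f}"])
    return buffer.getvalue()

report = chargeback_csv([
    {"team": "platform", "cost": 10.0},
    {"team": "data", "cost": 4.5},
    {"team": "search", "cost": 2.5},
])
print(report)
```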
How it works

Logging that does not slow production down

Spend data is calculated from actual tokens, not estimates. The numbers match the vendor invoice.

Async telemetry

APIM logs every request asynchronously. Your developers do not feel the gateway. Rollups run in workers and the dashboard updates every minute.
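
The general pattern behind asynchronous telemetry can be sketched with a queue and a background worker. This is not APIM configuration, only an illustration of the idea in Python's `asyncio`; the function names and record fields are hypothetical.

```python
import asyncio

async def handle_request(queue: asyncio.Queue, record: dict) -> str:
    """Serve a request; the log record is enqueued without waiting for any I/O."""
    queue.put_nowait(record)
    return "response"

async def telemetry_worker(queue: asyncio.Queue, sink: list) -> None:
    """Drain the queue in the background. Appending to a list stands in for a database write."""
    while True:
        record = await queue.get()
        sink.append(record)
        queue.task_done()

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    sink: list = []
    worker = asyncio.create_task(telemetry_worker(queue, sink))
    for i in range(3):
        await handle_request(queue, {"request_id": i, "model": "model-a", "input_tokens": 100, "output_tokens": 20})
    await queue.join()   # wait until every record has been written
    worker.cancel()
    return sink

logged = asyncio.run(main())
print(len(logged))
```

The request path only pays for an in-memory enqueue; the slow write happens elsewhere, which is the property the paragraph above describes.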

Model prices you can adjust

The model price list sits in a table you control. When a vendor lowers a price, you update it and rollups recompute history.
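
One way to read this: if rollups are derived from stored token counts and not from stored costs, a price change is just a re-run with a new table. The sketch below assumes that design; the row fields and prices are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

def daily_rollup(rows: list[dict], price_table: dict) -> dict:
    """Recompute cost per day from stored token counts and a price table (EUR per million tokens)."""
    totals = defaultdict(Decimal)
    for row in rows:
        price = price_table[row["model"]]
        cost = (row["input_tokens"] * price["input"] + row["output_tokens"] * price["output"]) / Decimal(1_000_000)
        totals[row["day"]] += cost
    return dict(totals)

rows = [
    {"day": "2025-06-01", "model": "model-a", "input_tokens": 1_000_000, "output_tokens": 100_000},
    {"day": "2025-06-02", "model": "model-a", "input_tokens": 500_000, "output_tokens": 0},
]
old_prices = {"model-a": {"input": Decimal("3"), "output": Decimal("15")}}
new_prices = {"model-a": {"input": Decimal("2"), "output": Decimal("10")}}

before = daily_rollup(rows, old_prices)
after = daily_rollup(rows, new_prices)   # same rows, updated price table
print(before, after)
```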

Multi-currency

Spend is stored in the organisation's base currency and shown in whatever currency your FinOps team prefers. FX rates update daily.
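
A minimal sketch of the display conversion, assuming EUR as the base currency and a table of daily rates keyed by date. The rates below are made up, and the two-decimal rounding rule is an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical daily FX rates: units of display currency per 1 EUR (the assumed base).
FX_RATES = {
    "2025-06-10": {"EUR": Decimal("1"), "SEK": Decimal("11.00"), "USD": Decimal("1.10")},
}

def display_amount(amount_base: Decimal, currency: str, day: str) -> Decimal:
    """Convert a stored base-currency amount for display, rounded to two decimals."""
    rate = FX_RATES[day][currency]
    return (amount_base * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

sek = display_amount(Decimal("250.00"), "SEK", "2025-06-10")
usd = display_amount(Decimal("250.00"), "USD", "2025-06-10")
print(sek, usd)
```

Storing one base amount and converting only at display time keeps historical totals stable when rates move.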

Why centralise AI cost

Not everything is AI security. Some of it is just money.

Four concrete effects you get in the first quarter.

Saves 20 to 40 percent on model usage

Through caching, cheaper models for simple tasks and tier-based governance. We have seen customers go from 8000 to 5000 EUR on the same developer team, without the developers noticing anything.

End of invoice shock

The hard cap and the forecast make sure no month ends with 'wait, what happened?'. By mid-month you can see whether spend is on track, while there is still time to act.

Real cost centers

When finance asks 'what did AI cost for business area X?', you have the answer. Not guesses, not CSV exports you patch together manually.

Leverage for vendor negotiation

When you can see exactly how much you use per model, you have real leverage with Anthropic or OpenAI. The difference between guessing and knowing is often several percentage points of discount.

[Image: AI spend dashboard]

Onboarding

From first connection to first report in two weeks

Standard package. You have the dashboard live before the month is out.

1. Connect the vendors (Day 1 to 3)

We configure APIM against your Anthropic and OpenAI accounts. Existing keys keep working. The gateway sits in front.

2. Define the teams (Day 4 to 7)

Cost centers, teams and budgets are entered. SCIM groups connect to tiers so new hires automatically get the right access.

3. Pilot rollout (Week 2)

Two teams move over their endpoints. We monitor together and tune prices and budget thresholds.

4. Full reporting (Week 3 to 4)

The remaining teams move over. The first monthly report is delivered automatically with spend, forecast and cost center breakdown.

AI Hub. The gateway between your developers and the AI models.

Get control over cost, security and policy for Claude Code, Codex and every AI tool your teams already use.

Frequently asked questions about AI Cost Management

Usage is logged at request level. Every request is stored with input tokens, output tokens, model, tool and developer. The dashboard shows totals, but you can export at row level if your FinOps team needs that for internal chargeback.

The forecast takes the daily burn rate so far in the month and multiplies it by the number of days in the month. It is conservative at the start of the month and precise from day seven. For teams with strong seasonality we can switch to a seven-day rolling average.
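
The two forecasting modes can be sketched as follows. The first function follows the description above directly. How the seven-day rolling variant projects the rest of the month is not stated, so the second function assumes one plausible reading: spend so far plus the recent daily average for each remaining day.

```python
import calendar
from datetime import date

def linear_forecast(daily_spend: list[float], today: date) -> float:
    """Average daily burn so far, multiplied by the number of days in the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    burn_rate = sum(daily_spend) / len(daily_spend)
    return burn_rate * days_in_month

def rolling_forecast(daily_spend: list[float], today: date, window: int = 7) -> float:
    """Assumed variant: spend so far plus the recent average for each remaining day."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    recent = daily_spend[-window:]
    recent_rate = sum(recent) / len(recent)
    remaining_days = days_in_month - len(daily_spend)
    return sum(daily_spend) + recent_rate * remaining_days

# Ten days into June: three quiet days, then seven busier ones.
spend = [50.0] * 3 + [100.0] * 7
linear = linear_forecast(spend, date(2025, 6, 10))
rolling = rolling_forecast(spend, date(2025, 6, 10))
print(linear, rolling)
```

In this made-up month the rolling variant lands higher because it weights the recent, busier week more heavily than the quiet start.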

What happens when a budget is exceeded depends on tier policy. The default is to return an informative 429 with 'budget exceeded'. You can also choose a soft cap where requests continue but a notification goes to the manager and the developer.
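
The difference between the two cap modes can be sketched as a small decision function. The `Decision` type and the `cap` parameter are hypothetical names for illustration, and status 200 stands in for "forward the request to the vendor".

```python
from dataclasses import dataclass

@dataclass
class Decision:
    status: int               # HTTP status returned to the caller
    body: str
    notify: tuple[str, ...]   # roles that receive a notification

def enforce_budget(spend: float, budget: float, cap: str = "hard") -> Decision:
    """Decide how to treat a request given current spend and the tier's cap mode."""
    if spend < budget:
        return Decision(200, "ok", ())
    if cap == "hard":
        # Default behaviour described above: reject with an informative 429.
        return Decision(429, "budget exceeded", ())
    # Soft cap: the request continues, manager and developer are notified.
    return Decision(200, "ok", ("manager", "developer"))

print(enforce_budget(1200.0, 1000.0))
print(enforce_budget(1200.0, 1000.0, cap="soft"))
```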

We charge a monthly fee per organisation and a transparent markup on model usage. The markup is typically smaller than what you save through caching and tier-based governance. We show the break-even math in the first meeting.

Telemetry lands in Supabase within 60 seconds. Aggregation runs every minute. The dashboard is effectively real time.

Raw data is available through a data export endpoint. Many customers connect it to Power BI or Looker to cross-reference AI spend with other FinOps data.