AI Agents for Business: Scope Before Hype

AI agents for business are being sold as the next major step after chatbots: systems that can plan, use tools, retrieve information, draft outputs, and complete tasks with limited human instruction. That sounds attractive to any executive under pressure to reduce cost, improve service, and show progress on generative AI.

The real problem is not whether AI agents are interesting. They are. The problem is that many organisations are considering them before they have defined the work, the risk, the data, or the control points.

For South African companies in Johannesburg, Cape Town and nationally, this matters. A poorly scoped agent can expose customer information under POPIA, make unreliable decisions using messy internal data, fail during infrastructure disruption, or quietly create operational errors that only surface weeks later. The starting point is not “Which agent platform should we buy?” It is “Which business process is stable enough, valuable enough, and safe enough for agentic automation?”

This article is part of Zorinthia’s Generative AI & LLM hub.

What an AI agent actually does

An AI agent is not simply a chatbot with a better name. In business use, it usually combines a large language model with access to systems, documents, rules, and workflow steps. It may read a request, decide what information is needed, call an internal tool, draft a response, update a record, and route an exception to a human.

For example, a logistics company might use an agent to assist with delivery queries. The agent could read an email from a retailer, check order status, retrieve proof of delivery, draft a response, and flag cases where the shipment record conflicts with the depot scan. That is different from a basic chatbot that only answers questions from a fixed knowledge base.
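
What this looks like in practice can be sketched in a few lines of code. The sketch below is illustrative only: the function names, fields, and escalation logic are assumptions for this article, not any particular platform's API.

    # Minimal sketch of an agent step sequence for a delivery query.
    # All function and field names are hypothetical stubs, for illustration.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Order:
        order_id: str
        status: str                 # e.g. "in_transit", "delivered"
        depot_scan: Optional[str]   # last depot scan reference, if any

    def check_order_status(order_id: str) -> Order:
        # Stub standing in for a real order-system lookup.
        return Order(order_id=order_id, status="delivered", depot_scan=None)

    def fetch_proof_of_delivery(order_id: str) -> Optional[str]:
        # Stub standing in for a document-store lookup.
        return None

    def draft_response(email_text: str, order: Order) -> str:
        # Stub standing in for an LLM drafting call.
        return f"Order {order.order_id} is currently {order.status}."

    def handle_delivery_query(email_text: str, order_id: str) -> dict:
        order = check_order_status(order_id)
        pod = fetch_proof_of_delivery(order_id)

        # Flag conflicts instead of guessing which record is right.
        if order.status == "delivered" and pod is None:
            return {"action": "escalate",
                    "reason": "shipment record conflicts with depot scan"}

        # The agent only drafts; a human approves before anything is sent.
        return {"action": "review", "draft": draft_response(email_text, order)}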

The distinction is important because every additional action increases risk. Reading a document carries one risk profile. Updating a CRM record carries another. Sending a customer communication is riskier still. Approving a refund, changing a payment instruction, or advising on a healthcare matter requires even tighter control.

This is where executive discipline is needed. The more an agent can do, the clearer its boundaries must be.
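
One way to make those boundaries concrete, sketched here with hypothetical action names and tiers, is an explicit permission check that every agent action must clear before it executes:

    # Illustrative action tiers; the names and groupings are assumptions,
    # not a vendor feature. Unknown actions default to blocked.
    READ_ONLY   = {"read_document", "check_order_status"}
    WRITE_LOW   = {"update_crm_note"}
    HUMAN_GATED = {"send_customer_email", "approve_refund"}
    FORBIDDEN   = {"change_payment_instruction"}

    def authorise(action: str) -> str:
        if action in FORBIDDEN:
            return "blocked"
        if action in HUMAN_GATED:
            return "needs_human_approval"
        if action in READ_ONLY | WRITE_LOW:
            return "allowed"
        return "blocked"   # anything unclassified is refused, not allowed

The useful property is the default: an action the business has not classified is refused, which is the coded form of "the more an agent can do, the clearer its boundaries must be".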

Where agents are genuinely useful

AI agents work best where the process is repetitive, information-heavy, and bounded by clear rules. They are less suitable where the task depends on judgement that is ambiguous, ethical, regulated, or commercially sensitive.

A practical South African retail example is supplier query handling. Buyers and finance teams often deal with repeated questions about purchase orders, invoice status, credit notes, delivery discrepancies, and payment dates. An agent can help by gathering the relevant documents, checking policy rules, preparing a draft answer, and highlighting missing information. A human can then approve the response before it is sent.

In financial services, an internal agent may assist relationship managers by summarising client notes, surfacing product rules, and preparing meeting briefs. It should not independently recommend products to customers unless the organisation has addressed advice regulation, suitability checks, recordkeeping, and accountability.

In manufacturing, an agent can help maintenance teams search equipment manuals, prior fault logs, and safety procedures. It may suggest likely causes of a recurring fault, but final action should sit with qualified personnel, especially where safety, downtime, or warranty implications are involved.

The common pattern is not “replace the team”. It is “reduce the time spent assembling information and preparing routine outputs”.

When ordinary automation is better

Not every workflow needs generative AI. In many cases, a rules-based workflow, robotic process automation, or a simple system integration is cheaper, more reliable, and easier to govern.

If a process follows fixed rules, use fixed rules. If a customer refund is approved only when the invoice is valid, the return is logged within 30 days, and the product category is eligible, that logic does not require an LLM. It requires clean data and a controlled workflow.
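
As a sketch, the refund rule above needs nothing more than the following (the field names and eligible categories are illustrative):

    from datetime import date

    # The refund rule from the text as plain fixed logic. No LLM involved:
    # the inputs are structured and the rules are deterministic.
    ELIGIBLE_CATEGORIES = {"clothing", "homeware"}   # illustrative list

    def refund_approved(invoice_valid: bool,
                        purchase_date: date,
                        return_logged: date,
                        category: str) -> bool:
        within_window = (return_logged - purchase_date).days <= 30
        return invoice_valid and within_window and category in ELIGIBLE_CATEGORIES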

An AI agent becomes more relevant when the input is messy or varied: long emails, scanned documents, free-text complaints, policy documents, inconsistent notes, or multiple sources that need interpretation. Even then, the agent should not be allowed to invent missing facts. It should be designed to say: “I cannot complete this because the required information is absent.”
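
A minimal guard, assuming hypothetical field names, makes that refusal explicit before the model is ever called:

    from typing import Optional

    REQUIRED_FIELDS = ("customer_id", "invoice_number", "complaint_text")  # illustrative

    def precheck(request: dict) -> Optional[str]:
        # Refuse early rather than letting the model invent missing facts.
        missing = [f for f in REQUIRED_FIELDS if not request.get(f)]
        if missing:
            return "Cannot complete this request: missing " + ", ".join(missing) + "."
        return None  # all required information present; proceed to the agent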

This boundary is often missed in boardroom conversations. Executives hear “AI can handle unstructured work” and assume the machine can compensate for weak process design. It cannot. If the underlying process is confused, an agent will execute confusion faster.

Before committing to production deployment, organisations should ask whether the task needs language understanding, judgement support, or document interpretation. If it only needs routing, validation, or calculation, simpler automation may be the better investment.

Data readiness is the limiting factor

AI agents are only as useful as the information they can safely access. Many South African organisations still have fragmented customer records, old shared drives, inconsistent product codes, incomplete CRM data, and policy documents that differ by department. In that environment, an agent may retrieve the wrong version of a rule or combine facts that should never be combined.

This is not a technology complaint. It is a management issue. If no one owns the source of truth for customer status, contract terms, asset registers, or pricing rules, the agent will expose that weakness.

A hospital group considering an internal agent for patient administration, for example, must know which data sets contain personal and health information, who may access them, what purpose is lawful, and what records must be retained. POPIA applies when personal information about patients, employees, doctors, suppliers, or customers is processed. That includes information supplied to the model, retrieved by the model, or generated in outputs.

An AI readiness assessment should therefore precede ambitious agent programmes. It should test data quality, access controls, process maturity, governance ownership, and operational resilience. Without that, the agent initiative becomes a glossy interface placed on top of weak foundations.

Governance cannot be added at the end

AI agents need governance before they are connected to live systems. This is because they may take actions, not merely produce text.

The governance questions are straightforward (one way to record the answers is sketched after the list):

  • What is the agent allowed to do?
  • Which systems can it access?
  • What information is it prohibited from using?
  • Which outputs require human approval?
  • Who is accountable when it makes an error?
  • How are decisions logged and reviewed?
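
One way to make the answers auditable, sketched here with hypothetical names and values, is to record them as a version-controlled policy that the agent runtime enforces and that reviewers can read:

    # Illustrative policy record; the structure and values are assumptions,
    # not a standard schema. The point is that each governance question
    # has a written, reviewable answer before go-live.
    AGENT_POLICY = {
        "allowed_actions": ["read_policy_docs", "draft_reply", "update_case_note"],
        "accessible_systems": ["crm_read_replica", "document_store"],
        "prohibited_data": ["health_records", "payment_instructions"],
        "human_approval_required": ["send_customer_email"],
        "accountable_owner": "Head of Customer Operations",
        "decision_log": "append-only audit store, reviewed monthly",
    }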

These questions become more serious where personal information is involved. If an agent reads employee records to draft HR responses, POPIA obligations apply. If it uses customer CRM data to personalise sales outreach, consent, purpose limitation, retention, and access rights must be understood. If it summarises complaints, the organisation must ensure sensitive information is not exposed to unauthorised users.

Boards should also expect a clear link between AI use and information governance. King IV places responsibility for technology and information governance at board level. An agent that cannot be explained, controlled, or audited is not just an IT experiment; it is a governance weakness.

For more detailed control design, see AI governance. Governance should be proportionate, but it should not be optional.

Evaluation before production deployment

A pilot that impresses executives in a demonstration is not evidence that an AI agent is ready for production deployment. Demos are usually clean. Real operations are not.

Evaluation must use realistic cases from the business. For a property company, that may include lease queries, arrears communications, maintenance requests, municipal billing disputes, and tenant complaints. The agent should be tested against current policies, historical exceptions, edge cases, and deliberately incomplete inputs.

Useful evaluation measures include (a small scoring sketch follows the list):

  • accuracy of retrieved information;
  • quality of drafted responses;
  • rate of unnecessary escalations;
  • rate of missed escalations;
  • processing time compared with the current process;
  • user acceptance by operational staff;
  • compliance with privacy and access rules;
  • ability to recover when source systems are unavailable.
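
A small scoring harness, assuming an illustrative test-case structure, shows how several of these measures can be computed from the same labelled set:

    # Sketch only: each case records what the agent did and what the
    # business says it should have done. Field names are assumptions.
    def score(cases: list[dict]) -> dict:
        assert cases, "need at least one labelled case"
        n = len(cases)
        correct = sum(c["retrieved_correct"] for c in cases)
        unnecessary = sum(c["escalated"] and not c["should_escalate"] for c in cases)
        missed = sum(c["should_escalate"] and not c["escalated"] for c in cases)
        return {
            "retrieval_accuracy": correct / n,
            "unnecessary_escalation_rate": unnecessary / n,
            "missed_escalation_rate": missed / n,  # usually the most dangerous number
        }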

The last point matters in South Africa. Load-shedding, connectivity failures, and system downtime still affect operations. An agent that depends on live access to multiple systems needs a fallback process. If the warehouse management system is unavailable, does the agent pause, escalate, or generate a response based on stale data? That decision must be designed, not discovered during an incident.
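
That design decision can be written directly into the integration. A sketch, with hypothetical system names and exception types:

    class SystemUnavailableError(Exception):
        """Raised when a source system cannot be reached (illustrative)."""

    def query_warehouse_system(order_id: str) -> dict:
        # Stub standing in for a live warehouse-management lookup.
        raise SystemUnavailableError("WMS offline")

    def fetch_stock_status(order_id: str) -> dict:
        try:
            return query_warehouse_system(order_id)
        except SystemUnavailableError:
            # Designed behaviour: pause and escalate rather than answer
            # from stale data. Decided here, not during an incident.
            return {"action": "escalate",
                    "reason": "warehouse system unavailable"}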

Evaluation should also include human review. An LLM can sound confident while being wrong. The test is not whether the output reads well; it is whether it is correct, permitted, useful, and safe.

Monitoring after go-live

Once an AI agent is live, the risk changes. The organisation is no longer testing a concept; it is operating a system that affects customers, employees, suppliers, or internal decisions.

Monitoring should track performance, exceptions, user overrides, complaints, unusual outputs, and changes in data sources. If a policy document is updated, the agent’s retrieval behaviour may change. If a CRM field is repurposed, the agent may interpret it incorrectly. If users learn how to prompt the agent in unexpected ways, new risks can emerge.
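
A minimal monitoring record, with illustrative fields, captures enough to spot those shifts; overrides and escalations are usually the earliest drift signals:

    import json
    import time

    # Sketch of an append-only event log for agent decisions.
    def log_agent_event(case_id: str, action: str, source_versions: dict,
                        human_override: bool, path: str = "agent_events.log") -> None:
        event = {
            "ts": time.time(),
            "case_id": case_id,
            "action": action,
            "source_versions": source_versions,  # e.g. policy document versions used
            "human_override": human_override,
        }
        with open(path, "a") as f:
            f.write(json.dumps(event) + "\n")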

For a CFO, the monitoring question is commercial as well as technical: is the agent reducing cost, improving speed, lowering error rates, or freeing skilled staff for higher-value work? If not, the organisation may be carrying operational risk without a clear return.

Monitoring also protects against model drift and process drift. Even when the underlying large language model is externally managed, the business remains responsible for how the system is used in its environment. Internal controls, audit logs, and escalation paths must remain visible to management.

This is where independent AI advisory can help executives separate impressive prototypes from sustainable operating models.

Buying decisions need sharper questions

Many buyers start with vendor capability. They should start with business accountability.

Before approving an AI agent initiative, the executive committee should require a short, plain-English scope document covering:

  • the exact process or task;
  • the users and affected parties;
  • the data sources involved;
  • the actions the agent may and may not take;
  • POPIA and legal considerations;
  • human approval points;
  • evaluation criteria;
  • operational fallback procedures;
  • monitoring responsibilities;
  • expected financial or service benefit.

This document should exist before procurement is finalised. It prevents the common problem where a broad generative AI ambition becomes a vague implementation with no measurable outcome.

Where a vendor or implementation partner is already involved, an independent second opinion can be useful. The role is not to block innovation, but to test whether the proposal is scoped, governed, and commercially justified. Zorinthia’s perspective on this distinction is covered under AI consulting.

The executive decision

AI agents for business can be valuable, but only when the use case is narrow enough to control and important enough to justify the effort. The strongest candidates are not the most futuristic. They are the workflows where staff waste time searching, checking, drafting, and routing information across systems.

The next executive question should be simple:

Which one process in the business is repetitive, information-heavy, measurable, and safe enough to test under controlled conditions?

If that question cannot be answered clearly, the organisation is not ready for an agent. If it can, the next step is to define the scope, risk controls, evaluation method, and owner before anyone talks about production deployment.