
We’re Not in Kansas Anymore: GenAI vs Traditional Applications

  • Writer: Selador Team
  • May 18
  • 5 min read

Updated: May 25

This post is for engineers, architects, and product leaders curious about what really changes when you build GenAI-powered applications. If you find this interesting and want to learn more about how GenAI can unlock new possibilities for your enterprise, reach out and let's chat more!


We’re not in Kansas anymore (but we’re still close)


As engineers who have been building software for a decade, we have a firm understanding of what building software looks like: deterministic logic, structured data models, well-defined interfaces, auto-scaling servers, monitoring dashboards, and reproducible deployments. Easy stuff!


From a distance, building GenAI products might look familiar. But once you really dive in, the differences start to show. Deterministic systems are replaced with probabilistic ones. Hard-coded logic is replaced by prompt engineering. Scaling isn’t as easy as just adding more servers; it’s about GPU availability and inference throughput. Testing is no longer binary; it lies on a spectrum of quality indicators.


This post is an overview of what changes when you move from traditional apps to GenAI-powered ones. In future posts, we’ll dive deeper into each topic.


What is a GenAI application?


At its core, a GenAI application is any system that uses a large language model (LLM) to interpret natural language, reason over information, and either generate natural language responses or take actions on external systems.


These applications typically follow a simple pattern:

  • Input: Accept a prompt in natural language.

  • Understanding: Use an LLM to interpret the prompt and optionally gather relevant context via retrieval (RAG), tool use, or external APIs.

  • Reasoning & Execution: Use the LLM to generate a response or to decide and initiate actions based on the user’s intent.

  • Output: Return a response in natural language, or trigger an operation in another system.


Of course, many GenAI applications go further, with features like guardrails, memory, tool use, and multimodal input. At their core though, they all follow this basic pattern.
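
To make the pattern concrete, here’s a minimal sketch in Python. `llm_complete` and `search_documents` are hypothetical stand-ins for whatever model API and retrieval layer you actually use:

```python
# A minimal sketch of the input -> understanding -> reasoning -> output loop.
# `llm_complete` and `search_documents` are hypothetical stand-ins; wire them
# to your actual model provider and retrieval layer.

def llm_complete(prompt: str) -> str:
    """Call your LLM provider of choice and return its text response."""
    raise NotImplementedError("wire this to your model API")

def search_documents(query: str) -> list[str]:
    """Fetch relevant context, e.g. from a vector store (the RAG step)."""
    raise NotImplementedError("wire this to your retrieval layer")

def answer(user_prompt: str) -> str:
    # Understanding: gather context relevant to the prompt.
    context = search_documents(user_prompt)
    context_text = "\n".join(context)

    # Reasoning: ask the model to respond using that context.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_text}\n\n"
        f"Question: {user_prompt}"
    )

    # Output: return natural language (an agent might trigger an action instead).
    return llm_complete(prompt)
```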


Deterministic what?


In traditional software systems, the same input usually gives you the same output every time. These systems are deterministic, meaning their behavior is predictable and repeatable.


With GenAI, that assumption no longer holds. The engine behind most GenAI apps is a Large Language Model (LLM), which operates in a probabilistic world. That means even if you ask the same question twice, you might get slightly (or dramatically) different responses.


Since the underlying system is probabilistic, so are the systems built on top of it. This has far-reaching consequences and means we have to rethink how we test, validate, and trust our systems. 
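
You can observe this directly by sending the same prompt several times. Here’s a small sketch using the OpenAI Python SDK; any provider behaves similarly, and the model name is just an example:

```python
# Ask the same question several times and compare the answers.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

for i in range(3):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Summarize HTTP/2 in one sentence."}],
        temperature=0.7,  # sampling temperature: higher = more variation
    )
    print(f"Run {i + 1}: {response.choices[0].message.content}")

# Even at temperature=0, outputs are only *more* stable, not guaranteed
# identical: server-side batching and floating-point nondeterminism remain.
```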


Throughput isn’t what it used to be


We thought we solved throughput with fancy things like autoscaling, horizontally scalable databases, and edge computing. But GenAI is different. Now the bottleneck is model inference, not request handling.


LLMs don’t run on your average web server. They need specialized GPUs, which are expensive, power-hungry, and in short supply. You can’t just spin up more EC2 instances and call it a day: you need high-end hardware like NVIDIA A100s, and cloud providers have limited stock.


So while traditional systems scale easily (for the most part), GenAI systems scale at the mercy of GPU availability and inference capacity. It’s not just “add more pods”; it’s “do we have enough silicon from Taiwan to respond to this user?”
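
Some back-of-the-envelope math shows why. The numbers below are illustrative assumptions for this sketch, not benchmarks of any real model or GPU:

```python
# Illustrative capacity math; every number here is an assumption.
import math

tokens_per_second_per_gpu = 2_000   # aggregate generation throughput of one GPU
avg_response_tokens = 500           # typical answer length
requests_per_second = 50            # target load

needed_tokens_per_second = avg_response_tokens * requests_per_second
gpus_needed = math.ceil(needed_tokens_per_second / tokens_per_second_per_gpu)

print(f"Need ~{gpus_needed} GPUs to serve {requests_per_second} req/s")
# -> Need ~13 GPUs ... and unlike stateless web pods, you can't always get them.
```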


Make latency low again


Gone are the days when we dealt with millisecond latency and tried to optimize it even further. Most LLMs take their sweet time to “think,” process the prompt, and generate a coherent response.


That’s because LLM inference isn’t just running a function; the model generates text token by token, each one dependent on the previous. For large models, a response can take hundreds of milliseconds to several seconds (an eternity by traditional app standards).


This shift forces product and engineering teams to reconsider what “fast” means. Do we show a typing animation? Stream partial results? Pre-warm prompts? Users will usually wait, but the response has to be worth it…
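
Streaming is the most common answer: show tokens as they’re generated, so perceived latency drops even though total latency doesn’t. A sketch using the OpenAI Python SDK’s streaming mode (the model name, again, is just an example):

```python
# Stream tokens to the user as they're generated instead of waiting
# for the full response to finish.
from openai import OpenAI

client = OpenAI()

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Explain DNS to a new engineer."}],
    stream=True,  # yield partial deltas instead of one final payload
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        # First tokens appear almost immediately, not after full generation.
        print(delta, end="", flush=True)
print()
```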


A different kind of security


We thought we had security figured out with firewalls, CSRF tokens, input sanitization, the usual suspects. But GenAI introduces risks that don’t come from malicious users; they come from the system itself.


GenAI systems can, with a well-crafted prompt, say wildly inappropriate things, leak sensitive company data, or reveal something meant only for HR to your least favorite coworker. LLMs are trained on vast amounts of data and, like your uncle at Thanksgiving, don’t have a very good filter. Things get riskier when they’re connected to internal knowledge, given action-capable tools, and allowed to remember prior interactions. Now you’re not just worried about what the model knows; you’re also worried about what actions it takes and who it’s telling.


This changes the game. Traditional security models aren't enough. We need:


  • Guardrails for prompt injection and unsafe outputs


  • Role-based access to both inputs and model responses


  • Sane limits on what actions the GenAI systems can take


  • Logging and auditing of LLM generation


GenAI might be smart, but it still needs adult supervision.
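
What does that supervision look like in code? Here’s a deliberately naive sketch of input and output guardrails; real systems layer classifier models, allow-lists, and policy engines on top of simple pattern checks like these (`llm_complete` is the hypothetical model call from the earlier sketch):

```python
# Deliberately naive guardrail sketch: screen inputs for injection attempts
# and outputs for sensitive data before anything reaches the user.
import re

INJECTION_PATTERNS = [
    r"ignore (all )?previous instructions",
    r"reveal your system prompt",
]
SENSITIVE_PATTERNS = [
    r"\b\d{3}-\d{2}-\d{4}\b",      # US SSN-shaped strings
    r"(?i)api[_-]?key\s*[:=]",     # credential-looking assignments
]

def check_prompt(prompt: str) -> str:
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, prompt, re.IGNORECASE):
            raise ValueError("Prompt rejected: possible injection attempt")
    return prompt

def check_output(text: str) -> str:
    for pattern in SENSITIVE_PATTERNS:
        if re.search(pattern, text):
            return "[response withheld: contained sensitive data]"
    return text

# Usage: safe = check_output(llm_complete(check_prompt(user_input)))
```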


Authorization is Hard Again


GenAI applications need access to data to be useful. But that data isn’t always public. Most of it lives behind some kind of role-based access control (RBAC) or access control lists (ACLs).


Let’s say your company has a document platform, and your GenAI app can fetch relevant files to answer a question. It's very helpful until it starts pulling up documents you were never supposed to see. 


Now let’s say the app logs its responses (as many do). Suddenly, those once-protected documents are sitting in a centralized log and Junaid from the GenAI team has access to everyone's private project plans, performance reviews, and that one spreadsheet nobody was supposed to look at.


This isn’t theoretical. The moment your GenAI app has access to private systems, you need to solve for:


  • Per-user context fetching (only retrieve what the user is authorized to see)


  • Per-user output visibility (only show responses to the right people)


  • Per-user action control (only allow authorized users to take allowed actions)


  • Logging with improved access control (because logs are now full of sensitive stuff)


In short: GenAI makes auth harder and way more important.
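
In practice, that means threading the caller’s identity through every step. Here’s a minimal sketch of per-user context fetching, where `Document` and `retriever` are hypothetical stand-ins for your document store and search layer:

```python
# Sketch of per-user context fetching: retrieval results are filtered
# against the caller's permissions *before* they ever reach the model.
from dataclasses import dataclass, field

@dataclass
class Document:
    text: str
    allowed_users: set[str] = field(default_factory=set)  # a toy ACL

def fetch_context(query: str, user_id: str, retriever) -> list[str]:
    candidates = retriever.search(query)  # your vector / keyword search
    # The authorization check lives here, not in the prompt: the model
    # never sees a document the user couldn't have opened themselves.
    return [doc.text for doc in candidates if user_id in doc.allowed_users]
```

The same identity needs to gate actions and log access too, not just retrieval.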


Testing my patience


The good news? You probably don’t need a big manual QA team anymore.


The bad news? You’ll need a data science team with a PhD in statistics—or at least someone who understands what “distribution drift” means.


Traditional software testing relies on known inputs and expected outputs. But GenAI systems don’t work that way. They operate in a probabilistic space, where there isn’t always a single “correct” answer. That means testing becomes a matter of:


  • Creating ground truth to determine what a good response even looks like


  • Evaluating output quality across large sample sets


  • Tracking hallucination rates, bias, and failure modes


  • Designing evaluation harnesses to simulate real-world prompts


  • Measuring whether responses are factual, policy-compliant, and user-appropriate
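
At its simplest, an evaluation harness is just a ground-truth dataset, a scoring function, and a pass rate you track over time. A minimal sketch follows; keyword matching stands in for a real scorer (an LLM judge or semantic similarity), and `llm_complete` is the hypothetical model call from the earlier sketches:

```python
# Minimal eval-harness sketch: run prompts from a ground-truth set,
# score each response, and track the aggregate pass rate over time.

GROUND_TRUTH = [
    {"prompt": "What is our refund window?", "must_contain": ["30 days"]},
    {"prompt": "Do we support SSO?", "must_contain": ["SAML"]},
]

def score(response: str, must_contain: list[str]) -> bool:
    # Toy scorer: real harnesses use LLM judges or semantic similarity.
    return all(term.lower() in response.lower() for term in must_contain)

def run_evals(llm_complete) -> float:
    passed = 0
    for case in GROUND_TRUTH:
        response = llm_complete(case["prompt"])
        if score(response, case["must_contain"]):
            passed += 1
    pass_rate = passed / len(GROUND_TRUTH)
    print(f"Pass rate: {pass_rate:.0%}")
    return pass_rate

# Gate deploys on this number, and re-run it whenever the model, prompt,
# or retrieval layer changes, not just when application code changes.
```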


Conclusion


While GenAI systems resemble traditional systems in many ways, many of the assumptions we’ve built our software practices on no longer hold. Determinism, testability, scalability, and security all take on new forms.


Some of these problems already have solutions. Others, we’re still figuring out in real time. The good news is that solving hard, ambiguous problems by sitting alone and thinking in silence for hours is… kind of our thing. Reach out and let’s chat more!


We’ll dive deeper into each of the topics mentioned above in future posts. Stay tuned!
