Keynote

Agentic AI: How AI Agents Transformed My Work as a Software Engineer and CEO

Dr. Simon Harrer, Co-Founder & CEO @ Entropy Data · March 11, 2026

In this keynote, I share how AI agents -- specifically agentic coding tools like Claude Code -- have fundamentally changed how I work as both a software engineer and a CEO. From writing zero lines of code to letting agents work overnight, this talk covers the practical reality of working with AI agents in 2026, and why the biggest opportunity lies not in coding but in connecting agents to enterprise data.

Agentic AI Keynote by Dr. Simon Harrer

Note: The talk was delivered in German. The transcript below is an English translation.

Thanks to Software Architecture Summit for the opportunity, Jochen Christ for helping shape the talk, Robert Glaser for slide inspiration, and Arif Wider for the recording setup.

Slide 1: Title

Introduction

Hello everyone, my name is Simon Harrer. I have a PhD in distributed systems and spent seven years at INNOQ as a software architect and consultant. Since 2025, I have been Co-Founder and CEO of Entropy Data, a spin-off from INNOQ.

On the side, I co-authored the book "Java by Comparison" and co-translated the "Data Mesh" book into German. I am also a co-maintainer of the open source tools mob.sh and the Data Contract CLI. And I serve on the Technical Steering Committee at the Linux Foundation's BITOL project for data contracts and data products standards.

Today, I want to talk about how AI agents have completely transformed how I work -- both as a software engineer and as a CEO.

April 2025: The First Encounter

"That was my personal AI moment. The moment I realized something fundamental had shifted."

Slide 4: Train story

My Personal AI Moment

It was April 2025. I was on a train to Berlin with André Deuerling. Thanks to Deutsche Bahn, we ended up with a delay of over two hours. So we sat down in the restaurant car and kept ordering beers. André had already been using Claude Code, so we decided to give it a try together.

The task: build a Spring Boot service that connects to a MariaDB database, reads all the schemas, tables, and columns, and pushes that metadata into our product's API. The rule was simple -- write zero lines of code. Only prompt.

We worked on it for about two hours. We reviewed what it generated, prompted more, revised architectural decisions. Then we got the API keys, configured things manually, started it up -- and it worked. The whole thing just worked.

That was my personal AI moment. The moment I realized something fundamental had shifted.

Slide 5: What is an agent?

What Is an Agent?

So what is an agent? It is really quite simple: an LLM using tools in a loop.

Claude Code is exactly that. There is an LLM running in the background, and it has access to a set of tools: it can make CLI calls, read and write files, do internet research, run tests, compile code. Even memory is just a tool -- essentially a todo list that the agent accesses through a tool interface.

That is the entire concept. An LLM using tools in a loop. And yet, as you will see, this simple concept changed everything about how I work.
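That one sentence can be turned into a minimal Python sketch. This is an illustrative skeleton only: `call_llm`, the reply format, and the `tools` dictionary are invented stand-ins, not Claude Code's actual internals.

```python
# Minimal agent skeleton: an LLM using tools in a loop.
# call_llm and the tools dict are hypothetical stand-ins,
# not Claude Code's real implementation.

def run_agent(task, call_llm, tools, max_steps=50):
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_llm(history)        # model decides: final answer or tool call
        if reply["type"] == "final":
            return reply["content"]      # task is done
        tool = tools[reply["tool"]]      # e.g. "read_file", "run_tests"
        result = tool(**reply["args"])   # execute the tool call
        history.append({"role": "tool", "content": result})
    raise RuntimeError("step limit reached")
```

Everything that follows in this talk, subagents, memory, overnight runs, is elaboration on this one loop.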

Slide 7: Growing usage

Getting Started

After that experience, I started using Claude Code more and more. I have to be honest: not everything was great at first. Sometimes I threw away the generated code entirely and went back to coding manually. But I kept trying.

I signed up for the $20/month plan. It felt like a Netflix subscription -- low commitment, high potential upside.

November 2025: The Game Changer

"Task length has been doubling roughly every six months. What felt impossible before quickly becomes the new normal."

Slide 9: METR benchmark

The Game Changer: Opus Models

Then came Opus 4.5 in November 2025. That was the real game changer.

Let me show you the METR benchmark. It measures the task length that LLMs can successfully complete at least 50% of the time. GPT-5.1 and Codex Max land somewhere around three hours. Opus 4.5 jumped to around five hours -- a massive leap. The curve is exponential.

Then Opus 4.6 arrived in January and roughly doubled it again, to about twelve hours. You really feel this difference when working with it. Large, complex tasks that AI simply could not handle before are now within reach.

Slide 11: Task length doubling

Task Length Doubles Every Six Months

The pattern is clear: task length has been doubling roughly every six months. What felt impossible before quickly becomes the new normal.

Once I saw this, I immediately upgraded to the $100/month plan. That is a five-fold increase in spending, and there was zero internal debate. The value was obvious.

Slide 13: Use cases for agentic coding

What I Use Agents For

Let me walk through the concrete use cases. I use agents for:

  • Developing features with tests -- the bread and butter.
  • UX testing with Playwright MCP Server -- automated browser-based testing.
  • Infrastructure provisioning with Terraform -- I never have to read Terraform docs anymore.
  • Performance optimization with the Dash0 MCP Server -- connecting to OpenTelemetry logs, traces, and metrics. I can say "analyze my performance, look at the code, can we automatically make improvements?" and it does.
  • Ticket implementation via GitHub CLI -- the agent reads the ticket and implements it.
  • Changelog generation from git commits.
  • Documentation with automated screenshots -- no more hiring people to create user manuals.
  • Explorative testing -- testing the production app for bugs, including visual bugs and design guide compliance.
  • Patent writing -- I wrote a patent with a colleague using AI. We simply could not have afforded a patent attorney.
  • Contract review for enterprise customers.

The range is enormous. And it keeps growing every week.
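To make one of these concrete: changelog generation is little more than feeding commit subjects into a prompt. A minimal sketch, assuming the agent receives the prompt string (the prompt wording and function names are illustrative; the git invocation itself is standard):

```python
# Sketch: collect commit subjects since a release tag and build a
# prompt an agent can turn into a changelog. The prompt wording and
# helper names are invented for illustration.
import subprocess

def commits_since(tag):
    # One subject line per commit since the given tag.
    out = subprocess.run(
        ["git", "log", f"{tag}..HEAD", "--pretty=format:%s"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.splitlines()

def changelog_prompt(messages):
    # The agent turns raw commit subjects into a user-facing changelog.
    bullets = "\n".join(f"- {m}" for m in messages)
    return "Write a user-facing changelog from these commits:\n" + bullets
```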

Slide 14: How my work has changed

How My Work Has Changed

I never write git commit messages anymore. I just say "claude push" and it creates the message from context. I no longer write code -- I let code be written, and then I review it. A lot. That is now my main activity: giving instructions and reviewing.

And let me be honest: I do not review everything anymore. I review selectively, risk-based. That is a trade-off, and I am aware of it. But here is the spoiler: without making that trade-off, the software simply would not exist. It would have taken too long.

I also work with two to three terminals in parallel because there are natural waiting times while the AI processes. While one agent is working on a task, I am reviewing or prompting in another.

Slide 15: YOLO mode

YOLO Mode

For about three to four weeks now, I have had YOLO mode active. That means Claude can do everything it wants -- except anything that touches production.

Before this, I had the permission model enabled, where Claude asks for permission before running commands. But I developed what I call "approve fatigue." The AI kept stopping to ask trivial questions like "May I run this find command?" I found myself pressing Enter without even reading the prompt. At that point, the permission prompts are theater, not security.

So I flipped it around: let the AI work autonomously until it thinks the task is done. I do not want to go back. I still wish for a better security model -- security is not irrelevant -- but this is currently the best trade-off for productivity.

That was fast

"I worked the same way for ten years. Then, in eleven months, everything changed."

Slide 17: Reflection

Pause and Reflect

Let us pause for a moment. That was a lot of change, and it happened fast.

I worked the same way for ten years. Then, in just eleven months -- from April 2025 to March 2026 -- everything changed. I am still building a software product. But how that software gets created is completely different.

And it is "just" an LLM using tools in a loop. That is all it is. Yet it changed everything.

And it continues

"The direction is clear -- and it does not stop at coding."

Slide 19: Subagents and Agent Teams

Subagents and Agent Teams

But it does not stop there. The next evolution is subagents: the main agent spawns smaller agents for research tasks. For example, "find out how to best instrument this library." The subagent runs off, does its research, and returns the results. This protects the main agent's context from noise. Some of this already happens automatically -- Claude's built-in researcher, for instance.

Beyond that, there are agent teams: you set up a team with an architect who critiques the design, a UX person who critiques the interface, and a coder who builds. You let them interact and iterate. I have tried this once or twice. It is still too early to judge, and it costs significantly more tokens. But the direction is clear.
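The context-protection idea is simple to sketch. In this illustration, `research_fn` stands in for any agent runner, and truncation is a naive placeholder for the summarization a real subagent would do:

```python
# Sketch: a subagent runs in a fresh context and hands back only a
# condensed result. research_fn is a hypothetical agent runner;
# truncation stands in for real LLM summarization.

def spawn_subagent(question, research_fn, max_chars=500):
    # Fresh context: the subagent sees only the question,
    # never the parent conversation.
    findings = research_fn(question)
    # Only the condensed result flows back into the parent's
    # history, so research noise never pollutes the main context.
    return {"role": "tool", "content": findings[:max_chars]}
```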

"I can kick off work before going to sleep, while traveling, or even right before a keynote."

Slide 21: Cloud execution

Moving to the Cloud

The path naturally leads to the cloud. Everything I described so far happened on my local MacBook. But now there is a web interface. I can start tasks from my phone or browser. I can kick off work before going to sleep, while traveling, or even right before a keynote like this one.

The agent works, and when it is done, it shows me something like "69 lines added, 5 deleted." I can review the changes and create a pull request. The agent works while I am not at my desk.

🤖 + ☁

"The agents are becoming part of the infrastructure."

Slide 23: Event-driven agents

Event-Driven Agents

The trigger is changing too. It is no longer just a human saying "do this." When we get an alert through our observability tools, an agent can automatically evaluate it. If it is an exception -- say, someone used something in a template that does not exist -- the agent can automatically create a fix and open a pull request. We just click to merge.

The same applies to failed CI pipelines, post-commit checks (like verifying documentation stays in sync with code), and daily or weekly cron jobs for agents. All of this runs from standard GitHub or GitLab CI/CD workflows. The agents are becoming part of the infrastructure.
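A cron-driven agent job of this kind can be sketched as a GitHub Actions workflow. The step details here are illustrative assumptions, not a tested recipe: the CLI invocation, the prompt, and the secret name would all need adapting to your setup.

```yaml
# Sketch: nightly cron job that lets an agent check that docs stay
# in sync with code and open a pull request if they drift.
name: nightly-docs-sync
on:
  schedule:
    - cron: "0 2 * * *"   # every night at 02:00 UTC
jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run agent headless
        run: claude -p "Verify docs/ matches the current code; if not, fix the docs and open a pull request."
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```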

zZZ

"I start ten tasks before going to bed and review them in the morning."

Slide 25: Spec-driven development

Spec-Driven Development: Agents Work While You Sleep

I was inspired by Anthropic's job posting -- the one that offered $500,000/year and required "maximal Claude usage." It got me thinking: how do I let agents work even when I am not working?

The answer is spec-driven development with a tool called AutoClaude. It is essentially a Kanban board UI: I throw in tasks, describe them (the AI can even help me write the descriptions), set a work-in-progress limit of three, and let it run overnight. I had to upgrade to the 200 EUR plan because the token limits on the lower plan were not enough.

Technically, it clones the repository, creates a branch using git worktree for each task, and works locally. I start ten tasks before going to bed and review them in the morning.

The workflow is: plan, then code, then a QA loop, then AI review, and finally human review. The tool generates a spec, produces a QA sign-off, and can even help define the product roadmap through competitive analysis and feature gap identification.

The key point: the agent works while I sleep, and I come back to review results in the morning.
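The worktree mechanics underneath are plain git. A small sketch of the per-task isolation, with an invented naming scheme for branches and directories:

```python
# Sketch: one isolated branch + working directory per task, the way
# AutoClaude-style tools use git worktree. The task/ naming scheme
# and .worktrees directory are illustrative choices.
import subprocess

def worktree_cmd(repo, task_id, base="main"):
    # Build the git invocation: new branch plus its own checkout,
    # so parallel tasks never trample each other's files.
    branch = f"task/{task_id}"
    path = f"{repo}/.worktrees/task-{task_id}"
    return ["git", "-C", repo, "worktree", "add", "-b", branch, path, base]

def start_task(repo, task_id):
    subprocess.run(worktree_cmd(repo, task_id), check=True)
```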

The quality is not as high as before, of course, …

"The software would not exist without AI."

Slide 27: Quality trade-off

The Quality Trade-off

I want to be transparent about something important: the quality is not as high as when I coded everything manually. The architecture is not as clean. Naming could be better. The code is not as polished.

But here is the thing -- the software would not exist without AI. If I had insisted on doing everything to my previous standard, we would not have shipped.

If you are in a business where absolute perfection is necessary -- building airplane controls or train safety systems -- this trade-off may not work. But for B2B business software? It is more than fine. The speed gain far outweighs the quality gap.

What does this all mean for me as a CEO?

"The scaling happens through AI costs, not headcount."

Slide 29: AI Budget

CEO Perspective: The AI Budget

Let me switch perspective to my role as CEO. Here is how our AI budget per person per month has evolved:

  • 2025: 20 EUR/person/month
  • 2026: 100 EUR (actually already 200)
  • 2027: 1,000 EUR (conservative estimate)
  • 2028: 2,000 EUR

In Silicon Valley, companies are already calculating AI budgets at 10-20% of salary. At typical salaries, that comes out to about 20,000 EUR per year. They are already where I expect us to be in 2028.

Slide 30: Not renewing IntelliJ licenses

The IDE Is Dead?

We will not renew our IntelliJ licenses. We even had a 50% startup discount. But I do not see the value anymore. I only use IntelliJ out of habit and for reviewing code changes. Why pay 1,000 EUR per year just for that? Claude Code costs 100 EUR per month and is a much better investment. It is an opportunity cost question. And with the Community Edition no longer being a viable option for commercial use, the calculus is even clearer.

I recently saw a LinkedIn post by Ralf Müller arguing that "The IDE Is Dead." It had 27 likes but 53 comments (update, March 12, 2026: 113 comments) -- everyone defending IntelliJ because they love it. It is an emotional topic. People are attached to their tools. But attachment should not drive budget decisions.

Slide 31: 6 engineers can do the work of 60

6 Engineers = 60

You can build a B2B SaaS startup with only six engineers. Those six can accomplish what used to require sixty. We will not hire many more people. Companies will see much higher profit per employee. The scaling happens through AI costs, not headcount.

Many small companies will emerge because of this. We deliver features incredibly fast because our engineers are product engineers, not just software engineers. They make quick product decisions. They understand the customer. That is the key differentiator -- the ability to delegate product decisions to people who truly understand both the technology and the customer.

Building is almost free now. The real bottleneck is deciding what to build and delegating the responsibility for those decisions.

Slide 32: Hiring strategy

Who We Hire

We only hire selectively. Our top choices are:

  • Juniors -- they are AI-native. They do not need to unlearn old workflows. They grew up with these tools.
  • Principals -- they bring product focus, deep customer understanding, and strategic thinking. They are worth the investment.
  • Seniors -- third choice. Not bad, but in an AI-native world, the other two profiles are more valuable.

In Summary: Agentic Coding is the AI Killer Feature

"It has changed how software gets built, how teams are structured, how budgets are allocated, and how fast products can ship. But the real story is just getting started."

The opportunities of agentic AI are even bigger somewhere else…

"As developers, we live in our coding world -- but business people think about data."

Slide 38: Enterprise data opportunity

The Bigger Opportunity: Enterprise Data

But the opportunities are even bigger elsewhere. Agentic coding only uses code, telemetry data, and tickets. As developers, we live in our coding world -- but business people think about data. The truly exciting part is when you add enterprise data to the mix.

Imagine asking: "Why have energy costs in production fluctuated more in the last twelve months?" (Source: BARC) An agent could answer that question in an hour. It would analyze production data, cross-reference energy prices, look at scheduling patterns, and synthesize an answer.

Remember: an agent is still just an LLM using tools in a loop. It can answer business questions too -- not just coding questions. But now it works with enterprise data: customer data, health data, financial data, operational data.

Through tools, agents can fetch enterprise data and do anything with it. This is where the real value lies for most organizations.

Slide 41: Governance layer

The Governance Layer

An army of agents is coming, and they all want access to enterprise data. Eighty percent of employees want this. But there are serious challenges to solve:

  • Access control -- who can access what?
  • Discoverability -- how do agents find the right data?
  • Data semantics -- what does this data actually mean?
  • Terms of use -- under what conditions can this data be used?
  • Data quality -- how reliable is this data?
  • SLAs and guarantees -- what performance and availability can be expected?

You need a governance layer between the agents and the enterprise data and APIs. Without it, you either block the agents entirely or create a security and compliance nightmare.
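What such a gate might look like can be sketched in a few lines. The policy model, sensitivity levels, and field names below are all invented for illustration; a real governance layer would back this with actual identity, policy, and audit systems.

```python
# Sketch of a governance gate between agents and enterprise data.
# Sensitivity levels, policy table, and audit fields are invented
# stand-ins for a real governance layer.
from dataclasses import dataclass

@dataclass
class DataOffering:
    name: str
    sensitivity: str   # e.g. "public", "internal", "pii"
    terms: str         # conditions of use shown to the agent

POLICY = {
    "public": "auto-approve",
    "internal": "auto-approve",
    "pii": "human-approval",
}

def request_access(agent_id, offering):
    # Look up the policy, log every request, grant only what the
    # policy allows automatically; everything else waits for a human.
    decision = POLICY[offering.sensitivity]
    audit = {"agent": agent_id, "offering": offering.name,
             "decision": decision}
    granted = decision == "auto-approve"
    return granted, audit
```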

Live demo: AI agent querying enterprise data

Live Demo: Agents Meet Enterprise Data

Let me show you a live demo. I ask the agent: "Which support tickets occur most frequently?"

The agent searches for available data offerings and finds a relevant table in a database. It requests access -- which is automatically approved since this is non-sensitive data. Then it runs SQL queries, analyzing the data from multiple perspectives.

The result: incidents account for 40% of tickets, followed by requests, then problems, and finally changes. From here, the agent could continue working -- drilling deeper, cross-referencing with other data sources, or generating recommendations. The governance layer makes all of this possible while keeping it safe and controlled.
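The aggregation behind that answer is ordinary SQL. With a toy in-memory table (the sample rows are invented to match the 40% figure from the demo), the query looks like this:

```python
# Toy reproduction of the demo query. The sample data is invented
# so that incidents make up 4 of 10 tickets (40%).
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE support_tickets (ticket_type TEXT)")
con.executemany(
    "INSERT INTO support_tickets VALUES (?)",
    [("incident",)] * 4 + [("request",)] * 3
    + [("problem",)] * 2 + [("change",)],
)
rows = con.execute(
    "SELECT ticket_type, COUNT(*) AS n "
    "FROM support_tickets GROUP BY ticket_type ORDER BY n DESC"
).fetchall()
# rows[0] is ("incident", 4): incidents lead with 40% of tickets
```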

Who is best positioned in the enterprise to design and implement such a governance layer?

"This is a software architecture problem at its core."

Slide 45: You Architects!

A Call to Action for Architects

Architects. You.

This is a software architecture problem at its core. It involves quality goals, component integration, interface management, security boundaries, and organizational design. It is probably the most important architectural task in the near future.

The agents above are only as good as this governance layer beneath them. Without proper data governance, agents either cannot work or cannot be trusted.

My appeal: if there is an initiative in your organization to build this layer, get involved. Help shape it. It will determine the future. Because the agents are coming -- this is simply too powerful and too useful for it not to happen.

Slide 47: Thank you

Thank You

Thank you for your attention and for the great questions. Let us shape this world together. AI should not just happen to us -- we should co-create it. Including the energy questions, the political questions, the ethical questions. We are architects, and we can shape this. That is our task.

If you want to continue the conversation, you can find me on LinkedIn, reach me at simon.harrer@entropy-data.com, or visit www.entropy-data.com.

Q&A

Selected questions from the audience after the talk.

Q: Won't AI providers like Anthropic or OpenAI just build the governance layer themselves? Or the big cloud vendors?

What I see is that it is the data platform vendors -- Databricks, Snowflake, Google -- who are building these layers. The AI companies like Anthropic and OpenAI focus more on the upper layer: how to manage and schedule agents, how to provide a good runtime environment for them. The problem is the enterprise data below. Data sitting in on-premise systems, behind REST APIs, in legacy formats like EDIFACT. How do you integrate all of that? That is an architecture problem. And if you let a single vendor build it all for you, you are creating one of the biggest lock-ins imaginable. You will never get out of that.

Q: What about nearshoring and offshoring? Does AI replace the need for large offshore teams?

I believe nearshoring will shrink significantly. Someone mentioned they have 30 people in an offshore team. I think two people with a clear product vision can accomplish what those large teams used to do. If they have no language barrier, no cultural barrier, and a strong product vision -- that might actually be better. Of course, if someone offshore has that product vision too, you still want that person. You just do not need the large team around them anymore. It comes down to product vision, not headcount.

Q: If juniors have never coded manually, how can they meaningfully review AI-generated code?

That is a great question. But here is my counter-question: why must a human do the reviewing? Someone who is truly AI-native does not need to code manually to build a great product. They just need to know they want to build something great. They could have the AI generate features, then have five other AIs review the output -- each with its own review focus: architecture, security, UX, performance, correctness. The assumption that "a human must review" is deeply ingrained in us because we have done it for so long. AI-native juniors come with a completely different mindset. Whether universities should still teach Quicksort implementation or focus instead on product management -- I honestly do not know. But I think the skills that matter are shifting.

Q: What about quality attributes like maintainability, performance, and security? Are they no longer relevant?

No, quality goals remain important -- but the trade-offs shift. Take maintainability: the new model might be "Design for Replaceability." The AI can look at a component, understand how it works, throw it away, and rebuild it with two changes. Then you deploy the new version. That is a fundamentally different approach to maintenance. We need to rethink our quality models entirely. And for regulated industries -- industrial automation, safety-critical systems, train control -- the answer might be what Amazon just announced: AI-generated code must be reviewed by two humans who sign off on it. We will have to learn what these tools mean for critical systems. Quality is not irrelevant, but our trade-offs are different now. Sometimes we accept lower quality in exchange for the software existing at all.

Q: What about data sovereignty and the dependence on US-based AI providers?

I have deliberately left this topic out of the talk. I wish we had alternatives. I personally see no viable alternative right now. I use Anthropic because it is simply the best tool for my business case. How this ends, I do not know. You can criticize me for not prioritizing corporate responsibility here. But right now, I am trying to build a company under the current conditions. These are important points, but they are also political questions that I simply cannot solve on my own.