The Thinking Machine and the End of the Tool: A Position Paper on AI, Software, and the Category Error Shaping Global Policy
There is an idea so deeply woven into how we build and govern software that most people in the industry have stopped noticing it is even an idea, and the idea is that software is a tool. The human thinks, the software executes. Every SaaS contract ever signed, every product liability ruling ever handed down, every regulatory text written about information technology in the last four decades takes this separation as a given, the way a fish takes water as a given. We talk about using software in the same grammatical construction we use for wrenches and saws, as though the object contributes nothing of its own intelligence and the human supplies all the cognitive work. Martin Heidegger described this relationship with beautiful precision when he introduced the concept of the ready-to-hand, which captures how a well-functioning tool vanishes from the awareness of the person holding it. You do not think about the hammer when the hammer is working. You think about the nail. Software has behaved this way for fifty years, and nobody experiences a spreadsheet recalculating a column as a cognitive event, because it isn't one. The thinking stays with the human, and the entire legal, commercial, and regulatory architecture of the technology industry was built on the quiet confidence that it always would.

Illustration by The Geostrata
One thing is worth saying plainly, because the claim is easy to misunderstand. The argument here is not that AI is sentient, or conscious, or alive, or any of the other things that make for entertaining dinner-party arguments but terrible policy analysis. The argument is narrower and, in its consequences, much larger, which is that AI performs cognitive work in a way no previous software did, and the wholesale failure to take that distinction seriously is producing bad products, bad laws, and a growing accountability vacuum that nobody seems eager to fill. When an AI system interprets a legal contract, or triages patient symptoms, or decides which fifteen out of three hundred applicants a recruiter should see, or writes production code from a two-sentence prompt, it is not executing instructions the way a database executes a query. It is reading ambiguity, drawing inferences from incomplete and sometimes contradictory information, and arriving at outputs that the people who built it cannot fully predict and often cannot explain. That is not a faster spreadsheet. That is a different kind of thing, and the difference matters.
Hannah Arendt, writing in The Human Condition nearly seventy years ago, drew what remains the most useful philosophical line for understanding what has changed. She separated human activity into three modes. Labor was the repetitive cyclical work of biological survival. Work was the fabrication of durable objects and institutions. And action was the exercise of judgment, initiative, and freedom within a shared world, the capacity to begin something genuinely new and to bear responsibility for having begun it. Every generation of software before the current one lived comfortably inside the first two categories. Databases labored by storing and retrieving. Spreadsheets worked by calculating and producing structured outputs. Workflow engines shuffled tasks from one column to another with the cheerful obedience of a conveyor belt. All of it was instrumental, all of it was subordinate, and the third category, Arendt's domain of judgment, remained a human monopoly. What makes AI philosophically interesting, and what makes it a governance problem rather than just an engineering upgrade, is that it trespasses into that third domain. When a model reads a chest X-ray and flags a suspicious mass, it is exercising something that functions, for all practical and legal purposes, as clinical judgment, and the fact that philosophers could spend a hundred years debating whether the machine really judges does nothing to change the policy reality, which is that a consequential decision was made by a system and not by the professional who will answer for it.
There is, admittedly, a philosophical tradition that tries with some elegance to make this whole problem disappear. In 1998 Andy Clark and David Chalmers published The Extended Mind, arguing that cognition does not stop at the boundary of the skull and that tools can become genuine parts of a person's cognitive system, the notebook literally becoming part of memory, the calculator literally becoming part of mathematical reasoning. If they are right, then maybe AI is simply the latest cognitive extension and the human remains the rightful author of the thought even when the machine contributes substantially to producing it. It is an appealing picture that stops being right at exactly the point where the stakes become highest. Clark and Chalmers built a condition into their argument that later enthusiasts quietly drop, which is that the human controls the external component and the component's behavior is bounded and predictable. You write in the notebook. You punch the numbers. AI violates this condition fundamentally, because when a large language model generates a legal analysis it is producing its own reasoning, shaped by training data no user has reviewed, following inferential paths no user can retrace, arriving at conclusions the user may accept without reconstructing how they were reached. That is not extension. That is delegation. And the difference is not academic, because extension preserves accountability while delegation shatters it. When a notebook stores a fact and a person acts on it, the person remains the author of the action. When a system reasons on someone's behalf and that someone signs off without grasping the reasoning, they have become not the author of the conclusion but merely its endorser. The legal systems, professional standards, and product frameworks that govern software were all designed for a world of extended minds, and they have almost nothing useful to say about a world of delegated ones.
The practical fallout from this confusion is already here and it is, at times, faintly absurd. Product liability law assumes a clean separation between manufacturer, product, and user. When a contractor misuses a power saw the liability falls on the contractor, because the saw did not have opinions about how it ought to be used. But when the product itself starts having opinions, when an AI feature inside a SaaS platform generates a financial recommendation that turns out to be expensively wrong, that separation collapses and the question of who pays becomes a legal food fight resolved not by principle but by whoever drafted the more aggressive limitation-of-liability clause. In professional services the situation gets stranger still, because a doctor relying on an AI diagnostic system is still licensed, still liable, and still officially the one practicing medicine, even though the actual clinical reasoning has migrated to a system that holds no license, took no oath, and cannot be deposed. Financial advisors, lawyers, engineers, product managers, they all face the same quiet mismatch. What is being constructed, without much public conversation about it, is a new class of professionals who carry the full legal weight of accountability while performing a shrinking share of the actual thinking the accountability was designed to govern.
Meanwhile the governments writing the rules are, with the best of intentions, solving a different problem. The EU AI Act, the US proposals, and the national strategies emerging across Asia and Latin America all treat AI as a product to be classified by risk level, benchmarked, certified, and shipped with documentation. The approach borrows from how we regulate medical devices, which makes sense if AI is a tool, because tools can be tested against specifications and stamped as safe. But the entire point of judgment is that it operates in situations specifications do not anticipate, which is why nobody certifies a judge once and then never checks again, and why physicians face ongoing peer review rather than a single conformity assessment at the start of their careers. There are centuries of thoughtful practice for governing agents who exercise judgment, including professional licensing, fiduciary duties, and ethical codes, and almost none of it is being brought to bear on AI. The policy world is trying to stuff a new kind of entity into the product certification box, and the seams are splitting.
The strongest objection to all of this deserves honest engagement. Critics will reasonably point out that autopilots have flown aircraft for decades, that algorithmic trading makes thousands of consequential decisions every second, and that automated diagnostics have existed since the 1970s, so if machine judgment were truly a new category we would have needed new governance long ago. This is partly right, but the comparison hides a distinction that turns out to be decisive, which is that every previous automated system we successfully governed operated within a tightly bounded domain, with well-specified parameters and failure modes known in advance. An autopilot that malfunctions goes wrong in ways that are mechanically obvious, because departures from the operating envelope trip alarms. General-purpose AI is different in kind, because the domain is open-ended, the reasoning is opaque, and the failures take the form not of engineering errors but of plausible-sounding judgments that happen to be wrong, delivered with the same confidence as the correct ones. An autopilot that fails sets off a warning. An AI that delivers bad judgment hands you a clean paragraph that reads exactly like a good one. That asymmetry between apparent and actual reliability is the central governance challenge this technology presents, and nothing on the table anywhere in the world addresses it seriously.
The technology industry and the policymakers trying to regulate it need to stop treating AI as a better version of the tools we have built for fifty years and begin treating it as what it is, which is a new category of technology that performs cognitive work on behalf of its users in ways those users often cannot verify or reconstruct. For product builders, this means accepting that when the feature you ship exercises judgment, you own something qualitatively different from a workflow automation, and that product management needs to account for the reasoning your system performs and its consequences. For policymakers, it means recognizing that the certification paradigm is structurally inadequate to systems whose behavior cannot be specified in advance, and that adequate governance would borrow more from professional regulation than from product safety. And for the professionals who use AI tools every day, it means confronting a question that the speed of deployment has allowed everyone to politely avoid, which is who exactly is responsible for the thinking that increasingly runs our institutions, and whether the humans in the loop are exercising genuine judgment or merely endorsing conclusions generated by a process nobody fully understands. The modern software industry was built on a clean division between human thought and machine execution. It was always a simplification, but a useful one that allowed clear contracts and identifiable accountability. AI makes it untenable, and the longer everyone pretends otherwise, the wider the gap grows between what this technology does and what our institutions are prepared to govern.
BY ASISH SINGH
TEAM GEOSTRATA