LLMs as the new high level language
Summary
This post suggests that large language models (LLMs) and their agents are becoming a new high-level programming language, much as high-level languages were a step up from assembly. It argues that LLM agents working autonomously can greatly improve developer productivity. While acknowledging potential challenges such as code quality and understandability, it proposes that LLMs could redefine web and software development by enabling a new paradigm of screenless development in which specs and scenarios drive agent workflows.
Comments (341)
The irony is that I haven't seen AI have nearly as large of an impact anywhere else. We truly have automated ourselves out of work, people are just catching up with that fact, and the people who just wanted to make money from software can now finally stop pretending that "passion" for "the craft" was ever really part of their motivating calculus.
So when things break or they have to make changes, and the AI gets lost down a rabbit hole, who is held accountable?
But if your job depends on taste, design, intuition, sociability, judgement, coaching, inspiring, explaining, or empathy in the context of using technology to solve human problems, you’ll be fine. The premium for these skills is going _way_ up.
We are in this pickle because programmers are good at making tools that help programmers. Programming is the tip of the spear, as far as AI's impact goes, but there's more to come.
Why pay an expensive architect to design your new office building, when AI will do it for peanuts? Why pay an expensive lawyer to review your contract? Why pay a doctor, etc.
Short term, doing for lawyers, architects, civil engineers, doctors, etc what Claude Code has done for programmers is a winning business strategy. Long term, gaining expertise in any field of intellectual labor is setting yourself up to be replaced.
a.k.a. Being a programmer.
> The irony is that I haven't seen AI have nearly as large of an impact anywhere else.
What lol. Translation? Graphic design?
I think it's doubtful you'll be even that; certainly not with the salary and status that normally entails.
> I'm very convinced non-technical people will be able to use these tools
This suggests that the skill ceiling of "Vibe Coding" is actually quite low, calling into question the sense of urgency with which certain AI influencers present it, as if it were a skill you need to invest major time and effort to hone now (with their help, of course), lest you get left behind and have to "catch up" later. Yet one could easily see it being akin to Googling, which was also a skill (when Google was usable), one that did indeed increase your efficiency and employability, but with a low ceiling, such that "Googler" was never a job by itself, the way some suggest "prompt engineer" will be. The Google analogy is apt, in that you're typing keywords into a black box until it spits out what you want; quite akin to how people describe "prompt engineering."
Also the Vibe Coding skillset--a bag of tricks and book of incantations you're told can cajole the model--has a high churn rate. Once, narrow context windows meant restarting a session de novo was advisable if you hit a roadblock, but now it's usually the opposite.
If this is all true, then wouldn't the correct takeaway, rather than embracing and mastering "Vibe Coding" (as influencers suggest), be to "pivot" to a new career, like welding?
> The irony is that I haven't seen AI have nearly as large of an impact anywhere else. We truly have automated ourselves out of work, people are just catching up with that fact
What's funny is artists immediately, correctly perceived the threat of AI. You didn't see cope about it being "just another tool, like Photoshop."
I can write a spec for an entirely new endpoint, and Claude figures out all of the middleware plumbing and the database queries. (The catch: this is in Rust and the SQL is raw, without an ORM. It just gets it. I'm reviewing the code, too, and it's mostly excellent.)
I can ask Claude to add new data to the return payloads - it does it, and it can figure out the cache invalidation.
These models are blowing my mind. It's like I have an army of juniors I can actually trust.
where's the catch? SQL is an old technology, surely an LLM is good with it
In my experience, agentic LLMs tend to write code that is very branchy, with high cyclomatic complexity. They don't follow DRY principles unless you push them very hard in that direction (and even then not always), and sometimes they do things that just fly in the face of common sense. Example of that last part: I was writing some Ruby tests with Opus 4.6 yesterday, and I got dozens of tests that amounted to this:
x = X.new
assert x.kind_of?(X)
This is of course an entirely meaningless check. But if you aren't reading the tests and you just run the test job and see hundreds of green check marks and dozens of classes covered, it could give you a false sense of security.
This is not an appropriate analogy, at least not right now.
Code agents generate code from prompts; in that sense the metaphor is correct. However, agents then read that code back, it becomes input, and they generate more code. This was never the case for compilers: compilation is strictly one-directional and acyclic, so an LLM used in this sense is not a compiler.
"Generate a Frontend End for me now please so I don't need to think"
LLM starts outputting tokens
Dopamine hit to the brain as I get my reward without having to run npm and figure out what packages to use
Then out of a shadowy alleyway a man in a trenchcoat approaches
"Pssssttt, all the suckers are using that tool, come try some Opus 4.6"
"How much?"
"Oh that'll be $200.... and your muscle memory for running maven commands"
"Shut up and take my money"
----- 5 months later, washed up and disconnected from cloud LLMs ------
"Anyone got any spare tokens I could use?"
Here's $1000. Please do that. Don't bother with the LLM.
My dopamine rush comes from solving a problem, learning something new, producing a particularly elegant and performant piece of code, etc. There's an aspect of hubris involved, to be sure.
Using a tool to produce the end result gives me no such satisfaction. It's akin to outsourcing my work to someone who can do it faster than me. If anything, I get cortisol hits when the tool doesn't follow my directions and produces garbage output, which I have to troubleshoot and fix myself.
"I prompted it like this"
"I gave it the same prompt, and it came out different"
It's not programming. It might be having a pseudo-conversation with a complex system, but it's not programming.
Well I think the article would say that you can diff the documentation, and it's the documentation that is feeding the AI in this new paradigm (which isn't direct prompting).
If the definition of programming is "a process to create sets of instructions that tell a computer how to perform specific tasks" there is nothing in there that requires it to be deterministic at the definition level.
Functions like:
updatesUsername(string) returns bool
...can be turned into a generic functional euphemism:
takeStringRtnBool(string) returns bool
...same thing. Context can be established by the data passed in and by the external system interactions (updating user values, an inventory of widgets).
As workers, SWEs are just obfuscating how repetitive their effort is to people who don't know better.
The era of pure data-driven systems has arrived. In line with the push to dump OOP, we're dumping irrelevant context from the code altogether: https://en.wikipedia.org/wiki/Data-driven_programming
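A minimal Python sketch of that data-driven style, purely for illustration (the rule table and event names here are invented, not from any real system): the "program" is a table of data, and one generic interpreter walks it.
    # Data-driven dispatch: behavior lives in a table of rules; one generic
    # interpreter applies them. Adding behavior means adding data, not code.
    RULES = [
        {"event": "user.renamed", "action": "update_field", "field": "username"},
        {"event": "widget.added", "action": "increment",    "field": "inventory"},
    ]

    def apply_event(state: dict, event: str, value) -> dict:
        for rule in RULES:
            if rule["event"] != event:
                continue
            if rule["action"] == "update_field":
                state[rule["field"]] = value
            elif rule["action"] == "increment":
                state[rule["field"]] = state.get(rule["field"], 0) + value
        return state

    state = apply_event({}, "user.renamed", "alice")  # {'username': 'alice'}
    state = apply_event(state, "widget.added", 3)     # inventory becomes 3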
I wrote a program in C and and gave it to gcc. Then I gave the same program to clang and I got a different result.
I guess C code isn't programming.
>"I gave it the same prompt, and it came out different"
1:1 reproducibility is much easier in LLMs than in software building pipelines. It's just not guaranteed by major providers because it makes batching less efficient.
[0] https://sketch.dev/blog/our-first-outage-from-llm-written-co...
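For what it's worth, a rough sketch of what local reproducibility can look like, assuming the Hugging Face transformers library and greedy decoding (no sampling); the prompt and model choice are arbitrary. Hosted APIs layer batching and hardware nondeterminism on top of this, which is the part providers don't guarantee.
    # Greedy decoding: take the argmax token at each step, so the same prompt
    # on the same weights produces the same text on every run.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tok("Add a cache-invalidation hook that", return_tensors="pt")
    out = model.generate(**inputs, do_sample=False, max_new_tokens=20)
    print(tok.decode(out[0]))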
I ask the developer the simplest questions, like "which of the multiple entry-points do you use to test this code locally", or "you have a 'mode' parameter here that determines which branch of the code executes, which of these modes are actually used? and I get a bunch of babble, because he has no idea how any of it works.
Of course, since everyone is expected to use Cursor for everything and move at warp speed, I have no time to actually untangle this crap.
The LLM is amazing at some things - I can get it to one-shot adding a page to a react app for instance. But if you don't know what good code looks like, you're not going to get a maintainable result.
The implementations that come out are buggy or just plain broken
The problem is a relatively simple one, and the algorithm uses a few clever tricks. The implementation is subtle...but nonetheless it exists in both open and closed source projects.
LLMs can replace a lot of CRUD apps and skeleton code, tooling, scripting, infra setup etc, but when it comes to the hard stuff they still suck.
Give me a whiteboard and a fellow engineer any day.
The improvements become evident from the nature of the problem in the physical world. I can see why a purely text-based intelligence could not have derived them from the specs, and I haven't been able to coax them out of LLMs with any amount of prodding and persuasion. They reason about the problem in some abstract space detached from reality; they're brilliant savants in that sense, but you can't teach a blind person what the colour red feels like to see.
It's easy to claim this and just walk away. But it's better for the overall discussion to provide the example.
Also much of the really annoying, time consuming stuff, like frontend code. Writing UIs is not rocket science, but hard in a bad way and LLMs are not helping much there.
Plus, while they are _very_ good at finding common issues and gotchas quickly that are documented online (say you use some kind of library that you're not familiar with in a slightly wrong way, or you have a version conflict that causes an issue), they are near useless when debugging slightly deeper issues and just waste a ton of time.
Right now LLMs are taking languages meant for humans to understand better via abstraction, what if the next language is designed for optimal LLM/world model understanding?
Or instead of an entirely new language, there's some form of compiling/transpiling from the model language to a human-centric one, like WASM for LLMs
I'm more worried about the opposite: the next popular programming paradigm will be something that's hard to read for humans but not-so-hard for LLM. For example, English -> assembly.
Just like AlphaZero ditched human Go matches and trained on synthetic ones, and got better this way
I can take some C or Fortran code from 10 years ago, build it and get identical results.
You certainly can get identical results, but it's equally certainly not going to be that simple a path frequently.
https://www.observationalhazard.com/2025/12/c-java-java-llm....
"The intermediate product is the source code itself. The intermediate goal of a software development project is to produce robust maintainable source code. The end product is to produce a binary. New programming languages changed the intermediate product. When a team changed from using assembly, to C, to Java, it drastically changed its intermediate product. That came with new tools built around different language ecosystems and different programming paradigms and philosophies. Which in turn came with new ways of refactoring, thinking about software architecture, and working together.
LLMs don’t do that in the same way. The intermediate product of LLMs is still the Java or C or Rust or Python that came before them. English is not the intermediate product, as much as some may say it is. You don’t go prompt->binary. You still go prompt->source code->changes to source code from hand editing or further prompts->binary. It’s a distinction that matters.
Until LLMs are fully autonomous with virtually no human guidance or oversight, source code in existing languages will continue to be the intermediate product. And that means many of the ways that we work together will continue to be the same (how we architect source code, store and review it, collaborate on it, refactor it, etc.) in a way that it wasn’t with prior transitions. These processes are just supercharged and easier because the LLM is supporting us or doing much of the work for us."
Every "classic computing" language mentioned, and pretty much in history, is highly deterministic, and mind-bogglingly, huge-number-of-9s reliable (when was the last time your CPU did the wrong thing on one of the billions of machine instructions it executes every second, or your compiler gave two different outputs from the same code?)
LLMs are not even "one 9" reliable at the moment. Indeed, each token is a freaking RNG draw off a probability distribution. "Compiling" is a crap shoot, a slot machine pull. By design. And the errors compound/multiply over repeated pulls as others have shown.
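A toy Python illustration of that "RNG draw" (the vocabulary and logits are made up): greedy decoding picks the argmax deterministically, while sampling from the softmax is a dice roll on every call.
    import math, random

    vocab  = ["foo", "bar", "baz"]   # made-up next-token candidates
    logits = [2.0, 1.5, 0.3]         # made-up model scores

    exps  = [math.exp(l) for l in logits]
    probs = [e / sum(exps) for e in exps]  # softmax -> probability distribution

    greedy  = vocab[probs.index(max(probs))]            # always "foo"
    sampled = random.choices(vocab, weights=probs)[0]   # varies per call
    print(greedy, sampled)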
I'll take the gloriously reliable classical compute world to compile my stuff any day.
I have already watched integrations between SaaS products being deployed with agents instead of classical middleware.
For very large projects, are we sure that English (or other natural languages) are actually a better/faster/cheaper way to express what we want to build? Even if we could guarantee fully-deterministic "compilation", would the specificity required not balloon the (e.g.) English out to well beyond what (e.g.) Java might need?
Writing code will become writing books? Still thinking through this, but I can't help but feel natural languages are still poorly suited and slower, especially for novel creations that don't have a well-understood (or "linguistically-abstracted") prior.
So I'm guessing they just rise because they spark a debate?
The optimists upvote and praise this type of content, then the pessimists come to comment why this field is going to the dogs. Rinse and repeat.
Precisely. Attention economy. It rules.
Which is why I only quickly scan through the comments to see if there are new insights I haven't seen in the past few months. Surprise, almost never.
Code in general is also local, in the sense that small perturbation to the code has effects limited to a small and corresponding portion of the program/behavior. A change to the body of a function changes the generated machine code for that function, and nothing else[2].
Prompts provided to an LLM are neither sufficient nor local in the same way.
The inherent opacity of the LLM means we can make only probabilistic guarantees that the constraints the prompt intends to encode are reflected by the output. No theory (that we know of) can even attempt to supply such a guarantee. A given (sequence of) prompts might result in a program that happens to encode the constraints the programmer intended, but that _must_ be verified by inspection and testing.
One might argue that of course an LLM can be made to produce precisely the same output for the same input; it is itself a program after all. However, that 'reproducibility' should not convince us that the prompts + weights totally define the code any more than random.Random(1).random() being constant should cause us to declare python's .random() broken. In both cases we're looking at a single sample from a pRNG. Any variation whatsoever would result in a different generated program, with no guarantee that program would satisfy the constraints the programmer intended to encode in the prompts.
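For concreteness, a quick Python check of that analogy: a seeded generator is perfectly reproducible, yet a minimally different seed gives an unrelated value, so reproducibility alone says little about what the output encodes.
    import random

    # Same seed, same single draw: reproducible by construction.
    assert random.Random(1).random() == random.Random(1).random()

    # A minimally different input yields an entirely unrelated output.
    print(random.Random(1).random(), random.Random(2).random())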
While locality falls similarly, one might point out that an agentic LLM can easily make a local change to code if asked. I would argue that an agentic LLM's prompts are not just the inputs from the user, but the entire codebase in its repo (if sparsely attended to by RAG or retrieval tool calls or w/e). The prompts _alone_ cannot be changed locally in a way that guarantees a local effect.
The prompt -> LLM -> program abstraction presents leaks of such volume and variety that it cannot be ignored the way the code -> compiler -> program abstraction can. Continuing to make forward progress on a project requires that the robot (and likely the human) attend to the generated code.
Does any of this matter? Compilers and interpreters themselves are imperfect, their formal verification is incomplete and underutilized. We have to verify properties of programs via testing anyway. And who cares if the prompts alone are insufficient? We can keep a few 100kb of code around and retrieve over it to keep the robot on track, and the human more-or-less in the loop. And if it ends up rewriting the whole thing every few iterations as it drifts, who cares?
For some projects where quality, correctness, interoperability, novelty, etc don't matter, it might be. Even in those, defining a program purely via prompts seems likely to devolve eventually into aggravation. For the rest, the end of software engineering seems to be greatly exaggerated.
[1]: loosely in the statistical sense of containing all the information the programmer was able to encode https://en.wikipedia.org/wiki/Sufficient_statistic
[2]: there're of course many tiny exceptions to this. we might be changing a function that's inlined all over the place; we might be changing something that's explicitly global state; we might vary timing of something that causes async tasks to schedule in a different order etc etc. I believe the point stands regardless.
I know someone who has been spending $7k / month on Cursor tokens for the past six months, managing such a team of agents… But curiously the results seem to be endless PDF-ware, and every month there’s a new reason why the project is not yet quite ready to be used on real data.
LLMs are very good at making you think they’re giving you what you hoped to get.
I want my C-suite and V-suite LLMs to feel like they earned their positions through hard work, values, and commitment to their company.
* = (Not to be confused with a famous poem by Dante Alighieri)
Has this been true since the 90s?
I pretty much only hear people saying modern compilers are unbeatable.
Everything else is secondary.
Critical distinction: unless you're getting paid in proportion to your output (literally zero traditional 9-5 software jobs I know of, unfortunately), this is in fact the opposite: a subscription to any of these services reduces your overall salary, it doesn't make it higher...
Then there is the case I know the dishonest are doing: firing off Claude or whatever and going for a walk.
The hottest new programming language is English
For instance: Here is an email from my manager at 1pm today. Open the policy document he is referring to, create a new version, and add the changes he wants. Refer to the entire codebase (our company OneDrive/Google Drive/Dropbox, whatever) to make sure it is contextually correct.
>Sure, here is the document for your review
Great, reply back to manager with attachment linked to OneDrive
The user still has to be in the loop to instruct the LLM every time, and the tiniest nuances in each execution of that work still matters. If it didn't we'd have replaced people with bash scripts many decades ago. When we've tried to do that, the maintenance of those scripts became a game of whack-a-mole that never ended and they're eventually abandoned. I think sometimes people forget how ineffective most software is even when written as good as it can be. LLMs don't unlock any new abilities there.
What this actually does is make people more available for meeting time. Productivity doesn't budge at all. :)
In other words, the "busy work" has always been faster to do than the decision making, and if someone has meetings to attend they don't strictly do busy work.
Maybe the more interesting outcome is that with the increased meeting time comes much deeper drinks of the kool-aid, and businesses become more cultish than they already are. That to me sounds far more sinister than kicking people out onto the curb. Employees become "agents of change" through the money and influence they're given. They might actually hire more :D
We don't commit compiled blobs in source control. Why can't the same be done for LLMs?
Imagine a machine that does the job sometimes but fails on some other times. Wonderful isn't it?
The generation step changed. The maintenance step didn't. And most codebases spend 90% of their life in maintenance mode.
The real test of whether prompts become a "language" is whether they become versioned, reviewed artifacts that teams commit to repos. Right now they're closer to Slack messages than source files. Until prompt-to-binary is reliable enough that nobody reads the intermediate code, the analogy doesn't hold.
1. OK, let's run 100 instances of the prompt under the hood: 1-2 will hallucinate, 3-5 will produce something different from the rest, and it can compile based on the 90% of answers that agree.
2. Computer memory is also not 100% reliable, but we live with it somehow without a man-in-the-middle manual check layer?
Aren't you telling Claude/Codex to debug it for you?
Ask yourself "Computer memory and disk are also not 100% reliable , but we live with it somehow without man-in-the-middle manual check layer, yes?" Answer about LLM will be the same, if good enough level of similarity/same asnwers is achieved.
Until LLMs start to get there, we still need to save the source code they produce, and review and verify that it does what it says on the label, and not in a totally stupid way. I think we have a long way to go!
Which is exactly one of the examples the AOT-only, no-GC crowd uses for why theirs is better.
Give a spec to a designer or developer. Do you get the same result every time?
I’m going to guess no. The results can vary wildly depending on the person.
The code generated by LLMs will still be deterministic. What is different is the tooling the product team uses to create that product.
At a high level, does using LLMs to do all or most of the coding ultimately help the business?
Genuine question, but why not set the temperature to 0? I do this for non-code related inference when I want the same response to a prompt each time.
There’s a related issue that gives me deep concern: if LLMs are the new programming languages we don’t even own the compilers. They can be taken from us at any time.
New models come out constantly and over time companies will phase out older ones. These newer models will be better, sure, but their outputs will be different. And who knows what edge cases we’ll run into when being forced to upgrade models?
(and that’s putting aside what an enormous step back it would be to rent a compiler rather than own one for free)
Gosh, LLMs have been a thing for only a few years, but people have become stupid already.
> what Javascript/Python/Perl did to Java
FFS... What did python do to java?
Because that's pretty much what "agentic" LLM coding systems are an automation of, skimming through forums or repos and cribbing the stuff that looks OK.
What did Javascript/Python do to Java? They are not interchangeable nor comparable. I don't think Federico's opinion is worth reading further.
Last I checked with every other high level language, you save the source and then rerun the compiler to generate the artifact.
With LLMs you throw away the 'source' and save the artifact.
There are now approaches where the prompt itself is structured in a certain way (sort of like a spec) so you get to a similar result quicker. Not sure how well those work (I actually assume they suck, but I have not tried them).
Also some frameworks, templates and so on, provide a bunch of structured markdown files that nudges LLM assistance to avoid common issues and do things in a certain way.
In essence: we're witnessing a paradigm shift. And for moments like these—I invite you—it's invaluable to have studied Popper and Kuhn in those courses.
An even more provocative hypothesis: the 'Vienna Circle' has morphed into the 'Circle of Big Tech,' gatekeepers of the data. What's the role of academia here? What happened to professional researchers? The way we learn has been hijacked by these brilliant companies, which—at least this time—have a clear horizon: maximizing profits. What clear horizon did the stewards of the scientific method have before? Wasn't it tainted by the enunciator's position? The personal trajectory of the scientist, the institution (university) funding them? Ideology, politics?
This time, it seems, we know exactly where we're headed.
(This comment was translated from Spanish, please excuse the rough edges)
A practical way to get better results is to stop prompting with prose and start providing explicit models of what we want. In that sense, UML-like notations can act as a bridge between human intent and machine output. Instead of:
“Write a function to do X…”
we give:
“Here’s a class diagram + state machine; generate safe C/C++/Rust code that implements it.”
UML is already a formal, standardized DSL for software structure. LLMs have no trouble consuming textual forms (PlantUML, Mermaid, etc.) and generating disciplined code from them. The value isn’t diagrams for humans but constraining the model’s degrees of freedom.
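As a rough illustration of that shape of prompt (the state machine and the wording are invented for this example, not from any real project), the model is handed a machine-readable spec instead of prose:
    # A hypothetical prompt that constrains the model with a PlantUML state
    # machine rather than a prose description.
    SPEC = """
    @startuml
    [*] --> Idle
    Idle --> Connecting : connect()
    Connecting --> Connected : handshake_ok
    Connecting --> Idle : timeout
    Connected --> Idle : disconnect()
    @enduml
    """

    PROMPT = (
        "Here is a state machine in PlantUML. Generate Rust code that implements "
        "it as an enum-based state type, rejecting any transition not in the "
        "diagram.\n" + SPEC
    )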
Beginner programmers want: "make this feature"
Experienced devs want: control over memory, data flow, timing, failure modes
That is why abstractions feel magical at first and suffocating later which sparks this whole debate.
> LLMs are far more nondeterministic than previous higher level languages. They also can help you figure out things at the high level (descriptions) in a way that no previous layer could help you dealing with itself. […] What about quality and understandability? If instead of a big stack, we use a good substrate, the line count of the LLM output will be much less, and more understandable. If this is the case, we can vastly increase the quality and performance of the systems we build.
How does this even work? There is no universe I can imagine where a natural language can be universal, self descriptive, non ambiguous, and have a smaller footprint than any purpose specific language that came before it.
Whether this is doable through orchestration or through carefully guided HITL by various specialists in their fields - or maybe not at all! - I suspect will depend on which domain you're operating in.
There's minimal opportunity with lifetime annotations. I'm sure there are similarly small opportunities elsewhere, too.
The idea of replacing Rust with natural language seems insane. Maybe I'm being naive, but I can't see why or how it could possibly be useful.
Rust is simply Chinese unless you understand what it's doing. If you translate it to natural language, it's still gibberish, unless you understand what it does and why first. In which case, the syntax is nearly infinitely more expressive than natural language.
That's literally the point of the language, and it wasn't built by morons!
The issue is that if you fire off 10 agents to work autonomously for an extended period of time at least 9 of them will build the WRONG THING.
The problem is context management and decision making based on that context. LLMs will always make assumptions about what you want, and the more assumptions they make the higher the likelihood that one or more of them is wrong.
I mean, we only have them because it is strictly necessary. If we could make architectures friendly to programming directly, we would have.
In that sense, high level languages are not a marvelous thing but a burden we have to carry because of the strict requirements of low level ones. The less burdens like those we have, the better.
Well, nobody could figure out how to program them. Except the few outcasts like us who went on to suffer for the rest of our lives for it :')
With phones & LLMs this is the closest we have come to that original promise of a computer in every home and everyone being able to do anything with it, that isn't pre-dictated by corporations and their apps:
Ideally ChatGPT etc should be able to create interactive apps on the fly on iPhone etc. Imagine having a specific need and just being able to say it and get an app right away just for you on your device.
I discovered that it is not trivial to conceptualize an app to the degree of clarity required for deterministic LLM output. It's way easier to say than to actually implement yourself (that's why examples are so interesting to see).
The backwards dynamic, where you get a spec/doc based on the source code, does not work well enough.