Professional software developers don't vibe, they control

(arxiv.org)

209 points | by dpflan 1 day ago

23 comments

simonw 1 day ago
This is pretty recent - the survey they ran (99 respondents) was August 18 to September 23 2025 and the field observations (watching developers for 45 minutes, then a 30-minute interview, 13 participants) were August 1 to October 3.
The models were mostly GPT-5 and Claude Sonnet 4. The study was too early to catch the 5.x Codex or Claude 4.5 models (bar one mention of Sonnet 4.5.)
This is notable because a lot of academic papers take 6-12 months to come out, by which time the LLM space has often moved on by an entire model generation.
[-]
- utopiah 1 day ago
  > academic papers take 6-12 months to come out, by which time the LLM space has often moved on by an entire model generation.
  This is a recurring argument which I don't understand. Doesn't it simply mean that whatever conclusion they reached was valid then? The research process is about approximating a better description of a phenomenon to understand it. It's not about providing a definitive answer. Being "an entire model generation" behind would be important if fundamental problems were solved, e.g. no more hallucinations, but if the changes are only incremental then most likely the conclusions remain correct. Which fundamental change (I don't think labeling newer models as "better" is sufficient) do you believe invalidates their conclusions in this specific context?
  [-]
  - soulofmischief 1 day ago
    2025 has been a wild year for agentic coding models. Cutting-edge models in January 2025 don't hold a candle to cutting edge models in December 2025.
    Just the jump from Sonnet 3.5 to 3.7 to 4.5, and Opus 4.5 has been pretty massive in terms of holistic reasoning, deep knowledge as well as better procedural and architectural adherence.
    GPT-5 Pro convinced me to pay $200/mo for an OpenAI subscription. Regular 5.2 models, and 5.2 codex, are leagues better than GPT-4 when it comes to solving problems procedurally, using tools, and deep discussion of scientific, mathematical, philosophical and engineering problems.
    Models have increasingly longer context, especially some Google models. OpenAI has released very good image models, and great editing-focused image models in general have been released. Predictably better multimodal inference over the short term is unlocking many cool near-term possibilities.
    Additionally, we have seen some incredible open source and open weight models released this year. Some fully commercially viable without restriction. And more and more smaller TTS/STT projects are in active development, with a few notable releases this year.
    Honestly, the landscape at the end of the year is impressive. There has been great work all over the place, almost too much to keep up with. I'm very interested in the Genie models and a few others.
    For an idea:
    At the beginning of the year, I was mildly successful at getting coding models to make changes in some of my codebases, but the more esoteric problems were out of reach. Progress in general was deliberate and required a lot of manual intervention.
    By comparison, in the last week I've prototyped six applications at levels that would take me days to weeks individually, often developing multiple at the same time, monitoring agentic workflows and intervening only when necessary, relying on long preproduction phases with architectural discussions and development of documentation, requirements, SDDs... and detailed code review and refactoring processes to ensure adherence to constraints. I'm morphing from a very busy solo developer into a very busy product manager.
    [-]
    - orwin 22 hours ago
      > Just the jump from Sonnet 3.5 to 3.7 to 4.5, and Opus 4.5 has been pretty massive in terms of holistic reasoning, deep knowledge as well as better procedural and architectural adherence.
      I don't really agree. Aside from how it handled frontend code, changes in Sonnet did not truly impact my overall productivity (from Sonnet 3.7 to 4 to 4.5; I did not try 3.5). Opus 4.5/Codex 5.2 are when the changes truly happened for me (and I'm still a bit distrustful of Codex 5.2, but I use it basically to help me during PRs).
      [-]
      - soulofmischief 16 hours ago
        That's fine. Maybe you're holding it wrong, or maybe your work is too esoteric/niche/complex for newer models to be bigger productivity boosters. Some of mine certainly is, I get that. But for other stuff, these newer models are incredible productivity boosters.
        I also chat with these models for long hours about deep, complicated STEM subjects and am very impressed with the level of holistic knowledge and wisdom compared to models a year ago. And the abstract math story has gotten sooooo much better.
    - foldr 22 hours ago
      >By comparison, in the last week I've prototyped six applications at levels that would take me days to weeks individually [...]
      I don't doubt that the models have got better, but you can go back two or three years and find people saying the exact same stuff about the latest models back then.
      [-]
      - simonw 20 hours ago
        I don't think that's true of three years ago - that's taking us back into GPT-3 territory.
        And two years ago we were mostly still stuck with GPT-4, which had an 8,000-token input context limit; very challenging to get real coding work done with that.
        Easy enough to prove though, find some examples of people saying that 2-3 years ago and I shall concede the point!
        [-]
        foldr 19 hours ago
        GPT-4 was released in March 2023, so it pretty clearly comes under the heading of “two or three years” ago. It’s only three months shy of its third birthday.
        I see that 2023 LinkedIn has (deservedly) gone down your memory hole, but it is very easy to find innumerable examples of people saying this kind of thing:
        https://www.reddit.com/r/ChatGPTCoding/comments/11zu7l7/i_bu...
        [-]
        simonw 15 hours ago
        Good link, I shall concede the point!
        soulofmischief 16 hours ago
        Crazy how progress works! It just keeps getting better, and people have rightfully noticed.
  - simonw 20 hours ago
    The problem is with how people interpret these results.
    A paper comes out that says "we did a study of developers and found that AI-assistance had no impact on their productivity" (using the state of the art models available in September 2024) and a lot of people will point to that as incontestable evidence that "AI doesn't work".
- ActionHank 1 day ago
  For what it’s worth I know this is likely intended to read as the new generation of models will somehow be better than any paper will be able to gauge, but that hasn’t been my experience.
  Results are getting worse and less accurate, hell, I even had Claude drop some Chinese into a response out of the blue one day.
  [-]
  - danielbln 1 day ago
    I can absolutely not corroborate this, Opus 4.5 has been nothing but stellar.
  - mannycalavera42 1 day ago
    same here. While getting a commandline for ffmpeg, instead of giving me the option "soft-knee" it used "soft-膝" (where 膝 is the Chinese for knee). It was easy to spot and figure out but still... pretty rubbishy ¯\_(ツ)_/¯
- reactordev 1 day ago
  I knew in October the game had changed. Thanks for keeping us in the know.
  [-]
  - mikasisiki 1 day ago
    I'm not sure what you mean by “the game has changed.” If you’re referring to Opus 4.5, it’s somewhat better, but it’s far from game-changing.
    [-]
    - reactordev 22 hours ago
      You’re looking in from the outside. I’m on the inside. This next generation of models will show. It’s about to get wild.
      We now have extremely large context windows, we now have memory, we now have recall, we now can put an agent to the task for 24 hours.
- bbor 1 day ago
  I’m glad someone else noticed the time frames — turns out the lead author here has published 28 distinct preprints in the past 60 days, almost all of which are marked as being officially published already/soon.
  Certainly some scientists are just absurdly efficient and all 28 involved teams, but that’s still a lot.
  Personally speaking, this gives me second thoughts about their dedication to truly accurately measuring something as notoriously tricky as corporate SWE performance. Any number of cut corners in a novel & empirical study like this would be hard to notice from the final product, especially for casual readers…TBH, the clickbait title doesn’t help either!
  I don’t have a specific critique on why 4 months is definitely too short to do it right tho. Just vibe-reviewing, I guess ;)
  [-]
  - aaronblohowiak 1 day ago
    are they a PI with a lab? in this field, does the PI get first or last author?
    [-]
- dheera 1 day ago
  > academic papers take 6-12 months to come out
  It takes about 6 months to figure out how to get LaTeX to position figures where you want them, and then another 6 months to fight with reviewers
  [-]
  - zeristor 1 day ago
    Couldn't AI help with the LaTeX?
    Cutting it down to 6 minutes
    [-]
    - jsrozner 1 day ago
      I have found it to be pretty bad at formatting tables
- joenot443 1 day ago
  Thanks Simon - always quick on the draw.
  Off your intuition, do you think the same study with Codex 5.2 and Opus 4.5 would see even better results?
  [-]
  - simonw 1 day ago
    Depends on the participants. If they're cutting-edge LLM users then yes, I think so. If they continue to use LLMs like they would have back in the first half of 2025 I'm not sure if a difference would be noticeable.
    [-]
    - mkozlows 1 day ago
      I'm not remotely cutting edge (just switched from Cursor to Codex CLI, have no fancy tooling infrastructure, am not even vaguely considering git worktrees as a means of working), but Opus 4.5 and 5.2 Codex are both so clearly more competent than previous models that I've started just telling them to do high-level things rather than trying to break things down and give them subtasks.
      If people are really set in their ways, maybe they won't try anything beyond what old models can do, and won't notice a difference, but who's had time to get set in their ways with this stuff?
      [-]
      - christophilus 1 day ago
        I mostly agree, but today, Opus 4.5 via Claude Code did some pretty dumb stuff in my codebase: N queries where one would do, deep array comparison where a reference equality check would suffice, a very complex web of nested conditionals which a competent developer would never have written, some edge cases where the backend endpoints didn’t properly verify user permissions before overwriting data, etc.
        It’s still hit or miss. The product “worked” when I tested it as a black box, but the code had a lot of rot in it already.
        Maybe that stuff no longer matters. Maybe it does. Time will tell.
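        A minimal sketch of two of the patterns named above (N queries vs. one, deep comparison vs. a reference check). The codebase in question isn't shown in the thread, so everything here is an assumption for illustration: the `users` table, the function names, and the choice of SQLite are all hypothetical.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cy")],
)


def names_n_queries(conn, ids):
    # The "N queries" shape: one round trip to the database per id.
    return [
        conn.execute("SELECT name FROM users WHERE id = ?", (i,)).fetchone()[0]
        for i in ids
    ]


def names_one_query(conn, ids):
    # One query with an IN clause; the placeholders keep it parameterized.
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})",
        list(ids),
    ).fetchall()
    by_id = dict(rows)  # rows are (id, name) pairs
    return [by_id[i] for i in ids]


def changed_deep(old, new):
    # Deep comparison: walks both lists element by element.
    return old != new


def changed_identity(old, new):
    # Reference check: constant time, but only valid when every update
    # produces a new list object instead of mutating the old one.
    return old is not new
```

        Both query helpers return the same names for the same ids; the difference is the number of round trips. The identity check is only a safe substitute for the deep one under the stated assumption that lists are never mutated in place.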
        [-]
        remich 1 day ago
        I have had a lot of success lately when working with Opus 4.5 using both the Beads task tracking system and the array of skills under the umbrella of Bad Dave's Robot Army. I don't have a link handy, but you should be able to find it on GitHub. I use the specialized skills for different review tasks (like Architecture Review, Performance Review, Security Review, etc.) on every completed task in addition to my own manual review, and I find that that helps to keep things from getting out of hand.
        ManuelKiessling 1 day ago
        As someone who’s responsible for some very clean codebases and some codebases that grew over many years, warts and all, I always wonder if being subjected to large amounts of not-exactly-wonderful code has the same effect on an LLM that it arguably also has on human developers (myself included occasionally): that they subconsciously lower their normally high bar for quality a bit, as in „well there‘s quite some smells here, let’s go a bit with the flow and not overdo the quality“.
        mkozlows 1 day ago
        I don't think they generally one-shot the tasks; but they do them well enough that you can review the diff and make requests for changes and have it succeed in a good outcome more quickly than if you were spoon-feeding it little tasks and checking them as you go (as you used to have to do).
      - nineteen999 1 day ago
        Also not a cutting edge user, but I do run my own LLMs at home and have been spending a lot of time with Claude CLI the last few months.
        It's fine if you want Claude to design your APIs without any input, but you'll have less control and when you dig down into the weeds you'll realise it's created a mess.
        I like to take both a top-down and bottom-up approach - design the low level API with Claude fleshing out how it's supposed to work, then design the high level functionality, and then tell it to stop implementing when it hits a problem reconciling the two and the lower level API needs revision.
        At least for things I'd like to stand the test of time; if it's just a throwaway script or tool I care much less as long as it gets the job done.
    - drbojingle 1 day ago
      What's the difference between using llms now vs the first half of 2025 among the best users?
      [-]
      - simonw 1 day ago
        Coding agents and much better models. Claude Code or Codex CLI plus Claude Opus 4.5 or GPT 5.2 Codex.
        The latest models and harnesses can crunch on difficult problems for hours at a time and get to working solutions. Nothing could do that back in ~March.
        I shared some examples in this comment: https://news.ycombinator.com/item?id=46436885
        [-]
        William_BB 1 day ago
        Ok I will bite.
        Every single example you gave is in a hobby project territory. Relatively self-contained, maintainable by 3-4 devs max, within 1k-10k lines of code. I've been successfully using coding agents to create such projects for the past year and it's great, I love it.
        However, lots of us here work on codebases that are 100x, 1000x the size of these projects you and Karpathy are talking about. Years of domain specific code. From personal experience, coding agents simply don't work at that scale the same way they do for hobby projects. Over the past year or two, I did not see any significant improvement from any of the newest models.
        Building a slightly bigger hobby project is not even close to making these agents work at industrial scale.
        [-]
        rjzzleep 1 day ago
        I think that in general there is a big difference between javascript/typescript projects big or small and other projects that actually address a specific project domain. These two are not the same. The same claude code agent can create a lot of parts of a functional web project, but will struggle to provide anything functional beyond a base frame for you to build on if you were to create new SoC support in some drone firmware.
        The problem is that everyone working on those more serious projects knows that and treats LLMs accordingly, but the people that come from the web space come in with the expectation that they can replicate the success they have in their domain just as easily, when oftentimes you need to have some domain knowledge.
        I think the difference simply comes down to the sheer volume of training material, i.e. web projects on github. Most "engineers" are actually just framework consumers and within those frameworks llms work great.
        simonw 1 day ago
        Most of the stuff I'm talking about here came out in November. There hasn't been much time for professional teams to build new things with it yet, especially given the holidays!
        [-]
        qweiopqweiop 1 day ago
        For what it's worth, I'm working with it on a huge professional monorepo, and the difference was also stark.
        reactordev 1 day ago
        For what it’s worth, I have Claude coding away at the Unreal Engine codebase. That’s a pretty large C++ codebase and it’s having no trouble at all. Just a cool several million lines of C++ lovely.
        drbojingle 1 day ago
        Everything is made of smaller parts. I'd like to think we can sub divide a code base into isolated modules at least.
        [-]
        tracker1 16 hours ago
        Depends on what kinds of problems you're solving...
        I'd put it in line with monolith vs microservices... You're shifting complexity somewhere, if it's on orchestration or the codebase. In the end, the piper gets paid.
        Also, not all problems can be broken down cleanly into smaller parts.
        devin 1 day ago
        In the real world, not all problems decompose nicely. In fact, I think it may be the case that the problems we actually get paid to solve with code are often of this type.
        baq 1 day ago
        That’s right, but it also hints at a solution: split big code bases into parts that are roughly the size of a big hobby project. You’ll need to write some docs to be effective at it, which also helps agents. CICD means continuous integration continuous documentation now.
        [-]
        bccdee 1 day ago
        Splitting one big codebase into 100 microservices always seems tempting, except that big codebases already exist in modules and that doesn't stop one module's concerns from polluting the other modules' code. What you've got now is 100 different repositories that all depend on each other, get deployed separately, and can only be tested with some awful docker-compose setup. Frankly, given the impedance of hopping back and forth between repos separated by APIs, I'd expect an LLM to do far worse in a microservice ecosystem than in an equivalent monolith.
        majormajor 1 day ago
        I wonder if anyone has tried this thing before, like... micro-projects or such... ;)
        rjzzleep 1 day ago
        It's not the size that's the issue, it's the domain that is. It's tempting to say that adding drivers to Linux is hard because Linux is big, but that's not the issue.
        oooyay 1 day ago
        I worked at Slack earlier this year. Slack adopted Cursor as an option in December of 2024 if memory serves correctly. I had just had a project cut due to a lot of unfortunate reasons so I was working on it with one other engineer. It was a rewrite of a massive and old Python code base that ran Slack's internal service catalog. The only reason I was able to finish rewrites of the backend, frontend, and build an SLO sub-system is because of coding agents. Up until December I'd been doing that entire rewrite through sixteen hour days and just pure sweat equity.
        Again, that codebase is millions of lines of Python code and frankly the agents weren't as good then as they are now. I carefully used globbing rules in Cursor to navigate coding and testing standards. I had a rule that functioned as how people use agents.md now, which was put on every prompt. That honestly got me a lot more mileage than you'd think. A lot of the outcomes of these tools are how you use them and how good your developer experience is. If professional software engineers have to think about how to navigate and iterate on different parts of your code, then an LLM will find it doubly difficult.
        epolanski 1 day ago
        Cool, but most developers do mundane stuff like glueing APIs and implementing business logic, which require oversight and review.
        Those crunching hard problems will still review what's produced in search of issues.
        [-]
        generic92034 1 day ago
        What is (in general) mundane about business logic? This can be highly complex, with deep process integration all over your modules.
        [-]
        epolanski 14 hours ago
        Which is why it requires detailed oversight.
        mkozlows 1 day ago
        I was going back and looking at timelines, and was shocked to realize that Claude Code and Cursor's default-to-agentic-mode changes both came out in late February. Essentially the entire history of "mainstream" agentic coding is ten months old.
        (This helps me understand better the people who are confused/annoyed/dismissive about it, because I remember how dismissive people were about Node, about Docker, about Postgres, about Linux when those things were new too. So many arguments where people would passionately talk about how all those things were irredeemably stupid and only suitable for toy/hobby projects.)
        [-]
        HarHarVeryFunny 1 day ago
        The entire history of RL-trained "reasoning models" from o1 to DeepSeek_R1 is basically just a year old!
        drbojingle 1 day ago
        Are there techniques though? Tech pairing? Something we know now that we didn't then? Or just better models?
        [-]
        simonw 1 day ago
        Lots of technique stuff. A common observation among LLM nerds is that if the models stopped being improved and froze in time for a year we could still spend all twelve months discovering new capabilities and use-cases for the models we already have.
        [-]
        drbojingle 21 hours ago
        Any specifics you'd recommend?
runtimepanic 1 day ago
The title is doing a lot of work here. What resonated with me is the shift from “writing code” to “steering systems” rather than the hype framing. Senior devs already spend more time constraining, reviewing, and shaping outcomes than typing syntax. AI just makes that explicit. The real skill gap isn’t prompt cleverness, it’s knowing when the agent is confidently wrong and how to fence it in with tests, architecture, and invariants. That part doesn’t scale magically.
[-]
- asmor 1 day ago
  Is anyone else getting more mentally exhausted by this? I get more done, but I also miss the relaxing code typing in the middle of the process.
  [-]
  - agumonkey 1 day ago
    I think there are two groups of people emerging. deep / fast / craft-and-decomposition-loving vs black box / outcome-only.
    I've seen people unable to work at average speed on small features suddenly reach above average output through an LLM CLI and I could sense the pride in them. Which is at odds with my experience of work.. I love to dig down, know a lot, model and find abstractions on my own. There an LLM will 1) not understand how my brain works 2) produce something workable but that requires me to stretch mentally.. and most of the time I leave numb. In the last month I've seen many people expressing similar views.
    ps: thanks everybody for the answers, interesting to read your pov
    [-]
    - remich 1 day ago
      I get what you're saying, but I would say that this does not match my own experience. For me, prior to the agentic coding era, the problem was always that I had way more ideas for features, tools, or projects than I had the capacity to build when I had to confront the work of building everything by hand, also dealing with the inevitable difficulties in procrastination and getting started.
      I am a very above-average engineer when it comes to speed at completing work well, whether that's typing speed or comprehension speed, and still these tools have felt like giving me a jetpack for my mind. I can get things done in weeks that would have taken me months before, and that opens up space to consider new areas that I wouldn't have even bothered exploring before because I would not have had the time to execute on them well.
    - ronsor 1 day ago
      The sibling comments (from remich and sanufar) match my experience.
      1. I do love getting into the details of code, but I don't mind having an LLM handle boilerplate.
      2. There isn't a binary between having an LLM generate all the code and writing it all myself.
      3. I still do most of the design work because LLMs often make questionable design decisions.
      4. Sometimes I simply want a program to solve a problem (outcome-focused) over a project to work on (craft-focused). Sometimes I need a small program in order to focus on the larger project, and being able to delegate that work has made it more enjoyable.
      [-]
      - zahlman 1 day ago
        > I do love getting into the details of code, but I don't mind having an LLM handle boilerplate.
        My usual thought is that boilerplate tells me, by existing, where the system is most flawed.
        I do like the idea of having a tool that quickly patches the problem while also forcing me to think about its presence.
        > There isn't a binary between having an LLM generate all the code and writing it all myself. I still do most of the design work because LLMs often make questionable design decisions.
        One workflow that makes sense to me is to have the LLM commit on a branch; fix simple issues instead of trying to make it work (with all the worry of context poisoning); refactor on the same branch; merge; and then repeat for the next feature — starting more or less from scratch except for the agent config (CLAUDE.md etc.). Does that sound about right? Maybe you do something less formal?
        > Sometimes I simply want a program to solve a problem (outcome-focused) over a project to work on (craft-focused). Sometimes I need a small program in order to focus on the larger project, and being able to delegate that work has made it more enjoyable.
        Yeah, that sounds about right.
    - sanufar 1 day ago
      I think for me, the difference really comes down to how much ownership I want to take in regards to the project. If it’s something like a custom kernel that I’m building, the real fun is in reading through docs, learning about systems, and trying to craft the perfect abstractions; but if it’s wiring up a simple pipeline that sends me a text whenever my bus arrives, I’m happy to let an LLM crank that out for me.
      I’ve realized that a lot of my coding is on this personal satisfaction vs utility matrix and llms let me focus a lot more energy onto high satisfaction projects
    - zahlman 1 day ago
      > deep / fast / craft-and-decomposition-loving vs black box / outcome-only
      As a (self-reported) craft-and-decomposition lover, I wouldn't call the process "fast".
      Certainly it's much faster than if I were trying to take the same approach without the same skills; and certainly I could slow it down with over-engineering. (And "deep" absolutely fits.) But the people I've known that I'd characterize as strongly "outcome-only", were certainly capable of sustaining some pretty high delta-LoC per day.
  - jghn 1 day ago
    That's kind of the point here. Once a dev reached a certain level, they often weren't doing much "relaxing code typing" anyways before the AI movement. I don't find it to be much different than being a tech lead, architect, or similar role.
    [-]
    - remich 1 day ago
      As a former tech lead and now staff engineer, I definitely agree with this. I read a blog post a couple of months ago that theorized that the people that would adopt these technologies the best were people in the exact roles that you describe. I think because we were already used to having to rely on other people to execute on our plans and ideas because they were simply too big to accomplish by ourselves. Now that we have agents to do these things, it's not really all that different - although it is a different management style working around their limitations.
      [-]
      - jghn 1 day ago
        Exactly. I've been a tech lead, have led large, cross-org projects, been an engineering manager, and similar roles. For years, when mentoring upcoming developers, what I always found to be the most challenging transition was the inflection point from "I deliver most of my value by coding" to "I deliver most of my value by empowering other people to deliver". I think that's what we're seeing here. People who have made this transition are already used to working this way. Both versions have their own quirks and challenges, but at a high level it abstracts.
        [-]
        9rx 1 day ago
        LLMs are just a programming language/compiler/REPL, though, so there is nothing out of the ordinary for developers. Except what is different is the painfully slow compile time to code ratio. You write code for a few minutes... and then wait. Then spend a few more minutes writing code... and then wait. That is where the exhaustion comes from.
        At least in the olden days[1] you could write code for days before compiling, which reduced the pain. Long compilation times have always been awful, but they are less frustrating when you can defer them until the next blue moon. LLMs don't (yet) seem to be able to handle that. If you feed them more than small amounts of code at a time they quickly go off the rails.
        With that said, while you could write large amounts of code and defer compiling until the next blue moon, it is a skill to be able to do that. Even in C++, juniors seem to like to write a few lines of code and then turn to compiling the results to make sure they are on the right track. I expect that is the group of people who feel most at home with LLMs. Spending a few minutes writing code and then waiting on compilation isn't abnormal for them.
        But presumably the tooling will improve with time.
        [1] https://xkcd.com/303/
        [-]
        recursive 1 day ago
        Programming languages are structured and have specifications. It is possible to know what code will do just by reading it.
        [-]
        9rx 1 day ago
        Well designed ones do, at least. LLMs, in their infancy, still bring a lot of undefined behaviour, which is why you end up stuck in the code for a few minutes -> compile -> wait -> repeat cycle. But that is not a desirable property and won't remain acceptable as the technology matures.
        [-]
        recursive 1 day ago
        I don't see any way this is changing, acceptable or not.
        [-]
        9rx 1 day ago
        It is quite possible the tools will never improve beyond where they sit today, sure, but then usage will naturally drift away from that fatiguing use (not all use, obviously). The constant compile/wait cycle is exhausting exactly because it is not productive.
        Businesses are currently willing to accept that lack of productivity as an investment into figuring out how to tame the tools. There is a lot of hope that all the problems can be solved if we keep trying to solve them. And, in fairness, we have gotten a lot closer than we were just a year or so ago towards that end, so the optimism currently remains strong. However, that cannot go on forever. At some point the investment has to prove itself, else the plug will be pulled.
        And yes, it may ultimately be a dead end. Absolutely. It wouldn't be the first failure in software development.
  - tikimcfee 1 day ago
    Ya know, I have to admit feeling something like this. Normally, the amount of stuff I put together in a work day offers a sense of completion or even a bit of a dopamine bump because of a "job well done". With this recent work I've been doing, it's instead felt like I've been spending a multiplier more energy communicating intent instead of doing the work myself; that communication seems to be making me more tired than the work itself. Similar?
    [-]
    - whynotminot 1 day ago
      It feels like we all signed up to be ICs, but now we’re middle managers and our reports are bots.
      [-]
      - MikeTheGreat 1 day ago
        I forget where I saw this (a Medium post, somewhere) but someone summed this up as "I didn't sign up for this just to be a tech priest for the machine god".
        [-]
        whstl 1 day ago
        Someone commented yesterday that managers and other higher-ups are "already ok with non-deterministic outputs", because that's what engineers give them.
        As a manager/tech-lead, I've kind of been a tech priest for some time.
        [-]
        danielbln 20 hours ago
        Which is why it's so funny to hear seasoned engineers lament the probabilistic nature of AI systems, and how you have to be hand setting code to really think about the problem domain.
        They seem to all be ICs that forget that there are abstraction layers above them where all of that happens (and more).
      - senshan 1 day ago
        > and our reports are bots.
        With no gossip, rivalry or backstabbing. Super polite and patient, which is very inspiring.
        We are also brutally churning them by "laying off" the previously latest model once the new latest is available.
    - perfmode 1 day ago
      You’re possibly not entering into the flow state anymore.
      Flow is effortless, and it is rejuvenating.
      I believe:
      While communication can be satisfying, it’s not as rejuvenating as resting in our own Being and simply allowing the action to unfold without mental contraction.
      Flow states.
      When the right level of challenge and capability align and you become intimate with the problem. The boundaries of me and the problem dissolve and creativity springs forth. Emerging satisfied. Nourished.
      [-]
      - danielbln 20 hours ago
        Flow state can happen at various levels of abstraction, not just when hand writing code in a gen 3 language.
    - johnsmith1840 1 day ago
      This is why I think LLMs will make us all a LOT smarter. Raw code made it so we stopped heavily thinking in between but now it's just 100% the most intense thought processes all day long.
      [-]
      - falkensmaize 1 day ago
        It seems pretty obvious that the opposite is true. I know I’ve experienced some serious skill atrophy that I’m now having to actively resist. There’s a lot lost by no longer having to interact with the raw materials of your craft.
        Thinking is a skill that is reinforced by reading, designing and writing code. When you outsource your thinking to an LLM your ability to think doesn’t magically improve…it degrades.
  - simonw 1 day ago
    Yes, absolutely, I can be mentally wiped out by lunch.
  - epolanski 1 day ago
    Yes it's taxing and mentally draining, reading code and connecting dots is always harder than writing it.
    And if you let the AI too loose, as when you try to vibe code an entirely new program, I end up in the situation where in 1 day I have a good prototype and then I can spend easily 5 times as much sorting the many issues and refactoring in order to have it scale to the next features.
  - bccdee 1 day ago
    So far what I've been doing is, I look for the parts that seem like they'd be rewarding to code and I do them myself with no input from the machine whatsoever. It's hard to really understand a codebase without spending time with the code, and when you're using a model, I think there's a risk of things changing more quickly than you can internalize them. Also, I worry I'll get too comfortable bossing chatbots around & I'll become reluctant to get my hands dirty and produce code directly. People talk about ruining their attention spans by spending all their time on TikTok until they can no longer read novels; I think it'd be a real mistake to let that happen to my professional skill set.
  - SJMG 1 day ago
    I think it's the serial waiting game and inevitable context switching while you wait.
    Long iteration cycles are taxing
  - bugglebeetle 1 day ago
    Nah, I don’t miss at all typing all the tests, CLIs, and APIs I’ve created hundreds of times before. I dunno if it’s because I do ML stuff, but it’s almost all “think a lot about something, do some math, and then type thousands of lines of the same stuff around the interesting work.”
  - mupuff1234 1 day ago
    For me it's the opposite, I'm wasting less energy over debugging silly bugs and fighting/figuring out some annoying config.
    But it does feel less fulfilling I suppose.
  - teaearlgraycold 1 day ago
    I like to alternate focusing on AI wrangling and writing code the old fashioned way.
- AlotOfReading 1 day ago
  It's difficult to steer complex systems correctly, because no one has a complete picture of the end goal at the outset. That's why waterfall fails. Writing code agentically means you have to go out of your way to think deeply about what you're building, because it won't be forced on you by the act of writing code. If your requirements are complex, they might actually be a hindrance because you're going to have to learn those lessons from failed iterations instead of avoiding them preemptively.
- codeformoney 1 day ago
  The stereotype that writing code is for junior developers needs to die. Some devs are hired with lofty titles specifically for their programming aptitude and esoteric systems knowledge, not to play implementation telephone with inexperienced devs.
  [-]
  - remich 1 day ago
    I don't think that anyone actually believes that writing code is only for junior developers. That seems to be a significant exaggeration at the very least. However, it is definitely true that most organizations of this size that are hiring people into technical lead, staff engineer, or principal engineer roles are hiring those people not only for their individual expertise, or ability to apply that expertise themselves, but also for their ability to use that expertise as a force multiplier to make other less experienced people better at the craft.
    [-]
    - codeformoney 1 day ago
      In my world there are Hard Problems that need to be solved for bu$ine$$ rea$on$, no being a "force multiplier" required (whatever that really means).
    - inkyoto 1 day ago
      > I don't think that anyone actually believes that writing code is only for junior developers.
      That is, unquestionably, how it ought to be. However, the mainstream – regrettably – has devolved into a well-worn and intellectually stagnant trajectory, wherein senior developers are not merely encouraged but expected to abandon the coding altogether, ascending instead into roles such as engineering managers (no offence – good engineering managers are important, it is the quality that has been diluted across the board), platform overseers (a new term for stage gate keepers), or so-called solution architects (the ones who are imbued with compliance, governance and do not venture out past that).
      In this model, neither role is expected – and in some lamentable cases, is explicitly forbidden[0] – to engage directly with code. The result is a sterile detachment from the very systems they are charged with overseeing.
      Worse still, the industry actively incentivises ill-considered career leaps – for instance, elevating a developer with limited engineering depth into the position of a solution designer or architect. The outcome is as predictable as it is corrosive: individuals who can neither design nor architect.
      The number of organisations in which expert-level coding proficiency remains the norm at senior or very senior levels has dwindled substantially over the past couple of decades or so – job ads explicitly call out the management experience, knowledge of vacuous or limited usefulness architectural frameworks (TOGAF and alike). There do remain rare islands in an ever-expanding ocean of managerial abstraction where architects who write code, not incessantly but when a need be, are still recognised as invaluable. Yet their presence is scarce.
      The lamentable state of affairs has led to a piquant situation on the job market. In recent years, headhunters have started complaining about being unable to find an actually highly proficient, experienced, and, most importantly, technical architect. One's loss is another one's gain, or at least an opportunity, of course.
      [0] Speaking from firsthand experience of observing a solution architect to have quit their job to run a bakery (yes) due to the head of architecture they were reporting to explicitly demanding the architect quit coding. The architect did quit, albeit in a different way.
- llmslave2 1 day ago
  Does using an LLM to craft Hackernews comments count as "steering systems"?
  [-]
  - coip 1 day ago
    You're totally right! It's not steering systems -- it's cooking, apparently
- Madmallard 1 day ago
  "it’s knowing when the agent is confidently wrong and how to fence it in with tests, architecture, and invariants."
  Strongly suspect this is simply less efficient than doing it yourself if you have enough expertise.
danavar 1 day ago
So much of my professional SWE job isn't even programming - I feel like this is a detail missed by so many. Generally people just stereotype SWE as a programmer, but being an engineer (in any discipline) is so much more than that. You solve problems. AI will speed up the programming work-streams, but there is so much more to our jobs than that.
[-]
- ciaranmca 14 hours ago
  ^This 100%. Junior SWE here. Agentic coding has kinda felt like a promotion for me. I code less by hand and spend more time on the actual engineering side of things. There’s hype in both directions though. I don’t think AI is replacing me anytime soon (fingers crossed), but it’s already way more useful than the skeptics give it credit for. Like most things the truth’s somewhere in the middle.
- whstl 19 hours ago
  Agreed.
  Most of the work brought to me gets done before I even think about sitting down to type.
  And it's interesting to see the divide here between "pure coder" and "coder + more". A lot of people seem to be in the job to just do what the PM, designer and business people ask. A lot of work is pushing back against some of those requests. In conversations here in HN about "essential complexity" I even see commenters arguing that the spec brought to you is entirely essential. It's not.
- danielbln 20 hours ago
  There is also so much more you can automate and use AI agents for than "programming". It's the world's best rubber duck, for one. It also can dig through code bases and compile information on data flows, data models and so on. Hell, it can automate effectively any task you do on the terminal.
lesuorac 1 day ago
> Most Recent Task for Survey
> Number of Survey Respondents
> Building apps 53
> Testing 1
I think this sums up everybody's complaints about AI generated code. Don't ask me to be the one to review work you didn't even check.
[-]
- rco8786 1 day ago
  Yea. Nobody wants to be a full-time code reviewer.
  [-]
  - jaggederest 1 day ago
    Hi it's me, the guy who wants to be a full-time code reviewer.
    [-]
    - sarchertech 1 day ago
      If you really did that full time and never wrote code, you’d be a terrible reviewer.
      [-]
      - littlestymaar 1 day ago
        This is fine for us who've been building code by hand for many years before the advent of LLMs but it's definitely going to be a problem going forward.
        [-]
        mannycalavera42 1 day ago
        strong +1 here :-)
    - nemo 1 day ago
      Be careful what you wish for.
- throw-12-16 1 day ago
  I fired someone over this a few months ago.
AYBABTME 1 day ago
It feels like we're doing another lift to a higher level of abstraction. Whereas we had "automatic programming" and "high level programming languages" free us from assembly, where higher level abstractions could be represented without the author having to know or care about the assembly (and it took decades for the switch to happen), we now once again get pulled up another layer.
We're in the midst of another abstraction level becoming the working layer - and that's not a small layer jump but a jump to a completely different plane. And I think once again, we'll benefit from getting tools that help us specify the high level concepts we intend, and ways to enforce that the generated code is correct - not necessarily fast or efficient but at least correct - same as compilers do. And this lift is happening on a much more accelerated timeline.
The problem of ensuring correctness of the generated code across all the layers we're now skipping is going to be the crux of how we manage to leverage LLM/agentic coding.
Maybe Cursor is TurboPascal.
websiteapi 1 day ago
we've never seen a profession drive themselves so aggressively to irrelevance. software engineering will always exist, but it's amazing the pace at which pressure against the profession is rising. 2026 will be a very happy new year indeed for those paying the salaries. :)
[-]
- simonw 1 day ago
  We've been giving our work away to each other for free as open source to help improve each other's productivity for 30+ years now and that's only made our profession more valuable.
  [-]
  - websiteapi 1 day ago
    I see little proof open source has resulted in higher wages and not the fact that everything is being digitized and the subsequent demand for such people to assist in such.
    [-]
    - simonw 1 day ago
      I'm not sure how I can prove it, but ~25 years ago building software without open source sucked. You had to build everything from scratch! It took months to get even the most basic things up and running.
      I think open source is the single most important productivity boost to our industry that's ever existed. Automated testing is a close second.
      Google, Facebook, many others would not have existed without open source to build on.
      And those giants and others like them that were enabled by open source employed a TON of people, at competitive rates that greatly increased our salaries.
      [-]
      - christophilus 1 day ago
        25 years ago, I was slinging apps together super fast using VB6. It was awesome. It was a level of productivity few modern stacks can approach.
        [-]
        ipdashc 1 day ago
        I'm too young to have used VB in the workforce, but I did use it in school, and honestly off that alone I'm inclined to agree.
        I've seen VB namedropped frequently, but I feel like I've yet to see a proper discussion of why it seems like nothing can match its productivity and ease of use for simple desktop apps. Like, what even is the modern approach for a simple GUI program? Is Electron really the best we can do?
        MS Access is another retro classic of sorts that, despite having a lot of flaws, it seems like nothing has risen to fill its niche other than SaaS webapps like airtable.
        [-]
        simonw 1 day ago
        You can add Macromedia Flash to that list - nothing has really replaced it, and as a result the world no longer has an approachable tool for building interactive animations.
        whateverboat 1 day ago
        https://www.youtube.com/watch?v=hnaGZHe8wws
        This is a nice video on why Electron is the best you might be able to do.
        [-]
        ipdashc 1 day ago
        Thanks for the link - this is a cool video. Though it seems like it's mostly focusing on the performance/"bloat" side of things. I do agree that's an annoying aspect of Electron, and I do think his justifications for it are totally fair, but I was more so thinking about ease of use, especially for nontechnical people / beginners.
        My memory of it is very fuzzy, but I recall VB being literally drag-and-drop, and yet still being able to make... well, acceptable UIs. I was able to figure it out just fine in middle school.
        In comparison, here's Electron's getting started page: https://www.electronjs.org/docs/latest/ The "quick start" is two different languages across three different files. The amount of technologies and buzzwords flying around is crazy, HTML, JS, CSS, Electron, Node, DOM, Chromium, random `charset` and `http-equiv` boilerplate... I have to imagine it'd be rather demoralizing as a beginner. I think there's a large group of "nontechnical" users out there (usually derided by us tech bros as "Excel programmers" or such) that can perfectly understand the actual logic of programming, but are put off by the amount of buzzwords and moving parts involved, and I don't blame them at all.
        (And sure, don't want to go in too hard on the nostalgia. 2000s software was full of buzzwords and insane syntax too, we've improved a lot. But it had some upsides.)
        It just feels like we lost the plot at some point when we're all using GUI-based computers, but there's no simple, singular, default path to making a desktop GUI app anymore on... any, I think, of the popular desktop OSes?
        [-]
        whateverboat 11 hours ago
        You are totally right. Going even way back, in days of TurboPascal, you could include graphics.h and get a very cool snake game going within half an hour. Today, doing anything like that is a week of advanced stuff. Someone wanted to recreate that experience today and came up with this: https://github.com/dascandy/pixel
        But you can see how much boilerplate they needed to write to make this work.
        https://github.com/dascandy/pixel/blob/master/examples/simpl...
        See the user example and then look at src for the boilerplate.
        In old days, you could easily write a full operating system from scratch on 8051 while using PS/2 peripherals. Today, all peripherals are USB and USB 2.0 standard is 500 pages long.
        I also agree that we have left behind the idea of teaching probably or at least removed it from the mainstream.
        cheema33 1 day ago
        > 25 years ago, I was slinging apps together super fast using VB6. It was awesome. It was a level of productivity few modern stacks can approach.
        If that were true, wouldn't we all be using VB today?
        [-]
        majormajor 1 day ago
        Ever try to maintain a bunch of specialized one-off thrown-together things like that? I inherited a bunch of MS Access apps once ...
        everything old is new again
        cuu508 1 day ago
        Excel (and spreadsheets in general) is not quite the same as VB but is similar in that it solves practical problems and normal people can work with it.
        zqna 1 day ago
        Agentic coding is just another rhyme of 25 y/o frenzy of "let's outsource everything to India." The new generation thinks this time is really special with us. Let's check again in 25 years
        xpe 1 day ago
        How are you measuring productivity?
        What one can make with VB6 (final release in 1998) is very far from what one can make with modern stacks. (My efficiency at building LEGO structures is unbelievable! I put the real civil engineers to shame.)
        Perhaps you mean that you can go from idea to working (in the world and expectations of 1998) very quickly. If so, that probably felt awesome. But we live in 2025. Would you reach for VB6 now? How much credit does VB6 deserve? Also think about how 1998 was a simpler time, with lower expectations in many ways.
        Will I grant advantages to certain aspects of VB6? Sure. Could some lessons be applicable today? Probably. But just like historians say, don't make the mistake of ignoring context when you compare things from different eras.
      - throw1235435 1 day ago
        Indeed it did; I remember those times. All else being equal I still think SWE salaries on average would have been higher if we kept it like that given basic economics - there would have been a lot less people capable of doing it but the high ROI automation opportunities would have still been there. The fact that "it sucked" usually creates more scarcity on the supply side; which all being equal means higher wages and in our capitalist society - status. Other professions that are older as to the parent comment already know this and don't see SWE as very "street smart" disrupting themselves. I've seen articles recently like "at least we aren't in coding" from law, accounting, etc as an anecdote to this.
        With AI at least locally I'm seeing the opposite now - less hiring, less wage pressure and in social circles a lot less status when I mention I'm a SWE (almost sympathy for my lot vs respect only 5 years ago). While I don't care for the status aspect, although I do care for my ability to earn money, some do.
        At least locally inflation adjusted in my city SWE wages bought more and were higher in general compared to others in the 90's-2000's than onwards (ex big tech). Partly because this difficulty and low level knowledge meant only very skilled people could participate.
        [-]
        ipdashc 1 day ago
        > ex big tech
        I mean, this seems like a pretty big thing to leave out, no? That's where all the crazy high salaries were!
        Also, there are still legacy places that more or less build software like it's 1999. I get the impression that embedded, automotive, and such still rely a lot on proprietary tools, finicky manual processes, low level languages (obviously), etc. But those are notorious for being annoying and not very well paid.
        [-]
        throw1235435 1 day ago
        I'm talking about what I perceive to be the median salary/conditions with big tech being only a part of that. My point is more that I remember back in that period good salaries could be had outside big tech too even in the boring standard companies that you state. I remember banks, insurance, etc paying very well for example compared to today for an SWE/tech worker - the good opportunities seemed more distributed. For example I've seen contract rates for some of the people we hire haven't really changed for 10 years for developers. Now at best they are on par with other professional white collar workers; and the competition seems fiercer (e.g. 5 interviews for a similar salary with leetcode games rather than experience-based interviews).
        Making software easier and more abstract has allowed less technical people into the profession, allowed easier outsourcing, meant more competition/interview prep to filter out people (even if the skills are not used in the job at all), more material for AI to train on, etc. To the parent comment's point I don't think it has boosted salaries and/or conditions on average for the SWE - in the long run (10 years +) it could be argued that economically the opposite has occurred.
        luckylion 1 day ago
        Monopolizing the work doesn't work unless you have the power to suppress anyone else joining the competition, i.e. "certified developers only".
        Otherwise people would have realized they can charge 3x as much by being 5x as productive with better tools while you're writing your code in notepad for maximum ROI, and you would have either adjusted or gone out of business.
        Increased productivity isn't a choice, it's a result of competition. And that's a good thing overall, even if it sucks for some developers who now have to actually work for the first time in decades. But it's good for society at large, because more things can be done.
        [-]
        throw1235435 1 day ago
        Sure - I agree with that, and I agree it's good for society but as you state probably not as good for the SWE who has to work harder for the same which was my point and I think you agree. Other professions have done what you have stated (i.e. certification) and seen higher wages than otherwise which also proves my point. They see this as the "street smart" thing to do, and generally society respects them for it putting their profession on a higher pedestal as a result. People respect people who take care of themselves first generally I find as well. Personally I think there should be a balance between the two (i.e. a fair go for all parties; a fair day's work with some job security over a standard career lifetime but not extortionary).
        Also your notion of "better tools" may not have happened, or happened more slowly without open source, AI, etc which would have meant higher salaries for longer most probably. That's where I disagree with the parent poster's claim of higher salaries - AI seems to be a great recent example of "better tools" disrupting the premium SWEs enjoy rather than improving their salaries. Whether that's fair or not is a different debate.
        I was just doubting the notion of the parent comment that "open source software" and "automated testing" create higher salaries. Usually efficiency economically (some exceptional cases) creates lower salaries for the people who are made more efficient all else being equal - and the value shifts from them to either consumers or employers.
        [-]
        luckylion 22 hours ago
        > Other professions have done what you have stated (i.e. certification) and seen higher wages than otherwise which also proves my point.
        I'd generally agree with that if it regards to safety (e.g. industrial control systems), but we manage that by certifying the manufacturer, not the individual developer. But otherwise I think it's harmful to society, even if beneficial to the individuals - but there's a lot of things falling in that bucket, and it's usually not the things we strive for at a societal level.
        In my experience, getting better and faster has always translated into being paid more. I don't know that there's a direct relationship to specific tools, but I'm pretty sure that the mainstreaming of software development has caused the huge inflation of total comp that you see in many companies. If it was slow and there's only this handful of people that can do it, but they're not really adding a huge amount of value, you wouldn't be seeing that kind of multiplier vs the average job.
        [-]
        throw1235435 2 hours ago
        > But otherwise I think it's harmful to society, even if beneficial to the individuals
        I disagree a little in that stability/predictability to people also adds some benefit to society - constant disruption/change for the sake of efficiency I believe at extreme levels would be bad for mental health at the very least and probably cause some level of outrage and dysfunction. I know as an SWE tbh I'm feeling a bit of it - can't imagine if it was everyone.
        I personally think there is a tradeoff; people on average have limits to adaptability in their lifetimes and so it needs to be worth it for people to invest and enter in a given profession (some level of economic profit that makes their limited time worth spending in it). It shouldn't be excessive though - it should be where both client and producer get fair/equal value for the time/effort they both need to put in.
      - websiteapi 1 day ago
        even if that's true it's clear enough AI will reduce the demand for swe
        [-]
        simonw 1 day ago
        I don't think that's certain. I'm hoping for a Jevons paradox situation where AI drives down the cost of producing software to the point that companies that previously weren't in the market for custom software start hiring software engineers. I think we could see demand go up.
  - aussieguy1234 1 day ago
    This makes sense. Imagine PHP or NodeJS without a framework, or front end development without React. Your projects would take much longer to build. The time saved with the open source frameworks and libraries is more than what an AI agent can save you.
  - fshacf 1 day ago
    [flagged]
- cheema33 1 day ago
  > we've never seen a profession drive themselves so aggressively to irrelevance.
  Should we be trying to put the genie back in the bottle? If not, what exactly are you suggesting?
  Even if we all agreed to stop using AI tools today, what about the rest of world? Will everybody agree to stop using it? Do you think that is even a remote possibility?
  [-]
  - dinkumthinkum 1 day ago
    Does the rest of the world want to make money in a way not involving digging ditches? I feel like people from developing countries that spend 18 hours a day studying, giving their entire childhood to some standardized test, may not want to be rewarded with no job prospects. Maybe that’s a crazy position.
- throw-12-16 1 day ago
  Software Engineers will still exist.
  Software Devs not so much.
  There is a huge difference between the two and they are not interchangeable.
  [-]
  - wiseowise 1 day ago
    Good luck convincing new overlords.
    Your take is this meme https://knowyourmeme.com/memes/dig-the-fucking-hole.
    [-]
    - throw-12-16 22 hours ago
      sorry i don't speak meme
- mkoubaa 1 day ago
  Don't care have too much to do must automate away my today responsibilities so I can do more tomorrow trvst the plqn
- zwnow 1 day ago
  Also it really baffles me how many are actually in on the hype train. It's a lot more than the crypto bros back in the day. Good thing AI still can't reason and innovate stuff. Also leaking credentials is a felony in my country so I also won't ever attach it to my codebases.
  [-]
  - aspenmartin 1 day ago
    I think the issue is folks talk past each other. People who find coding agents useful or enjoyable are labeled “on the hype train” and folks who find coding agents don’t work for them or their workflow are considered luddites. There are an incredible number of contradicting claims and predictions out there as well, and I believe what we see is folks projecting their reaction to some amalgamation of them onto others. I see a lot of “they” language, and a lot of viral articles about business leadership “shoving AI down our throats” and it becomes a divisive issue like the American political scene with really no one having a real conversation.
    [-]
    - llmslave2 1 day ago
      I think the reason for the varying claims and predictions is because developers have wildly different standards for what constitutes working code. For the developers with a lower threshold, AI is like crack to them because gen ai's output is similar to what they would produce, and it really is a 10x speedup. For others, especially those who have to fix and maintain that code, it's more like a 10x slowdown.
      Hence why you have in the same thread, some developer who claims that Claude writes 99% of their code and another developer who finds it totally useless. And of course others who are somewhere in the middle.
      [-]
      - remich 1 day ago
        Have you considered that it's a bit dismissive to assume that developers who find use out of AI tools necessarily approve of worse code than you do, or have lower standards?
        It's fine to be a skeptic. Or to have tried out these tools and found that they do not work well for your particular use case at this moment in time. But you shouldn't assume that people who do get value out of them are not as good at the job as you are, or are dumber than you are, or slower than you are. That's just not a good practice and is also rude.
        [-]
        llmslave2 1 day ago
        I never said anything about being worse, dumber, and definitely not slower. And keep in mind worse is subjective - if something doesn't require edge case handling or correctness, bugs can be tolerated etc, then something with those properties isn't worse is it?
        I'm just saying that since there is such a wide range of experiences with the same tools, it's probably likely that developers vary on their evaluations of the output.
        [-]
        remich 1 day ago
        Okay, I certainly agree with you that different use cases can dictate different outcomes when using AI tooling. I would just encourage everyone who thinks similar to you to be cautious about assuming that someone who experiences a different result with these tools is less skilled or dealing with a less difficult use case - like one that has no edge cases or has greater tolerance for bugs. It's possible that this is the case, but it is just as possible that they have found a way to work with these tools that produces excellent output.
        [-]
        llmslave2 1 day ago
        Yeah I agree, it doesn't really have to do with skill or different use cases, it's just what your threshold is for "working" or "good".
      - throw1235435 1 day ago
        There's also the effect of different models. Until the most recent models, especially for concise algorithms, I felt it was still easier to sometimes do it myself (i.e. a good algo can be concise/more concise than a lossy prompt) and leave the "expansion/repetitive" boilerplate code to the LLM. At least for me the latest models do feel like a "step change" in that the problems can be bigger and/or require less supervision on each problem depending on the tradeoff you want.
    - mhitza 1 day ago
      Hard to have a conversation when often the critics of LLM output receive replies like "What, you used last week's model?! No, no, no, this one is a generational leap"
      Too many people are invested into AI's success to have a balanced conversation. Things will return to normal after a market shakedown of a few larger AI companies.
      [-]
      - aspenmartin 20 hours ago
        On HN I think you overestimate the number of optimists that are optimists because they have some vested interest. Everyone everywhere arguably has a vested interest. I would also argue all of the folks on HN that are hostile and dismissive of coding agents also have a vested interest (just for the sake of contrasting your argument). If coding agents were really crappy I wouldn’t be using them just like I didn’t use them until end of 2025.
        What conversation is hard to have? If you mean trying to convince people coding agents can or cannot do a specific thing then that may never go away. If you take an overall theme or capability, in some cases it will “just work” and in other cases it needs some serious steering or scaffolding, and in other cases it will just waste as much time as you will let it. It’s an imperfect tool and it may always be, and two people insisting it can do something and it cannot do that same thing may both be right.
        What is troubling to me is the attitude of folks that are heavily hostile towards these models and the people that use them. People routinely conflate market promises and actual delivered tools and capabilities and lump people who enjoy and get lots of mileage out of these tools into what appears to be a big strawman camp of fawning fans who don’t understand or appreciate Real Software Engineering; people who would write bad code anyway and not know. It’s quite insulting but also wrong. Not saying you are part of this camp! But as one lonely optimist in a sea of negativity that’s certainly the perspective I’ve developed from the “conversations” I’ve seen on HN
    - zwnow 1 day ago
      It's all a hype train though. People still believe in the AI gonna bring utopia bullshit while the current infra is being built on debt. The only reason it still exists is that all these AI companies believe in some kind of revenue outside of subscriptions. So it's all about:
      Owning the infrastructure and enshittifying (ads) once enough products are based on AI.
      It's the same chokehold Amazon has on its vendors.
  - fragmede 1 day ago
    your credentials shouldn't be in your codebase to begin with!
    [-]
    - zwnow 1 day ago
      .env files are a thing in tons of codebases
      [-]
      - iwontberude 1 day ago
        but that's at runtime, secrets are going to be deployed in a secure manner after the code is released
        [-]
        zwnow 1 day ago
        .env files are used to develop as well, for some things like PayPal you don't have to change the credentials, you just enable sandbox mode. If I had some LLM attached to my codebase, it would be able to read those credentials from the .env file.
        This has nothing to do with deployment. I never talked about deployment.
        [-]
        Carrok 1 day ago
        If you have your PayPal creds in your repository, you are doing it wrong.
        [-]
        zwnow 1 day ago
        .gitignore is a thing
        [-]
        Carrok 1 day ago
        Which every AI tool I’m aware of respects and ignores by default.
        [-]
        zwnow 1 day ago
        Why is it that they can add new env variables then?
        [-]
        Carrok 20 hours ago
        It is trivial to append to files without reading them. Also, no AI provider even wants your secrets, they are a liability. Do whatever you want though, I'm not here to convince you of anything.
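        A minimal Python sketch of the append-without-reading point. The helper name `append_env_var` and the variable used in the usage note are made up for illustration; this shows only the file mechanism, not how any particular AI tool actually handles .env files.

```python
from pathlib import Path


def append_env_var(env_path, key, value):
    """Append KEY=value to a .env-style file without reading it."""
    # Mode "a" opens the file for appending only: the existing
    # contents are never loaded into this program.
    with Path(env_path).open("a", encoding="utf-8") as f:
        f.write(f"{key}={value}\n")
```

        Calling `append_env_var(".env", "NEW_FLAG", "1")` adds one line at the end of the file. One caveat of writing blind: if the file does not already end with a newline, the new entry is glued onto the last existing line.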
      - mkozlows 1 day ago
        If your secrets are in your repo, you've probably already leaked them.
geldedus 1 day ago
"AI-assisted programming" being mistaken for "vibe coding" is getting old and annoying
banbangtuth 1 day ago
You know what? After seeing all these articles about AI/LLMs for the past 4 years, about how they are going to replace me as a software developer and about how I am not productive enough without using 5 agents and being a project manager:
I. Don't. Care.
I don't even care about those debates outside. Debates about whether LLMs work and replace programmers? Say they do, ok so what?
I simply have too much fun programming. I am just a mere fullstack business line programmer, generic random replaceable dude, you can find me dime a dozen.
I do use LLMs as a Stack Overflow/docs replacement, but I always write all my code by hand.
If you want to replace me, replace me. I'll go to companies that need me. If there are no companies that need my skill, fine, then I'll just do this as a hobby, and probably flip burgers outside to make a living.
I don't care about your LLM, I don't care about your agent, I probably don't even care about the job prospects for that matter if I have to be forced to use tools that I don't like and to use workflows I don't like. You can go ahead find others who are willing to do it for you.
As for me, I simply have too much fun programming. Now if you excuse me, I need to go have fun.
[-]
- llmslave2 1 day ago
  I simply will not spend my life begging and coaxing a machine to output working code. If that is what becomes of this profession, I will just do something else :)
  [-]
  - ryanobjc 1 day ago
    If I wanted to do that, I'd just move into engineering management and work with something less temperamental and more predictable - humans.
    I'd at least be more likely to get a boost in impact and ability to affect decision making, maybe.
    [-]
    - lifetimerubyist 1 day ago
      Until you realize you're just begging and coaxing a human to better beg and coax a machine to output working code - when you could just beg and coax the machine yourself.
      [-]
      - llmslave2 1 day ago
        At least I'd be the one interfacing with a human instead of a machine :P
        [-]
        lifetimerubyist 1 day ago
        [dead]
  - aspenmartin 1 day ago
    It would definitely be the profession if we stopped developing things today. Think about the idea of coding agents 2 years ago: I personally found them very unrealistic, and am now coding exclusively with them despite them being either neutral or a net negative for my development time, simply because I see the writing on the wall. In 6 months to a year they will probably be a huge net positive, and in 2-3 years the dismissive attitude towards adoption will start to look kind of silly (no offense). To me we are _just_ at the inflection point where using and not using coding agents are both totally sensible decisions.
- lifetimerubyist 1 day ago
  Hear hear. I didn't spend half my life getting an education, competing in the corporate crab bucket, retraining and upskilling just to turn into a robot babysitter.
  [-]
  - danielbln 20 hours ago
    Then continue to write code as a hobby, no one is going to take that away from you. But if you want someone to pay you for hand setting code the way you always have then... well, you might find that harder and harder as time goes on.
    [-]
    - lifetimerubyist 11 hours ago
      [dead]
- hecanjog 19 hours ago
  I appreciate this perspective. I'm actually hoping LLM hype will help to pop the bubble of tech salaries, make the profession roughly as profitable as going into teaching, so maybe the gold diggers will clear out and go play the stock market or something, rest of us can stick around and build things. Maybe software quality will even improve as a result? Would be nice...
  [-]
  - falkensmaize 5 hours ago
    Man, come on - what planet are you from, seriously? I got into this business because I enjoy programming, but I also wanted to for once in my life make a decent living and be able to save something. I have kids I'd like to send to college. I'd like to be able to retire someday. I have aging parents that need expensive care. This is one of the few professions that you can upskill into without years of expensive degrees.
    People need to make money to survive, now more than ever. It seems incredibly selfish to wish for that to disappear just so you can "purify" the profession.
- yacthing 1 day ago
  Easy to say if you either:
  (1) already have enough money to survive without working, or
  (2) don't realize how hard of a life it would be to "flip burgers" to make a living in 2026.
  We live very good lives as software developers. Don't be a fool and think you could just "flip burgers" and be fine.
  [-]
  - banbangtuth 1 day ago
    Ah, I actually did flip burgers. So I know.
    I also did dry cleaning, cleaning service, deli, delivery guy, etc.
    Yup I now have enough money to survive without working.
    But I also am very low maintenance, thanks to my early life being raised in harsh conditions.
    I am not scared to go back flipping burgers again.
    [-]
    - falkensmaize 4 hours ago
      "I am not scared to go back flipping burgers again."
      You should be - in all likelihood you'd have to work 3 burger-flipping jobs to make enough money to pay rent and buy food. Inflation and housing issues have hit a lot harder than most people who make 6-figure incomes realize. It's really tough out there right now. I am very, very grateful for the income I have and don't take it for granted.
    - Madmallard 1 day ago
      "Yup I now have enough money to survive without working" Your opinion is borderline irrelevant then.
      [-]
      - banbangtuth 1 day ago
        Indeed, after all I am just replaceable dime a dozen software engineer like I said above.
        [-]
        Madmallard 1 day ago
        that part doesn't matter
        it's the part where you don't have to work that matters
- dinkumthinkum 1 day ago
  I hear you but I feel like you (and really others like you, en masse) should not be so passive about your replacement. For most programmers, simply flipping burgers for money to enjoy programming a few hours a week is not going to work. Making a living is a thing. If you are reduced to having to flip burgers that means the economy will have collapsed and there won’t be any magic Elon UBI money to save us.
  [-]
  - banbangtuth 1 day ago
    We will have bigger problems when that happens. I am not worried.
- agentifysh 1 day ago
  having fun isn't tied to employment unless you are self-employed even then what's fun should not be the driving force
  [-]
  - lifetimerubyist 1 day ago
    "get a job doing something you enjoy and you'll never work a day in your life"
    or something like that
  - llmslave2 1 day ago
    That sounds miserable to me :(
    [-]
    - agentifysh 1 day ago
      you work on somebody's dime, it's no longer your choice
      [-]
      - zem 1 day ago
        it's your choice whose dime you work on. they can compete for your work by making it fun for you.
        [-]
        agentifysh 1 day ago
        sure unemployment is also a choice
        [-]
        zem 1 day ago
        fun work > tedious work > unemployment
        not sure why so many people feel like factoring fun into what job you want to take is so unthinkable, or that it's just a false dichotomy between the ideal job and unemployment
        [-]
        agentifysh 1 day ago
        you are describing the ideal which is not a reality for many many people as it is not common
        [-]
        zem 1 day ago
        it's a trade-off; you need a job but you typically interview at several places, collect offers, and weigh them according to various criteria. all the pro-fun posters are saying is that "enjoy the job" is a very highly ranked criterion for us.
        throw-12-16 22 hours ago
        this is known as privilege
        [-]
        zem 11 hours ago
        it's definitely a privilege to be able to find a fun job! but note that I'm not saying that everyone should hold out until they find one, I'm pushing back against the dour people who are convinced in their heart of hearts that "it's a job, it's not supposed to be fun" and that you are being an idiot for thinking it's even possible to find a job you really enjoy.
      - llmslave2 1 day ago
        It's my life, it's my choice.
  - banbangtuth 1 day ago
    Why? It is a matter of values. Fun can be a driving force just like money and stability is. It is simply a matter of your values (and your sacrifices).
    Like I said, I am just a generic replaceable dime a dozen programmer dude.
    [-]
    - agentifysh 1 day ago
      you dont get paid to have fun but to produce as a laborer
      a job isn't supposed to be fun its nice when it is but it shouldn't be what drives decisions
      [-]
      - banbangtuth 1 day ago
        You mean it shouldn't be the driving force of your employer to make decision. Yes I agree 10000%
        I meant it can be your (not necessarily your employer) driving decision in life.
        Of course, you need to suffer. That's about having tradeoffs.
        [-]
        agentifysh 1 day ago
        almost all employers are going to expect you to use AI and produce more with it
        you can definitely choose not to participate and give the opportunity to someone who is happy to use AI and still has fun with it.
        [-]
        banbangtuth 1 day ago
        Indeed, please find others to do it, not me.
        tikhonj 1 day ago
        most organizations have awful leadership, sure
        but that doesn't mean you can't (or shouldn't) work around it
        [-]
        agentifysh 1 day ago
        have you tried telling your boss you won't use the AI anymore while the rest of the team uses it ?
        how do you imagine such a conversation would play out? i'm curious
        [-]
        tikhonj 1 day ago
        what I've done is avoid the sort of boss who would mandate AI use
        in a past job I did tell a boss that I wasn't going to be doing the whole tickets/estimates/schedule tetris thing, and that actually worked out... because the leaders I worked with understood the value of being flexible and trusting their lead engineers
  - throw-12-16 1 day ago
    i think you angered the hustle bros
    [-]
    - agentifysh 1 day ago
      were these bots? so strange all green nicks
      [-]
      - throw-12-16 22 hours ago
        nah just privileged dudes who think the whole world gets to pick and choose how to earn a living
andy99 1 day ago
Is the title an ironic play on AI’s trademark writing style, is it AI generated, or is the style just rubbing off on people?
[-]
- mattnewton 1 day ago
  I think it was a popular style before gen AI and the training process of LLMs picked up on that.
  [-]
  - andy99 1 day ago
    That’s not how LLMs work, it’s part of the reinforcement learning or SFT dataset, data labelers would have written or generated tons of examples using this and other patterns (all the emoji READMEs for example) that the models emulate. The early ones had very formulaic essay style outputs that always ended with “in conclusion”, lots of the same kind of bullet lists, and a love of adjectives and delving, all of which were intentionally trained in. It’s more subtle now but it’s still there.
    [-]
    - mattnewton 1 day ago
      Maybe I was being imprecise, but I’m not sure what you mean by “not how LLMs work” - discovering patterns of how humans write is exactly the signal they are trained against. Either explicitly curated like SFT or coaxed out during RLHF, no?
      It could even have been picked up in pretraining and then rewarded during rlhf when the output domain was being refined; I haven’t used enough LLMs before post training to know what step it usually becomes noticeable.
senshan 1 day ago
Excellent survey, but one has to be careful when participating in such surveys:
"I’m on disability, but agents let me code again and be more productive than ever (in a 25+ year career). - S22"
Once the Social Security Administration learns this, there goes the disability benefit...
[-]
- LoganDark 1 day ago
  I think you eventually lose disability benefits anyway once you start making money.
ramoz 1 day ago
> Takeaway 3c: Experienced developers disagree about using agents for software planning and design. Some avoided agents out of concern over the importance of design, while others embraced back-and-forth design with an AI.
I'm in the back-and-forth camp. I expect a lot of interesting UX to develop here. I built https://github.com/backnotprop/plannotator over the weekend to give me a better way to review & collaborate around plans - all while natively integrated into the coding agent harness.
amkharg26 1 day ago
The title is provocative but there's truth to it. The distinction between "vibing" with AI tools and actually controlling the output is crucial for production code.
I've seen this with code generation tools - developers who treat AI suggestions as magic often struggle when the output doesn't work or introduces subtle bugs. The professionals who succeed are those who understand what the AI is doing, validate the output rigorously, and maintain clear mental models of their system.
This becomes especially important for code quality and technical debt. If you're just accepting AI-generated code without understanding architectural implications, you're building a maintenance nightmare. Control means being able to reason about tradeoffs, not just getting something that "works" in the moment.
senshan 1 day ago
I often tell people that agentic programming tools are the best thing since cscope [0]. For the last 6 months I have not used cscope even once, after decades of using it nearly daily.
[0] https://en.wikipedia.org/wiki/Cscope
[-]
- utopiah 1 day ago
  Well, looks like that's how I'm spending my day https://cscope.sourceforge.net/cscope_vim_tutorial.html
  Out of curiosity, if I wanted to set up cscope for a bunch of small projects, say dozens of prototypes each in their own directory, would it be useful? Too broad?
softwaredoug 1 day ago
The new layer of abstraction is tests. Mostly end-to-end and integration tests. It describes the important constraints to the agents, essentially long lived context.
So essentially what this means is a declarative programming system of overall system behavior.
zwnow 1 day ago
Idk, I still mostly avoid using it and if I do, I just copy and paste shit into the Claude web version. I wont ever manage agents as that sounds just as complicated as coding shit myself.
[-]
- lexandstuff 1 day ago
  It's not complicated at all. You don't "manage agents". You just type your prompt into a terminal application that can update files, read your docs and run your tests.
  As with every new tech there's a hell of a lot of noise (plugins, skills, hooks, MCP, LSP - to quote Karpathy) but most of it can just be disregarded. No one is "behind" - it's all very easy to use.
  [-]
  - danielbln 20 hours ago
    Easy to use, hard to master. Or: low skill floor, high skill ceiling. My output wouldn't be nearly as good without subagents and skills, and MCPs are somewhat required if you deploy tool using agents at scale.
    It's like saying all you need is notepad to develop. It's not wrong, but.. you know.
    [-]
    - micik 15 hours ago
      It’s not hard to master. It’s not a skill to be learned - it’s a tool that comes with a manual. You read the manual and now you can use the tool. Most people never will read the manual which is what gives the false impression that there’s something “to master” here. It’s like saying vim is harder to use than notepad. Not if you read the entire manual first.
      [-]
      - danielbln 14 hours ago
        I'm not sure how you define skill acquisition, it's reading documentation and doing the skill, yes? The AI landscape still shifts rather quickly, and a new LLM + harness has a different set of functionality, but more importantly different fuzzy failure cases: things a model is particularly good at, things that work better if you combine certain systems. All of it is documented, but also fast moving, and new things are discovered frequently. In comparison, Vim has been around for decades.
        And vim is absolutely harder to use than notepad. Otherwise it's like saying that rocket science isn't hard because you just have to read the documentation to know how to engineer a rocket.
zkmon 1 day ago
I haven't seen the definition of an agent, in the paper. Do they differentiate agents from generic online chat interfaces?
[-]
- senshan 1 day ago
  Page 2: We define agentic tools or agents as AI tools integrated into an IDE or a terminal that can manipulate the code directly (i.e., excluding web-based chat interfaces)
- esafak 1 day ago
  An agent takes actions. Chat bots only return text.
  [-]
  - zkmon 1 day ago
    "takes actions" is automation and it is hardly new. Code has been taking actions for decades. Interpreting and generating text belongs to chat bots. What's new with agents?
    [-]
    - esafak 12 hours ago
      Your code only takes actions prescribed by you. The agent does not; it picks the tool. I thought this was too obvious to point out.
000ooo000 1 day ago
Have to wonder about the motivations of research when the intro leads with such a quote.
andrewstuart 1 day ago
Don’t let anyone tell you the right way to program a computer.
Do it in the way that makes you feel happy, or conforms to organizational standards.
[-]
- mkoubaa 1 day ago
  The right way to program a computer:
  Well
  [-]
  - andrewstuart 1 day ago
    No.
    There are many contexts in which programming a computer well is not important.
4b11b4 1 day ago
I like to think of it as "maintaining fertile soil"
throw-12-16 1 day ago
Getting big "I'll keep making saddles in the era of automobiles" vibes from these comments.
[-]
- danielbln 20 hours ago
  Yeah, it feels like many SWEs have painted themselves into a corner. They love the nose-to-code-grindstone process and chain themselves to the abstraction layer of today. I don't think it's gonna end well for them, let's see.
  [-]
  - Snuggly73 9 hours ago
    This type of comment implies that it’s going to stop with “them” and somehow “us that adopted the LLM” will be the winners. The goal is full automation, there is no “adapt or be left behind”.
game_the0ry 1 day ago
> Through field observations (N=13) and qualitative surveys (N=99)...
Not a statistically significant sample size.
[-]
- flurie 1 day ago
  This is a qualitative methods paper, so statistical significance is not relevant. The rough qualitative equivalent would instead be "data saturation" (responses generally look like ones you've received already) and "thematic saturation" (you've likely found all the themes you will find through this method of data collection). There's an intuitive quality to determining the number of responses needed based on the topic and research questions, but this looks to me like they have achieved sufficient thematic saturation based on the results.
  [-]
  - game_the0ry 20 hours ago
    So, I upvoted your comment bc I genuinely believe there is something in your comments worth learning from, but...
    > This is a qualitative methods paper, so statistical significance is not relevant.
    I have never heard of a "qualitative methods paper" and it sounds like something a researcher would do to push a narrative with "qualitative data" rather than data that could be measured.
    Tell me why I am wrong.
    [-]
    - flurie 19 hours ago
      You're not necessarily wrong, but the phrase "push a narrative," the scare quotes around "qualitative data," and your initial comment suggest to me that you are not familiar with qualitative research but have a bias or mistrust against it (no judgment, just stating my observation). If you would like to know more about it, this[1] provides a reasonable overview, and if you would like to know much more, I can ask my spouse, who is a qualitative methodologist in medicine at an R1[2], for her recommendations. I can also tell you what I think of this specific paper, but I did not want it to color my initial comment.
      [1] https://en.wikipedia.org/wiki/Qualitative_research
      [2] https://en.wikipedia.org/wiki/List_of_research_universities_...
      [-]
      - game_the0ry 17 hours ago
        > your initial comment suggest to me that you are not familiar with qualitative research but have a bias or mistrust against it
        I can confirm that, yes, I do have an arguably paranoid bias and/or mistrust against information that is not quantifiable in nature nor is simple enough for me (an idiot) to understand easily.
        Appreciate the thoughtful response. Don't ask the spouse, just enjoy the new year. I'll figure it out.
- bee_rider 1 day ago
  97 samples is enough to get a 95% confidence level if you accept a 10% margin of error. 99 is not so bad, at least.
  https://www.surveymonkey.com/mp/sample-size-calculator/
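  A rough sketch of the arithmetic behind that figure, assuming the standard formula for estimating a proportion from a large population, n = z^2 * p * (1 - p) / e^2, with the worst case p = 0.5 and z = 1.96 for 95% confidence. The function names are made up for illustration, and this is not necessarily how the linked calculator computes it.

```python
import math


def sample_size(z, margin, p=0.5):
    """Smallest n whose margin of error is at most `margin` (no finite-population correction)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def margin_of_error(z, n, p=0.5):
    """Margin of error for a proportion estimated from n responses."""
    return z * math.sqrt(p * (1 - p) / n)


print(sample_size(1.96, 0.10))                # 97
print(round(margin_of_error(1.96, 99), 4))    # 0.0985
```

  So 99 responses land just under a 10% margin of error at 95% confidence, and only if the respondents are treated as a random sample.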
- HPsquared 1 day ago
  Significance depends on effect size.
- energy123 23 hours ago
  How many independent witnesses would you need to convict someone of murder?
- superjose 1 day ago
  Same thoughts exactly.
SunlitCat 1 day ago
Funny how the title alone evokes the old “real programmers” trope https://xkcd.com/378/