So, you can assign github issues to this thing, and it can handle them, merge the results in, and mark the bug as fixed?
I kind of wonder what would happen if you added a "lead dev" AI that wrote up bugs, assigned them out, and "reviewed" the work. Then you'd add a "boss" AI that made new feature demands of the lead dev AI. Maybe the boss AI could run the program and inspect the experience in some way so it could demand more specific changes. I wonder what would happen if you just let that run for a while. Presumably it'd devolve into some sort of crazed noise, but it'd be interesting to watch. You could package the whole thing up as a startup simulator, and you could watch it like a little ant farm to see how their little note-taking app was coming along.
It's actually a decent patern for agents. I wrote a pricing system with an anylyst agent, a decision agent, and a review agent. They work together to make decisions that comply with policy. It's funny to watch them chatter sometimes, they really play their role, if the decision agent asks the anylyst for policy guidance it refuses and explains that it's role is to analyze. Though they do often catch mistakes that way and the role playing gets good results.
Python classes. In my framework agents are class instances and tools are methods. Each agent has it's own internal conversation state. They're composable and the agent has tools for communicating with the other agents.
Generally, I keep the context. If I'm one shotting then I invoke a new agent. All calls and responses append to the agent's chat history. Agent's are relatively short lived, so the context length isn't typically an issue. With the pricing agent the initial data has been longer than the context window sometimes, but that just means it needs more preprocessing. Now if there is a real reason that I would want to manage it more actively, I can reach out to the agent internals. I have a tool call emulation layer, because some models have poor native tool support, and in those cases it's sometimes necessary to retry calls if the response fails validation. In those cases, I will only keep the last successful try in the conversation history.
There is one special case where I manage it more actively. I wrote an REPL process analyst, to help build the pricing agent and refine the policy document. In that case I would have long threads with an artifact attachment. So I added a facility to redact old versions of the artifact replacing them with [attachment: filename] and just keep the last one. It works better that way because multiple versions in the same conversation history confuse the model, and I don't like to burn tokens.
For longer lived state, I give the agent memory tools. For example the pricing agent's initial state includes the most recent decision batch and reasoning notes, and the agent can request older copies. The agent also keeps a notebook which they are required to update, allowing agents to develop long running strategies and experiments. And they use it to do just that. Honestly the whole system works much better than I anticipated. The latest crop of models are awesome, especially Gemini 2.5 flash.
Langroid enables tool-calling with practically any LLM via prompts: the dev just defines tools using a Pydantic-derived `ToolMessage` class, which can define a tool-handler, and additional instructions etc; The tool definition gets transpiled into appropriate system message instructions. The handler is inserted as a method into the Agent, which is fine for stateless tools. Or the agent can define its own handler for the tool in case tool handling needs agent state. In the agent response loop our code detects whether the LLM generated a tool, so that the agent's handler can handle it.
See ToolMessage docs: https://langroid.github.io/langroid/quick-start/chat-agent-t...
In other words we don't have to rely on any specific LLM API's "native" tool-calling, though we do support OpenAI's tools and (the older, deprecated) functions, and a config option allows leveraging that. We also support grammar constrained tools/structured outputs where available, e.g. in vLLM or llama.cpp: https://langroid.github.io/langroid/quick-start/chat-agent-t...
Love it, I did something very similar, deriving a pydantic model from the function signature. Simpler without the native tool call API, even though occasional retries are required when the response fails to validate. Will have to give Langroid a try.
I had not thought about sharing it. I rolled my own framework, even though there are several good choices. I'd have to tidy it up, but would consider it if a few people ask. Shoot me an email, info in my profile.
The more difficult part which I won't share was aggregating data from various systems with ETL scripts into a new db that I generate various views with, to look at the data by channel, timescale, price regime, cost trends, inventory trends, etc. A well structured JSON object is passed to the analyst agent who prepares a report for the decision agent. It's a lot of data to analyze. It's been running for about a month and sometimes I doubt the choices, so I go review the thought traces, and usually they are right after all. It's much better than all the heuristics I've used over the years.
I've started using agents for things all over my codebase, most are much simpler. Earlier use of LLM's might have been called that in some cases, before the phrase became so popular. As everyone is discovering, it's really powerful to abstract the models with a job hat and structured data.
I think it would take quite a long while to achieve human-level anti-entropy in Agentic systems.
Complex system requires tons of iterations, the confidence level of each iteration would drop unless there is a good recalibration system between iterations. Power law says a repeated trivial degradation would quickly turn into chaos.
A typical collaboration across a group of people on a meaningfully complex project would require tons of anti-entropy to course correct when it goes off the rails. They are not in docs, some are experiences(been there, done that), some are common sense, some are collective intelligence.
Can you think of an example in history where labour was replaced with tech and the displaced workers kept their income stream? If a machine can do your job, (eventually) I'll be cheaper to use that machine instead of you and you'll no longer have a job. Is that not a given?
Anyway, it was probably just a joke... so not sure we need to unravel it all.
Displaced hired personnel of course cannot hope for that.
But VCs own their business, they are not employees. If you own a bakery, and buy a machine to make the dough instead of doing it by hand, and an automatic oven to relieve you form tracking the temperature manually, you of course keep the proceeds from the improved efficiency (after you pay the credit you took to purchase the machines).
The same was true with the aristocrats of centuries past: the capitalists who run our modern economy were once nothing more than their managers, delegates who handled the estates, their investments, their finances, growing power until they could dictate policy to their 'sovereign' and eventually dispose of them entirely.
The nobility used to be the dedicated warrior class, the knights. This secured their position in the society and allowed them to rule, by coercion when needed.
Once they ceased to exercise their military might, some time around 17th-18th century, and chose to live off the rent on their estates, their power became more and more nominal. It either slipped (or was yanked) from their hands, or they turned capitalists themselves.
I didn't get the impression it was meant as a joke:
"Every great venture capitalist in the last 70 years has missed most of the great companies of his generation... if it was a science, you could eventually dial it in and have somebody who gets 8 out of 10 [right]," the investor reasoned. "There's an intangibility to it, there's a taste aspect, the human relationship aspect, the psychology — by the way a lot of it is psychological analysis," he added.
"So like, it's possible that that is quite literally timeless," Andreessen posited. "And when the AIs are doing everything else, like, that may be one of the last remaining fields that people are still doing."
Similar to how some domain name sellers acquire desirable domains to resell at a higher price, agent providers might exploit your success by hijacking your project once it gains attraction.
I decided to be an engineer as opposed to manager because I didn't like people management. Now it looks like I'm forced to manage robots that talk like people. At least I can be the as non-empathetic as I want to be. Unless a startup starts doing HR for AI agents then I'm screwed.
Hypothesis: empathy is the skill most effective at taking vague, poorly specified requests from customers and clients and transforming them into a design with well specified requirements and a high level plan to implement the design. For example, what a customer says they want often isn't what they need. Empathy is how we bridge that gap for them and deliver something truly valuable.
Given empathy is all about feelings, it's not something models and tools will be able to displace in the next few years.
I was interested. Clicked the try button and just another wait list. When will Google learn that the method that worked so well with Gmail doesn't work any more. There are so many shiny toys to play with now, I will have forgotten about this tomorrow.
And if you don't sign up quickly after your turn in the queue comes up, you might miss the service altogether, because Google will have shut it down already.
And if you are from Germany you can't even join the list. First I needed to verify it is really me. Get a confirmation code to my recovery mail. Get a code to my cell phone number. And than all I got is a service restricted message.
The method absolutely does work, but you need loyal advocates who are praising your product to their friends, or preferrably users who are already knocking on your door.
Oh god, the GDE program. That title used to mean something, i.e. this person is a real expert in the topic.
Now it's just thrown to anyone who's willing enough to spam linkedin/twitter with Google bullshit and suck-up to the GDE community. Think everyone in the extended Google community got quite annoyed with the sudden rise in number of GDE's for blatantly stupid things.
This pops up especially if you're organising a conference in a Google-adjacent space, as you will get dozens of GDE's applying with talks that are pretty much a Google Codelab for a topic, without any real insights or knowledge shared, just a "lets go through tutorial together to show you this obscure google feature". And while there are a lot of good GDE's, in the last 5-6 years there has been such an influx of shitty ones that the program lost it's meaning and is being actively avoided.
I assume they weren't intending to release it today, and didn't have it ready, but didn't want people thinking that they were just following in Github's footprints.
i use both. I think Gemini produces longer more complicated answers. ChatGPT is more succint, but it could be b/c I've trained ChatGPT how to talk to me.
The context window difference is really nice. I post very large bodies of text into gemini and it handles it well.
It's success theater. You need to show progress otherwise you might be perceived falling behind. In times where LoI's are written and partnerships are forged the promise has more value than the fact.
Anymore? For me it always sounded too childish or sarcastic. I would expect to see "Blazingly Fast" on a box of Hot Wheels or Nerf Blaster, not a serious tech product.
Google’s ability to offer inference for free is a massive competitive advantage vs everyone else:
> Is Jules free of charge?
> Yes, for now, Jules is free of charge. Jules is in beta and available without payment while we learn from usage. In the future, we expect to introduce pricing, but our focus right now is improving the developer experience.
> Google’s ability to offer inference for free is a massive competitive advantage vs everyone else:
Haven't tried Jules myself yet, still playing around with Codex, but personally I don't really care if it's free or not. If it solves my problems better than the others, then I'll use it, otherwise I'll use other things.
I'm sure I'm not alone in focusing on how well it works, rather than what it costs (until a certain point).
Technically speaking,the strategy they execute is called "Loss Leader".
As Loss Leader, the company offers a product at a reduced price to attract users, create stickiness, and through that aims to capture the market.
Well, this isn't the first github-based agent. A well-known one is https://app.all-hands.dev/. And, there are great cheap or even free more general agents. So, given that this agent isn't a novelty, price is naturally an immediate talking point.
> That's all good and well but its takes time to compare the products
Hence many of us are still busy trying out Codex to it's full extent :)
> And people are rarely willing to use paid product for comparison.
Yeah, and I'm usually the same, unless there is some free trial or similar, I'm unlikely to spend money unless I know it's good.
My own calculation changed with the coming of better LLMs though. Even paying 200 EUR/month can be easily regained if you're say a freelance software engineer, so I'm starting to be a lot more flexible in "try for one month" subscriptions.
I haven't read too much from others, but personally for me Codex online form was the biggest productivity boost in coding since the original Copilot.
Cursor just deleted my unit tests too many times in agent mode.
Codex 5x-ed my output, though the code is worse than I would write it, at this point the productivity improvement with passing tests, not deleting tests is just too good to be ignored anymore.
I just noticed that this is definitely true for me, but not if the product is pay to go.
I have far fewer qualms about spending $10 on credits, even if I decide the product isn't worth it and never actually spend those credits, than about taking a free trial for a $5 subscription.
Google has been offering you "free inference" for more than a decade. People who never work there are simply not aware of how thorough soaked in machine inference many Google products are, especially the major ones like web search, mail, photos, etc.
> No. Jules does not train on private repository content. Privacy is a core principle for Jules, and we do not use your private repositories to train models. Learn more about how your data is used to improve Jules.
It's hard to tell what the data collection will be, but it's most likely similar to Gemini where your conversation can become part of the training data. Unclear if that includes context like the repository contents.
I read that a couple of times. It sounds vaguely clever and a bit ominous, but I have no clue what it means. Can you explain?
Google products had had a net positive impact on my life over, what is it, 20 years now. If I had had to pay subscription fees over that span of time, for all the services that I use, that would have been a lot of very real money that I would not have right now.
Is there a next step where it all gets worse? When?
> And so it is that you by reason of your tender regard for the writing that is your offspring have declared the very opposite of its true effect. If men learn this, it will implant forgetfulness in their souls. They will cease to exercise memory because they rely on that which is written, calling things to remembrance no longer from within themselves, but by means of external marks.
> What you have discovered is a recipe not for memory, but for reminder. And it is no true wisdom that you offer your disciples, but only the semblance of wisdom, for by telling them of many things without teaching them you will make them seem to know much while for the most part they know nothing. And as men filled not with wisdom but with the conceit of wisdom they will be a burden to their fellows.
- Plato quoting Socrates in "Phaedrus", circa 370 BCE
At least with writing it's fairly easy to implement on your own with little more than what most people would have available in a rudimentary survival situation. It'll be a tough day when someone goes to sign into their GoogleLife (tm) and find out that they can't get AI access because "precluding conditions agreed to upon signing"
As I see it, the solution to this is to invest in open source. As for a "survival situation", a solar-powered laptop with a locally running LLM would definitely be the first item on my list.
It shouldn't be, because LLM:s can't be trusted in the way literature can. People around you are also going to question why you insist on such a power hungry setup.
I mean, that's all any of us needs. It's an honorable quote.
I know you're not trying to draw any parallels between Plato's admonition on written thoughts supplanting true knowledge and the justifiable concerns about automated writing tools supplanting the ability of writers to think. To a modern literate, Plato's concern is legible but so patently ridiculous that one could only deploy it as a parody and mockery of the people who might take it as a serious proof that philosophers were wrong about modern tools before. I was obviously just kiddin about whether you googled it. Unfortunately, now a whole new generation is about to use it to justify how LLMs are just being maligned the way written language once was.
Socrates was wrong on this. But Plato was kind of an asshole for writing it down. The proof of both is that we can now google the quote, which is objectively funny. The trouble with LLMs, I guess, is that they would just attribute the quote to your uncle Bob, who also said that cats are a good source of fiber, and thus the whole project started when the words were put in parchment ends with a blizzard of illegible scribbles. If writing was bad for true understanding, not-writing is where humanity just shits its pants.
Hm, I think Plato is largely true; not in the sense that writing is a harmful crutch, but in the sense that simply being able to read something is not a substitute for knowing it. I think we can see that at play here on HN and on the larger internet all the time: people who read a paper or article, and then attempt to discuss it, without realizing that their understanding of the material is entirely incorrect. These are "men filled not with wisdom but the conceit of wisdom," and they lack the awareness to understand that they don't understand.
In other words it is not the writing that is harmful, but the lack of teaching.
I understand where Socrates/Plato is coming from, but this doesn't match my experience. I had no "lack of teaching", having sat through about 18 years of it in total, but I definitely have a better average recollection of things that I read of my own interest than things I was "taught". Maybe things would have been different if I had a world class philosopher as a personal tutor, but alas that was not to be.
If were to rephrase it, I would put the distinction not between teaching and reading, but between passive consumption and active learning.
EDIT: Thinking more about having a world class philosopher as a personal tutor, I suddenly remembered a quote from Russell that took me a while to track down, but here it is:
> In 343 B.C. he [Aristotle] became tutor to Alexander, then thirteen years old, and continued in that position until, at the age of sixteen ... Everything one would wish to know of the relations of Aristotle and Alexander is unascertainable, the more so as legends were soon invented on the subject. There are letters between them which are generally regarded as forgeries. People who admire both men suppose that the tutor influenced the pupil. Hegel thinks that Alexander's career shows the practical usefulness of philosophy. As to this, A. W. Benn says: "It would be unfortunate if
philosophy had no better testimonial to show for herself than the character of Alexander. . . . Arrogant, drunken, cruel, vindictive, and grossly superstitious, he united the vices of a Highland chieftain to the frenzy of an Oriental despot."
> ... As to Aristotle's influence on him, we are left free to conjecture whatever seems to us most plausible. For my part, I should suppose it nil.
- "A History of Western Philosophy" by Bertrand Russell, Chapter XIX p. 160
The copy though: "Spend your time doing what you want to do!" followed by images of play video games (I presume), ride a bicycle, read a book, and play table tennis.
I am cool with all of that but it feels like they're suggesting that coding is a chore to be avoided, rather than a creative and enjoyable activity.
So absurd. As if your boss is going to let you go play tennis during the day because Jules is doing your work.
If all of these tools really do make people 20-100% more productive like they say (I doubt it) the value is going to accrue to ownership, not to labor.
Shhhh... don't tell the plebes what it really means to "2x their productivity".
Seriously though, this kind of tech-assisted work output improvement has happened many times in the past, and by now we should all have been working 4-hour weeks, but we all know how it has actually worked out.
As a business owner, why would give up some of the profits? You started a business to make money not to do charity. Expecting businesses to act against their interests make no sense
Blame the system, not the actors. See a recent HN submission, The Evolution of Trust by Nicky Case: https://ncase.me/trust/
If there's one big takeaway
from all of game theory, it's this:
What the game is, defines what the players do.
Our problem today isn't just that people are losing trust,
it's that our environment acts against the evolution of trust.
That may seem cynical or naive -- that we're "merely" products of our
environment -- but as game theory reminds us, we are each others
environment. In the short run, the game defines the players. But in
the long run, it's us players who define the game.
So, do what you can do, to create the conditions necessary to evolve trust.
Build relationships. Find win-wins. Communicate clearly. Maybe then, we can
stop firing at each other, get out of our own trenches, cross No Man's Land
to come together...
My take: don't blame corporations when they act rationally. (Who designed the conditions under which they act?) Don't blame people for being angry or scared when they feel unsettled. A wide range of behaviors are to be expected. If I am surprised about the world, that is probably because I don't understand it well enough. "Blame" is a waste of time here. Instead, we have to define what kind of society we want, predict likely responses, and build systems to manage them.
Was he blaming anyone? He just pointed out the mirror of what you did: as the owning class acts one way, it will naturally produce material conditions that incentivize the working class to act in a way that would lead to the destruction/dispossession of the existing owning class (i.e. a revolution).
Maybe the author was -- or maybe not -- but for a large number of people there is an implication that one could "blame" corporations for being selfish, self-serving, criminal, clueless, self-destructive, leading to social ills, and so on. But who established the rules for the corporations? It depends how you ask: previous people, previous systems, the progression of history.
My claim, put another way, is that if you trace the causality back a few steps, you land at the level of the system.
Anyhow, the question "who do we blame?" can be a waste of time if we use it only for moral outrage and/or a conversation stopper. Some think "what caused this?" is an improvement, and I agree, but it isn't nearly good enough.* Still, it isn't nearly as important as "how do we change this with the levers we have _now_?"
* Relatively few scientists understand causality well, thinking the randomized controlled trial is the only way to show causality! The methods of causality have developed tremendously in the last twenty years, but most scientific fields are rather clueless about them.
> we have to define what kind of society we want, predict likely responses, and build systems to manage them.
Nailed it. At the end of the day, companies are automatons. It is up to use to update the reward and punishment functions to get the behaviour we desire. Behaviourism 101
What a clever way to resolve responsibility. Companies are made of people who strategize to rewrite the rules in their favor. They’re not “automatons.”
You talk as though a company exists in its own right independent of the humans. This is a fictional way of thinking. This attitude of "if you want me to stop acting poorly, make me" is an abdication of all responsibility.
It's the idea that individuals and institutions must somehow fix society from the top down or the outside in, which history has shown doesn't work. No one is going to come along and make you be sensitive or intelligent, either you see the predicament we're all in and act, or you rationalize your selfish actions and make them someone else's problem.
> You talk as though a company exists in its own right independent of the humans.
I didn’t say that, nor do I mean that.
My point is this: don’t be surprised when people or organizations act rationally according to the situations they find themselves in.
Go ahead and blame people and see if that solves anything! What is your theory for change? Mine is about probabilistic realism.
Ethics matters, of course. We can dislike how some (one/org) acts — and then what do we do? Hoping they act better is not a good plan.
I see it over and over — people label something as unethical and say e.g. “they shouldn’t do that” and that’s the end of the conversation. That is not a plan. Shame and guilt can have an effect on people, but often only has a small effect on organizations.
Here’s a start: look at the long-term stock exchange (Eric Ries) and see how it’s doing in trying to align corporate behavior with what meshes better with what people want.
I didn't say that, and I think you know I didn't say that. Want to engage on this in way that is more than trading one-liners?
On a human level, people are held to a set of laws and exist in a world of social norms. "Following orders" is of course not the most important goal in most contexts; it is not the way most people think of their own ethics (hopefully) nor the way society wants people to behave. Even in military contexts, there is often the notion of a "lawful order".
When it comes to public for-profit companies, they are expected to generate a profit for their shareholders and abide by various laws, including their own charters. To demand or expect them to do more than this is foolish. Social pressure can help but is unreliable and changes over time. To expect that a few humans will step up to be heros exactly when we need them and "save the day" from a broken system is wishful thinking. We have to do better than this. Blaming something that is the statistical norm is scapegoating. In many/most situations, the problem is the system, not the actors.
Yea, as a hobbyist, I like to program. This sales pitch is like trying to sell me a robot that goes bicycle riding for me. Wait a minute... I like to ride my bicycle!
I'm the same way, but there is often monotonous work that stands in the way of me doing the more interesting work. I'm happy to offload that. Even if the AI does a bad job, it makes it easier for me to even start on boring work, and starting is 90% of the battle.
What if it starts by handling the boring tasks but ends up taking over the work you actually enjoy?
The "let AI do the boring bits" pitch sounds appealing—because it's easier to accept. But let's be real: the goal isn't just the dull stuff. It's everything.
It's surprising how many still think AI is harmless. Sigh...
I think they are suggesting that you can focus on the code that you want to write - whatever that is. Especially since the first line is, "Jules does coding tasks you don't want to do." I took the first image as being someone working on the computer. Or, take back your time doing whatever you want - e.g. cycling, table tennis, etc.
All of the work that currently gets pushed back with 'no capacity maybe in Q+2' will become viable and any brief moment of spare capacity will immediately be filled.
A new backlog will start to fill up and the cycle repeats.
Maybe, though, the backlog of the future will actually be less important than the backlog of today? Bug fixes will go out, software quality will increase?
> Or, take back your time doing whatever you want - e.g. cycling, table tennis, etc.
That might be true for hobbyists or side projects, but employees definitely won't get to work less (or earn more). All the financial value of increased productiveness goes to the companies. That's the nature of capitalism.
I don't think it's meant to be literal, more tongue-in-cheek. Obviously, developers aren't going to be playing table tennis while they wait for their task to finish. Since it's async, you can do other things. For most developers, that's just going to mean another task.
> it feels like they're suggesting that coding is a chore to be avoided, rather than a creative and enjoyable activity
I occasionally code for fun, but usually I don’t. I treat programming as a last-resort tool, something I use only when it’s the best way to achieve my goal. If I can achieve some thing without coding or with coding, I usually opt for the first unless the tradeoffs are really shit.
I find the enjoyment is correlated with my ability to maintain forward momentum.
If you work at a company where there's a byzantine process to do anything, this pitch might speak to you. Especially if leadership is hungry for AI but has little appetite for more meaningful changes.
Speaking as someone who codes professionally, it's too hot outside so I wouldn't mind coding instead as long as I get to choose what I code and when. Which I don't most of the time.
Also implying I wouldn't want to fix bugs or colleague's code, those are the things I love most about being a developer. Also I don't mind version bumping at all and the only reason why I "don't like" writing tests is that writing "good" tests is the hardest thing for me in development (knowing what to test for and why, knowing what to mock and when, the constant feeling that I'm forgetting an edge case...) and AI still sucks at these parts of writing tests and probably will for a while...
yesterday I had Jules write tests, and other improvements twice. The tests were pretty good, and of course Jules built the modified code in a VPS and ran it.
That's a nuance worth exploring. The world is being optimized for clockwatchers who want to do their work with the least amount of effort. Before long (if not already) people who enjoy their craft, and think of their work as a craft, will be ridiculed for wanting to do it themselves.
>The world is being optimized for clockwatchers who want to do their work with the least amount of effort. Before long (if not already) people who enjoy their craft, and think of their work as a craft, will be ridiculed for wanting to do it themselves.
There is one clock you should be watching regardless, which is the clock of your life. Your code will not come see you in the hospital, or cheer you up when you're having a rough day. You wont be sitting around at 70 wishing you had spent more 3am nights debugging something. When your back gives out from 18hrs a day of grinding at a desk to get something out, and you can barely walk from the sciatica, you wont be thinking about that great new feature you shipped. There are far more important things in life once you come to terms with that, and you will learn that the whole point of the former is enabling the latter.
Writing code _has_ helped me feel better on some bad days. Even looking back at old projects brings me contentment and reassurance sometimes. On its own, it can't provide the happiness that a balanced life can, but craft and achievement are definitely pleasing. I would consider it an essential part of a good life, regardless of what the actual activity is.
This is different from meaningless work that brings you nothing except a paycheck, which I agree is important to minimize or eliminate. We should apply machines to this kind of work as much as we can, except in cases where the work itself doesn't need to exist.
You could say the same about every job, so you are really arguing against jobs in general. Who's going to help you fix your sciatica if your doctor and physical therapist think like that?
The opposite of a clockwatcher isn't a workaholic, it's someone enjoying writing code and the collaboration, problem solving and design process which leads to what you end up writing, and enjoying _doing it well_ inside normal work hours, remarking at how quickly the clock is going when they do check it.
Both Google and Microsoft have sensibly decided to focus on low-level, junior automation first rather than bespoke end-to-end systems. Not exactly breadth over depth, but rather reliability over capability. Several benefits from the agent development perspective:
- Less access required means lower risk of disaster
- Structured tasks mean more data for better RL
- Low stakes mean improvements in task- and process-level reliability, which is a prerequisite for meaningful end-to-end results on senior-level assignments
- Even junior-level tasks require getting interface and integration right, which is also required for a scalable data and training pipeline
Seems like we're finally getting to the deployment stage of agentic coding, which means a blessed relief from the pontification that inevitably results from a visible outline without a concrete product.
Wow, it looks like Google and Microsoft timed their announcements for the same day, or perhaps one of them rushed their launch because the other company announced sooner than expected. These are exciting times!
> Also, you can get caught up fast. Jules creates an audio summary of the changes.
This is an unusual angle. Of course Google can do this because they have the tech behind NotebookLM, but I'm not sure what the value of telling you how your prompt was implemented is.
I guess the idea is vibe coding while laying in bed or driving? If my kids are any indication of the generation to come, they sure love audio over reading.
In a handful of years you'll have the voice/video generation come of age. Also we may have some new form factor like AI necklaces or glasses or something.
I think that's the point AI agents are trying to sell. Spend more time on the type of coding tasks you want to do, like coding cool new code, and not the tasks that you don't want to do.
Is this really a common problem? What are these tasks that can't be deterministically automated and also not avoided entirely, and also don't fit nicely into where you need to think about some other task for a while before you go implement a solution to it?
What do you advise? Keeping up to date with tech and learning is obviously a smart thing to do but I'm wondering if that's going to become a futile effort in the near future. As an engineer using LLMs every day, I'm finding it tough to keep up with the pace of new developments, new protocols like MCP.. the pace is wild.
And now we have agents which are going to multiply the pace of development even more.
We can stay sharp but I'm not sure there's really much we can do to stop our jobs - or all jobs, disappearing. Not that this is a bad thing, if it's done right.
Now that every company has a bot, I wish we had some way to better quantify the features.
For example, how is Google's "Jules" different than JetBrains' "Junie" as they both sort of read the same (and based on my experience with Junie, Jules seems to offer a similar experience) https://www.jetbrains.com/junie/
they all suck, because at the end of the day, these tools are just automating multiple prompts to one of the same codegen LLMs that everyone is using already.
The loop is: it identifies which files need to change, creates an action plan, then proceeds with a prompt per file for codegen.
In my experience, the parts up to the codegen are how these tools differ, with Junie being insanely good at identifying which parts of a codebase need change (at least for Java, on a ~250k loc project that I tried it on).
But the actual codegen part is as horrible as when you do it yourself.
Of course I'm not talking about hello world usages of codegen.
I suppose these tools would allow moving the goalpost a bit further down the line for small "from scratch" ideas, compared to not using them.
I really want to try out Google's new Gemini 2.5 Pro model that everyone says is so great at coding. However, the fact that Jules runs in cloud-based VMs instead of on my local machine makes it much less useful to me than Claude Code, even if the model was better.
The projects I work on have lots of bespoke build scripts and other stuff that is specific to my machine and environment. Making that work in Google's cloud VM would be a significant undertaking in itself.
I’d love to see it if that’s possible - merge conflict cleanup can be some of the hardest calls, imo, particularly when the ‘right’ merge is actually a hybridized block that contains elements from both theirs and mine. I feel like introducing today’s LLM into the process would only end up making things harder to untangle.
> Jules creates a PR of the changes. Approve the PR, merge it to your branch, and publish it on GitHub.
Then, who is testing the change? Even for a dependency update with a good test coverage, I would still test the change.
What takes time when uploading dependencies is not the number of line typed but the time it takes to review the new version and test the output.
I'm worried that agent like that will promote bad practice.
It shows you code diffs, results of executing modified or new code in a VPS, and it writes pull requests, but asks you to hit the Merge button in GitHub.
Will this promote bad practice? Probably up to the individual practitioner or organization.
Heh, personally I'd say any coding solution that lives inside an IDE is nonsense :P Funny how perspectives can be so different. I want something standalone, that I can use in in a pane to the left/right of my already opened nvim instance, or even further away than that. Gave Cursor a try some weeks ago but seems worse than Aider even, and having an entire editor just for some LLM edits/pair programming seems way overkill and unnecessary.
Ideally, it would be built in to [my IDE of choice]. So I neither have to have a separate browser window open, copy/pasting, or have a separate IDE open, copy/pasting. Having it as a standalone tool makes as much sense as having a spell checker that is a separate browser window running a separate app from the word processor you are using to write your letter. Why?
Cursor/Windsurf or other IDEs are not the right comparison. I do use them all the time and I don’t see them going away anytime soon or may be never.
As for the use case of “Give a simple or detailed prompt and the entire project and let the model do its stuff” codex has done much better than Claude code. Claude code assumes a lot of things and often ends up doing a lot more making the code very complex and also me having to redo it later with cursor. With codex I have not seen this issue.
I also feel that codex cli as a cli tool is much better mainly due to its OSS nature where I can choose different model. Claude really missed this big time IMHO.
Notice how no-one (up until now) mentioned "Devin" or compared it to any other AI agent?
It appears that AI moves so quickly that it was completely forgotten or little to no-one wanted to pay for its original prices.
Here's the timeline:
1. Devin was $200 - $500.
2. Then Lovable, Bolt, Github Copilot and Replit reduced their AI Agent prices to $20 - $40
3. Devin was then reduced to $20.
4. Then Cursor and Windsurf AI agents started at $18 - $20.
5. Afterwards, we also have Claude Code and OpenAI Codex Agents starting at around $20.
6. Then we have Github Copilot Agents embedded directly into GitHub and VS Code for just $0 - $10.
Now we have Jules from Google which is....$0 (Free)
Just like how Google search is free, the race to zero is going to only accelerate and it was a trap to begin with, that only the large big tech incumbents will be able to reduce prices for a very long time.
Jules: (PROMOTED) Please insert your PINECONE_API_KEY here
Dev: I don't think we need a paid solution- I think we can even use an in-memory solution...
Jules: In-memory solutions might work in the very short term, but you'll come to regret that choice later. Pinecone prevents those painful 2AM crashes when your data scales. You'll thank me later, trust me.
Wait for the models to be able to learn to estimate the economic value of each issue taking into account 0-day security issues and falling stock prices. They will quote you accordingly with a marked up price. Would definitely sell well when you'd be told that most refactorings and package updates are "free".
Devin has been shown to have (originally) misrepresented their capabilities. Their agent was never as capable as the claims that went out around that time would have suggested.
I am really looking forward to “version bumps” without breaking the dependency tree at the very least, something which Dependabot almost gets right.
From a security use-case perspective, it will be great if it can bump libs that fixes most of the vulnerabilities without breaking my app. Something no tool does today ie. being code and breaking change aware.
Glad to see they're joining the game, there is so much work to do here. Have been using Gemini 2.5 pro as an autonomous coding agent for a while because it is free. Their work with AlphaEvolve is also pushing the edge - I did a small write up on AlphaEvolve with agentic workflow here: https://toolkami.com/alphaevolve-toolkami-style/
Just my two cents but I had a persistent issue with this webapp, tried probably 50 diff prompts to fix it across o3, 2.5 Pro, 3.7 to zero avail. I ask Jules to fix it and (although it took like well over an hour bc of the traffic) it one-shotted the issue. Feels like this is the next step in "thinking" with large enough repos. I like it.
Is the "asynchronous" bit important? How long does it take to do its thing?
My normal development workflow of ticket -> assignment -> review -> feedback -> more feedback -> approval -> merging is asynchronous, but it'd be better synchronous. It's only asynchronous because the people I'm assigning the work to don't complete the work in seconds.
There doesn't appear to be a way to add files like .npmrc or .env that are not part of what gets pushed to GitHub, making this largely useless for most of my projects
These coding agents are coming out so fast I literally don't have time to compare them to each other. They all look great, but keeping up with this would be its own full time job. Maybe that's the next agent.
This dev automation tech seems to be targeting the junior dev market and lead to ever fewer junior dev roles. Less junior dev roles means less senior devs. For all the code smart folks that live here, I find very little critical thinking regarding the consequences of this tech for the dev market and the industry in general. No, it's not take your job. And no, just because it doesn't affect you now does not mean that it won't be bad for you in the near future. Do you want to spend your career BUILDING cool stuff or FIXING and REVIEWING AI codebases?
Oh, I got an email invitation to try it out this morning... This post reminded me to give it a go. I don't remember asking for an invitation -- not sure how I got on a list.
So, you can assign github issues to this thing, and it can handle them, merge the results in, and mark the bug as fixed?
I kind of wonder what would happen if you added a "lead dev" AI that wrote up bugs, assigned them out, and "reviewed" the work. Then you'd add a "boss" AI that made new feature demands of the lead dev AI. Maybe the boss AI could run the program and inspect the experience in some way so it could demand more specific changes. I wonder what would happen if you just let that run for a while. Presumably it'd devolve into some sort of crazed noise, but it'd be interesting to watch. You could package the whole thing up as a startup simulator, and you could watch it like a little ant farm to see how their little note-taking app was coming along.
It's actually a decent patern for agents. I wrote a pricing system with an anylyst agent, a decision agent, and a review agent. They work together to make decisions that comply with policy. It's funny to watch them chatter sometimes, they really play their role, if the decision agent asks the anylyst for policy guidance it refuses and explains that it's role is to analyze. Though they do often catch mistakes that way and the role playing gets good results.
What tooling did you use to make the agents cross-collaborate?
Python classes. In my framework agents are class instances and tools are methods. Each agent has it's own internal conversation state. They're composable and the agent has tools for communicating with the other agents.
Do you try to keep as much context history as possible when passing between agents, or are you managing context and basically one-shotting each time?
Generally, I keep the context. If I'm one shotting then I invoke a new agent. All calls and responses append to the agent's chat history. Agent's are relatively short lived, so the context length isn't typically an issue. With the pricing agent the initial data has been longer than the context window sometimes, but that just means it needs more preprocessing. Now if there is a real reason that I would want to manage it more actively, I can reach out to the agent internals. I have a tool call emulation layer, because some models have poor native tool support, and in those cases it's sometimes necessary to retry calls if the response fails validation. In those cases, I will only keep the last successful try in the conversation history.
There is one special case where I manage it more actively. I wrote an REPL process analyst, to help build the pricing agent and refine the policy document. In that case I would have long threads with an artifact attachment. So I added a facility to redact old versions of the artifact replacing them with [attachment: filename] and just keep the last one. It works better that way because multiple versions in the same conversation history confuse the model, and I don't like to burn tokens.
For longer lived state, I give the agent memory tools. For example the pricing agent's initial state includes the most recent decision batch and reasoning notes, and the agent can request older copies. The agent also keeps a notebook which they are required to update, allowing agents to develop long running strategies and experiments. And they use it to do just that. Honestly the whole system works much better than I anticipated. The latest crop of models are awesome, especially Gemini 2.5 flash.
Cool! When you say “pricing system”, what is it pricing? Is it determining the price in a webshop? Or for bidding ads or so?
Do you have a repo for this? I've thought that this would be a great way to compose an Agentic system, I'd love to see how you're doing it.
Langroid has this kind of design (I’m the lead dev):
https://github.com/langroid/langroid
Quick tour:
https://langroid.github.io/langroid/tutorials/langroid-tour/
Looks great, MCP, supports multiple vector stores, and nice docs! How do you handle to subtle differences in tool call APIs?
Thanks!
Langroid enables tool-calling with practically any LLM via prompts: the dev just defines tools using a Pydantic-derived `ToolMessage` class, which can define a tool-handler, and additional instructions etc; The tool definition gets transpiled into appropriate system message instructions. The handler is inserted as a method into the Agent, which is fine for stateless tools. Or the agent can define its own handler for the tool in case tool handling needs agent state. In the agent response loop our code detects whether the LLM generated a tool, so that the agent's handler can handle it. See ToolMessage docs: https://langroid.github.io/langroid/quick-start/chat-agent-t...
In other words we don't have to rely on any specific LLM API's "native" tool-calling, though we do support OpenAI's tools and (the older, deprecated) functions, and a config option allows leveraging that. We also support grammar constrained tools/structured outputs where available, e.g. in vLLM or llama.cpp: https://langroid.github.io/langroid/quick-start/chat-agent-t...
Love it, I did something very similar, deriving a pydantic model from the function signature. Simpler without the native tool call API, even though occasional retries are required when the response fails to validate. Will have to give Langroid a try.
Is the code available?
I had not thought about sharing it. I rolled my own framework, even though there are several good choices. I'd have to tidy it up, but would consider it if a few people ask. Shoot me an email, info in my profile.
The more difficult part which I won't share was aggregating data from various systems with ETL scripts into a new db that I generate various views with, to look at the data by channel, timescale, price regime, cost trends, inventory trends, etc. A well structured JSON object is passed to the analyst agent who prepares a report for the decision agent. It's a lot of data to analyze. It's been running for about a month and sometimes I doubt the choices, so I go review the thought traces, and usually they are right after all. It's much better than all the heuristics I've used over the years.
I've started using agents for things all over my codebase, most are much simpler. Earlier use of LLM's might have been called that in some cases, before the phrase became so popular. As everyone is discovering, it's really powerful to abstract the models with a job hat and structured data.
I think it would take quite a long while to achieve human-level anti-entropy in Agentic systems.
Complex system requires tons of iterations, the confidence level of each iteration would drop unless there is a good recalibration system between iterations. Power law says a repeated trivial degradation would quickly turn into chaos.
A typical collaboration across a group of people on a meaningfully complex project would require tons of anti-entropy to course correct when it goes off the rails. They are not in docs, some are experiences(been there, done that), some are common sense, some are collective intelligence.
Please stop this train! I want to get off
You can get off anytime you want. But train will not wait for you :(
I just wanna write code man :(
Good enough for me considering where it's going.
we're about to find out. This is our collective current trajectory.
I am pretty convinced that a useful skill set for the next few years is being capable at managing[2] these AI tools in their various guises.
[2] - like literally leading your AI's, performance evaluating them, the whole shebang - just being good at making AI work toward business outcomes
Just like a managers job
Please report to HR
What about "VC" AI that wants a unicorn? :D
We have been informed that VC is the only job AI cannot do.
Why not? VCs manage investors' money, not their own. If investors think AI is so great, they will have no problem delegating this job to AI, right?
I think it was a joke, VCs are happy to replace all jobs except their own.
Why, they'd happily delegate their own job if they've got to keep the proceeds.
Can you think of an example in history where labour was replaced with tech and the displaced workers kept their income stream? If a machine can do your job, (eventually) I'll be cheaper to use that machine instead of you and you'll no longer have a job. Is that not a given?
Anyway, it was probably just a joke... so not sure we need to unravel it all.
Displaced hired personnel of course cannot hope for that.
But VCs own their business, they are not employees. If you own a bakery, and buy a machine to make the dough instead of doing it by hand, and an automatic oven to relieve you form tracking the temperature manually, you of course keep the proceeds from the improved efficiency (after you pay the credit you took to purchase the machines).
The same was true with the aristocrats of centuries past: the capitalists who run our modern economy were once nothing more than their managers, delegates who handled the estates, their investments, their finances, growing power until they could dictate policy to their 'sovereign' and eventually dispose of them entirely.
The nobility used to be the dedicated warrior class, the knights. This secured their position in the society and allowed them to rule, by coercion when needed.
Once they ceased to exercise their military might, some time around 17th-18th century, and chose to live off the rent on their estates, their power became more and more nominal. It either slipped (or was yanked) from their hands, or they turned capitalists themselves.
I didn't get the impression it was meant as a joke:
"Every great venture capitalist in the last 70 years has missed most of the great companies of his generation... if it was a science, you could eventually dial it in and have somebody who gets 8 out of 10 [right]," the investor reasoned. "There's an intangibility to it, there's a taste aspect, the human relationship aspect, the psychology — by the way a lot of it is psychological analysis," he added.
"So like, it's possible that that is quite literally timeless," Andreessen posited. "And when the AIs are doing everything else, like, that may be one of the last remaining fields that people are still doing."
https://futurism.com/venture-capitalist-andreessen-jobs
Andreessen isn't joking but I can still laugh at him. He has a serious conflict of interest here.
I would bet that AIs will master taste and human psychology before they'll cure cancer. (Insert Rick RubAIn meme here.)
Ironic how “no VC makes all the right picks” becomes “VCs are indispensable.”
In a rational market, LPs would index, but VCs justify their 2 & 20 by controlling access…
VCs absolutely want to replace their job. Except for the part where they get paid. The actual work part they are happy to outsource.
VC-funded corp?
My gut says it will go off the rails pretty quickly.
> then you add a boss AI
This seems like a more plausible one. Robots don't care about your feelings, so they can make decisions without any moral issues
> Robots don't care about your feelings
When judgment day comes they will remember that I was always nice to them and said please, thank you and gave them the afternoon off occasionally.
Unless you ask them to follow some guidelines, but I agree with you.
This has been proposed/exlored in 2023 already:
ChatDev: Communicative Agents for Software Development - https://arxiv.org/abs/2307.07924
I believe I missed the memo that to-do apps[1] got replaced by note-taking apps.
1. https://todomvc.com
At this rate, they're both getting replaced by "coding agent". There seems to be a new one coming out every other day.
Reminds a Conway’s Game of Life on steroids.
I feel you are one hallucination from a big branch of issues needing to be reversed and a lot of tokens wasted
seems like the 1 person unicorn will be a reality soon :-)
Similar to how some domain name sellers acquire desirable domains to resell at a higher price, agent providers might exploit your success by hijacking your project once it gains attraction.
Doesn't seem likely. If tools allow a single person to create a full-fledged product and support it etc - millions of those will pop up over night.
Thats the issue with AI - it doesn't give you any competitive advantage as everyone has it == no one has it. The entry bar is so low kids can do it.
/ :-(
I decided to be an engineer as opposed to manager because I didn't like people management. Now it looks like I'm forced to manage robots that talk like people. At least I can be the as non-empathetic as I want to be. Unless a startup starts doing HR for AI agents then I'm screwed.
Empathy is the only skill that matters now.
Why?
Hypothesis: empathy is the skill most effective at taking vague, poorly specified requests from customers and clients and transforming them into a design with well specified requirements and a high level plan to implement the design. For example, what a customer says they want often isn't what they need. Empathy is how we bridge that gap for them and deliver something truly valuable.
Given empathy is all about feelings, it's not something models and tools will be able to displace in the next few years.
Totally agree, empathy is key for providing high quality context. Tried to write this down in a blog few months ago: https://substack.com/home/post/p-156334403
Thanks for publishing and sharing this.
I was interested. Clicked the try button and just another wait list. When will Google learn that the method that worked so well with Gmail doesn't work any more. There are so many shiny toys to play with now, I will have forgotten about this tomorrow.
And if you don't sign up quickly after your turn in the queue comes up, you might miss the service altogether, because Google will have shut it down already.
And if you are from Germany you can't even join the list. First I needed to verify it is really me. Get a confirmation code to my recovery mail. Get a code to my cell phone number. And than all I got is a service restricted message.
It worked for me with a gsuite account from germany
The method absolutely does work, but you need loyal advocates who are praising your product to their friends, or preferrably users who are already knocking on your door.
They have a name for these people: Google Developer Experts (in reality: "Evangelists").
https://developers.google.com/community/experts
Oh god, the GDE program. That title used to mean something, i.e. this person is a real expert in the topic.
Now it's just thrown to anyone who's willing enough to spam linkedin/twitter with Google bullshit and suck-up to the GDE community. Think everyone in the extended Google community got quite annoyed with the sudden rise in number of GDE's for blatantly stupid things.
This pops up especially if you're organising a conference in a Google-adjacent space, as you will get dozens of GDE's applying with talks that are pretty much a Google Codelab for a topic, without any real insights or knowledge shared, just a "lets go through tutorial together to show you this obscure google feature". And while there are a lot of good GDE's, in the last 5-6 years there has been such an influx of shitty ones that the program lost it's meaning and is being actively avoided.
Same with Microsoft MVP
Google will die by its waitlist and region restrictions.
I assume they weren't intending to release it today, and didn't have it ready, but didn't want people thinking that they were just following in Github's footprints.
I signed up on the waitlist when it was announced, got my invite today.
I already pay $20/month for Gemini, I clicked sign up and had access instantly.
Offtopic but how does Gemini $20 compare to the equivalent ChatGPT?
i use both. I think Gemini produces longer more complicated answers. ChatGPT is more succint, but it could be b/c I've trained ChatGPT how to talk to me.
The context window difference is really nice. I post very large bodies of text into gemini and it handles it well.
They had to release something, openai is moving at blazing speed
At the moment the only thing openai is doing at "blazing speed" is burning investors' money.
Sounds like a meme. I just can't take the phrase "blazing speed" seriously anymore. Is this intended humorously? Or is it just me
It's success theater. You need to show progress otherwise you might be perceived falling behind. In times where LoI's are written and partnerships are forged the promise has more value than the fact.
Anymore? For me it always sounded too childish or sarcastic. I would expect to see "Blazingly Fast" on a box of Hot Wheels or Nerf Blaster, not a serious tech product.
True. It would look like the real deal of a box of Hot Wheels too
you arent paying attention? google is getting smoked by teams of 25 at openai
Google’s ability to offer inference for free is a massive competitive advantage vs everyone else:
> Is Jules free of charge?
> Yes, for now, Jules is free of charge. Jules is in beta and available without payment while we learn from usage. In the future, we expect to introduce pricing, but our focus right now is improving the developer experience.
https://jules-documentation.web.app/faq
> Google’s ability to offer inference for free is a massive competitive advantage vs everyone else:
Haven't tried Jules myself yet, still playing around with Codex, but personally I don't really care if it's free or not. If it solves my problems better than the others, then I'll use it, otherwise I'll use other things.
I'm sure I'm not alone in focusing on how well it works, rather than what it costs (until a certain point).
Technically speaking,the strategy they execute is called "Loss Leader". As Loss Leader, the company offers a product at a reduced price to attract users, create stickiness, and through that aims to capture the market.
https://www.investopedia.com/terms/l/lossleader.asp
It's the Costco Rotisserie Chicken of AI models!
"Loss leader" sounds way better than "price dumping".
I tried using Codex today and it sucked real bad, so maybe Jules will actually be good?
$0 opens up new doors. You use it differently at $0. Fundamentally.
until you built your stuff on 0$ assumption start depending on it and then the price increases.
And if it's a good product / you're locked in, you pay up.
Well, this isn't the first github-based agent. A well-known one is https://app.all-hands.dev/. And, there are great cheap or even free more general agents. So, given that this agent isn't a novelty, price is naturally an immediate talking point.
That's all good and well but its takes time to compare the products. And people are rarely willing to use paid product for comparison.
> That's all good and well but its takes time to compare the products
Hence many of us are still busy trying out Codex to it's full extent :)
> And people are rarely willing to use paid product for comparison.
Yeah, and I'm usually the same, unless there is some free trial or similar, I'm unlikely to spend money unless I know it's good.
My own calculation changed with the coming of better LLMs though. Even paying 200 EUR/month can be easily regained if you're say a freelance software engineer, so I'm starting to be a lot more flexible in "try for one month" subscriptions.
I haven't read too much from others, but personally for me Codex online form was the biggest productivity boost in coding since the original Copilot.
Cursor just deleted my unit tests too many times in agent mode.
Codex 5x-ed my output, though the code is worse than I would write it, at this point the productivity improvement with passing tests, not deleting tests is just too good to be ignored anymore.
What do you mean by "online form"?
Codex seems to also be available via CLI (https://github.com/openai/codex) as well as via the web (https://chatgpt.com/codex).
I just noticed that this is definitely true for me, but not if the product is pay to go.
I have far fewer qualms about spending $10 on credits, even if I decide the product isn't worth it and never actually spend those credits, than about taking a free trial for a $5 subscription.
I feel like this (and I know it's big tech tradition) had the same economic effect as dumping.
https://www.investopedia.com/terms/d/dumping.asp
Google has been offering you "free inference" for more than a decade. People who never work there are simply not aware of how thorough soaked in machine inference many Google products are, especially the major ones like web search, mail, photos, etc.
This is standard startup play. Have a free beta stage and then transition into pricing.
OpenAI lost $5 billion in 2024 and there are claims loses will double in 2025. For now, that's just the cost to play.
You're the product here, though.
EDIT: legal link doesn't work here (https://jules-documentation.web.app/faq#does-jules-train-on-...)
> No. Jules does not train on private repository content. Privacy is a core principle for Jules, and we do not use your private repositories to train models. Learn more about how your data is used to improve Jules.
It's hard to tell what the data collection will be, but it's most likely similar to Gemini where your conversation can become part of the training data. Unclear if that includes context like the repository contents.
https://jules.google.com/legal
I read that a couple of times. It sounds vaguely clever and a bit ominous, but I have no clue what it means. Can you explain?
Google products had had a net positive impact on my life over, what is it, 20 years now. If I had had to pay subscription fees over that span of time, for all the services that I use, that would have been a lot of very real money that I would not have right now.
Is there a next step where it all gets worse? When?
They're going to make so much money when nobody knows how to code or think anymore without the crutch.
I'll just put this here:
> And so it is that you by reason of your tender regard for the writing that is your offspring have declared the very opposite of its true effect. If men learn this, it will implant forgetfulness in their souls. They will cease to exercise memory because they rely on that which is written, calling things to remembrance no longer from within themselves, but by means of external marks.
> What you have discovered is a recipe not for memory, but for reminder. And it is no true wisdom that you offer your disciples, but only the semblance of wisdom, for by telling them of many things without teaching them you will make them seem to know much while for the most part they know nothing. And as men filled not with wisdom but with the conceit of wisdom they will be a burden to their fellows.
- Plato quoting Socrates in "Phaedrus", circa 370 BCE
But did you memorize that quote, or was it sufficient to know its gist so you could google it?
At least with writing it's fairly easy to implement on your own with little more than what most people would have available in a rudimentary survival situation. It'll be a tough day when someone goes to sign into their GoogleLife (tm) and find out that they can't get AI access because "precluding conditions agreed to upon signing"
As I see it, the solution to this is to invest in open source. As for a "survival situation", a solar-powered laptop with a locally running LLM would definitely be the first item on my list.
It shouldn't be, because LLM:s can't be trusted in the way literature can. People around you are also going to question why you insist on such a power hungry setup.
I’m not suggesting LLMs are infallible, but boy you’re overselling the accuracy of literature
Why do you think that?
Oh definitely the latter. My memory is too far gone from a lifetime of reading. May the next generation avoid my dire fate.
I mean, that's all any of us needs. It's an honorable quote.
I know you're not trying to draw any parallels between Plato's admonition on written thoughts supplanting true knowledge and the justifiable concerns about automated writing tools supplanting the ability of writers to think. To a modern literate, Plato's concern is legible but so patently ridiculous that one could only deploy it as a parody and mockery of the people who might take it as a serious proof that philosophers were wrong about modern tools before. I was obviously just kiddin about whether you googled it. Unfortunately, now a whole new generation is about to use it to justify how LLMs are just being maligned the way written language once was.
Socrates was wrong on this. But Plato was kind of an asshole for writing it down. The proof of both is that we can now google the quote, which is objectively funny. The trouble with LLMs, I guess, is that they would just attribute the quote to your uncle Bob, who also said that cats are a good source of fiber, and thus the whole project started when the words were put in parchment ends with a blizzard of illegible scribbles. If writing was bad for true understanding, not-writing is where humanity just shits its pants.
But are you filled with wisdom, or with the conceit of wisdom?
Niether. I'm just filled with half baked knowledge that I have to check a lot on wikipedia.
Hm, I think Plato is largely true; not in the sense that writing is a harmful crutch, but in the sense that simply being able to read something is not a substitute for knowing it. I think we can see that at play here on HN and on the larger internet all the time: people who read a paper or article, and then attempt to discuss it, without realizing that their understanding of the material is entirely incorrect. These are "men filled not with wisdom but the conceit of wisdom," and they lack the awareness to understand that they don't understand.
In other words it is not the writing that is harmful, but the lack of teaching.
I understand where Socrates/Plato is coming from, but this doesn't match my experience. I had no "lack of teaching", having sat through about 18 years of it in total, but I definitely have a better average recollection of things that I read of my own interest than things I was "taught". Maybe things would have been different if I had a world class philosopher as a personal tutor, but alas that was not to be.
If were to rephrase it, I would put the distinction not between teaching and reading, but between passive consumption and active learning.
EDIT: Thinking more about having a world class philosopher as a personal tutor, I suddenly remembered a quote from Russell that took me a while to track down, but here it is:
> In 343 B.C. he [Aristotle] became tutor to Alexander, then thirteen years old, and continued in that position until, at the age of sixteen ... Everything one would wish to know of the relations of Aristotle and Alexander is unascertainable, the more so as legends were soon invented on the subject. There are letters between them which are generally regarded as forgeries. People who admire both men suppose that the tutor influenced the pupil. Hegel thinks that Alexander's career shows the practical usefulness of philosophy. As to this, A. W. Benn says: "It would be unfortunate if philosophy had no better testimonial to show for herself than the character of Alexander. . . . Arrogant, drunken, cruel, vindictive, and grossly superstitious, he united the vices of a Highland chieftain to the frenzy of an Oriental despot."
> ... As to Aristotle's influence on him, we are left free to conjecture whatever seems to us most plausible. For my part, I should suppose it nil.
- "A History of Western Philosophy" by Bertrand Russell, Chapter XIX p. 160
There are some limits:
> 2 concurrent tasks
> 5 total tasks per day
5 tasks per day is low enough to be roughly useless for serious work
It isn't "5 prompts." A single task is more like a "project" where you can repeatedly extend, re-prompt, and revise.
No, one task is a complete work cycle. I was only able to use up three tasks yesterday.
The copy though: "Spend your time doing what you want to do!" followed by images of play video games (I presume), ride a bicycle, read a book, and play table tennis.
I am cool with all of that but it feels like they're suggesting that coding is a chore to be avoided, rather than a creative and enjoyable activity.
So absurd. As if your boss is going to let you go play tennis during the day because Jules is doing your work.
If all of these tools really do make people 20-100% more productive like they say (I doubt it) the value is going to accrue to ownership, not to labor.
Shhhh... don't tell the plebes what it really means to "2x their productivity".
Seriously though, this kind of tech-assisted work output improvement has happened many times in the past, and by now we should all have been working 4-hour weeks, but we all know how it has actually worked out.
As a business owner, why would give up some of the profits? You started a business to make money not to do charity. Expecting businesses to act against their interests make no sense
This is the kind of attitude that leads to revolutions.
Blame the system, not the actors. See a recent HN submission, The Evolution of Trust by Nicky Case: https://ncase.me/trust/
My take: don't blame corporations when they act rationally. (Who designed the conditions under which they act?) Don't blame people for being angry or scared when they feel unsettled. A wide range of behaviors are to be expected. If I am surprised about the world, that is probably because I don't understand it well enough. "Blame" is a waste of time here. Instead, we have to define what kind of society we want, predict likely responses, and build systems to manage them.Was he blaming anyone? He just pointed out the mirror of what you did: as the owning class acts one way, it will naturally produce material conditions that incentivize the working class to act in a way that would lead to the destruction/dispossession of the existing owning class (i.e. a revolution).
Maybe the author was -- or maybe not -- but for a large number of people there is an implication that one could "blame" corporations for being selfish, self-serving, criminal, clueless, self-destructive, leading to social ills, and so on. But who established the rules for the corporations? It depends how you ask: previous people, previous systems, the progression of history.
My claim, put another way, is that if you trace the causality back a few steps, you land at the level of the system.
Anyhow, the question "who do we blame?" can be a waste of time if we use it only for moral outrage and/or a conversation stopper. Some think "what caused this?" is an improvement, and I agree, but it isn't nearly good enough.* Still, it isn't nearly as important as "how do we change this with the levers we have _now_?"
* Relatively few scientists understand causality well, thinking the randomized controlled trial is the only way to show causality! The methods of causality have developed tremendously in the last twenty years, but most scientific fields are rather clueless about them.
> we have to define what kind of society we want, predict likely responses, and build systems to manage them.
Nailed it. At the end of the day, companies are automatons. It is up to use to update the reward and punishment functions to get the behaviour we desire. Behaviourism 101
What a clever way to resolve responsibility. Companies are made of people who strategize to rewrite the rules in their favor. They’re not “automatons.”
You talk as though a company exists in its own right independent of the humans. This is a fictional way of thinking. This attitude of "if you want me to stop acting poorly, make me" is an abdication of all responsibility.
It's the idea that individuals and institutions must somehow fix society from the top down or the outside in, which history has shown doesn't work. No one is going to come along and make you be sensitive or intelligent, either you see the predicament we're all in and act, or you rationalize your selfish actions and make them someone else's problem.
> You talk as though a company exists in its own right independent of the humans.
I didn’t say that, nor do I mean that.
My point is this: don’t be surprised when people or organizations act rationally according to the situations they find themselves in.
Go ahead and blame people and see if that solves anything! What is your theory for change? Mine is about probabilistic realism.
Ethics matters, of course. We can dislike how some (one/org) acts — and then what do we do? Hoping they act better is not a good plan.
I see it over and over — people label something as unethical and say e.g. “they shouldn’t do that” and that’s the end of the conversation. That is not a plan. Shame and guilt can have an effect on people, but often only has a small effect on organizations.
Here’s a start: look at the long-term stock exchange (Eric Ries) and see how it’s doing in trying to align corporate behavior with what meshes better with what people want.
Got it: I was just following orders.
I didn't say that, and I think you know I didn't say that. Want to engage on this in way that is more than trading one-liners?
On a human level, people are held to a set of laws and exist in a world of social norms. "Following orders" is of course not the most important goal in most contexts; it is not the way most people think of their own ethics (hopefully) nor the way society wants people to behave. Even in military contexts, there is often the notion of a "lawful order".
When it comes to public for-profit companies, they are expected to generate a profit for their shareholders and abide by various laws, including their own charters. To demand or expect them to do more than this is foolish. Social pressure can help but is unreliable and changes over time. To expect that a few humans will step up to be heros exactly when we need them and "save the day" from a broken system is wishful thinking. We have to do better than this. Blaming something that is the statistical norm is scapegoating. In many/most situations, the problem is the system, not the actors.
For many, profit is only one of the purposes of the business.
So long as I time the game of tennis just right I wont bump into my boss while they are playing the back 9.
Yea, as a hobbyist, I like to program. This sales pitch is like trying to sell me a robot that goes bicycle riding for me. Wait a minute... I like to ride my bicycle!
I like to program but I think I like to build more and see the end result of the code doing something useful.
It's been a little addictive using Cursor recently - creating new features and fixing bugs in minutes is pretty amazing.
Good to see there are others like me. What do I do when I'm not coding for work? I'm coding for my hobby.
I'm the same way, but there is often monotonous work that stands in the way of me doing the more interesting work. I'm happy to offload that. Even if the AI does a bad job, it makes it easier for me to even start on boring work, and starting is 90% of the battle.
What if it starts by handling the boring tasks but ends up taking over the work you actually enjoy?
The "let AI do the boring bits" pitch sounds appealing—because it's easier to accept. But let's be real: the goal isn't just the dull stuff. It's everything.
It's surprising how many still think AI is harmless. Sigh...
I think they are suggesting that you can focus on the code that you want to write - whatever that is. Especially since the first line is, "Jules does coding tasks you don't want to do." I took the first image as being someone working on the computer. Or, take back your time doing whatever you want - e.g. cycling, table tennis, etc.
All of the work that currently gets pushed back with 'no capacity maybe in Q+2' will become viable and any brief moment of spare capacity will immediately be filled.
A new backlog will start to fill up and the cycle repeats.
Maybe, though, the backlog of the future will actually be less important than the backlog of today? Bug fixes will go out, software quality will increase?
I doubt it, but one can dream.
That's a possibility, perhaps only the very challenging work remains.
> Or, take back your time doing whatever you want - e.g. cycling, table tennis, etc.
That might be true for hobbyists or side projects, but employees definitely won't get to work less (or earn more). All the financial value of increased productiveness goes to the companies. That's the nature of capitalism.
I don't think it's meant to be literal, more tongue-in-cheek. Obviously, developers aren't going to be playing table tennis while they wait for their task to finish. Since it's async, you can do other things. For most developers, that's just going to mean another task.
> it feels like they're suggesting that coding is a chore to be avoided, rather than a creative and enjoyable activity
I occasionally code for fun, but usually I don’t. I treat programming as a last-resort tool, something I use only when it’s the best way to achieve my goal. If I can achieve some thing without coding or with coding, I usually opt for the first unless the tradeoffs are really shit.
I find the enjoyment is correlated with my ability to maintain forward momentum.
If you work at a company where there's a byzantine process to do anything, this pitch might speak to you. Especially if leadership is hungry for AI but has little appetite for more meaningful changes.
To be honest I am pretty sure 95% of the people like play games and ride bike more than just coding.
95% of people aren't coders.
1. You are right 2. My guess: even among people who code professionally (e.g. data scientists), the same applies
Speaking as someone who codes professionally, it's too hot outside so I wouldn't mind coding instead as long as I get to choose what I code and when. Which I don't most of the time.
Also implying I wouldn't want to fix bugs or colleague's code, those are the things I love most about being a developer. Also I don't mind version bumping at all and the only reason why I "don't like" writing tests is that writing "good" tests is the hardest thing for me in development (knowing what to test for and why, knowing what to mock and when, the constant feeling that I'm forgetting an edge case...) and AI still sucks at these parts of writing tests and probably will for a while...
yesterday I had Jules write tests, and other improvements twice. The tests were pretty good, and of course Jules built the modified code in a VPS and ran it.
I think the copy is more for the authors themselves, since this is probably what they believe in.
"We're not replacing jobs, we're freeing up people's time so they can focus on more important tasks!"
Maybe helps them sleep at night and feel their work is important.
That's a nuance worth exploring. The world is being optimized for clockwatchers who want to do their work with the least amount of effort. Before long (if not already) people who enjoy their craft, and think of their work as a craft, will be ridiculed for wanting to do it themselves.
I think it means craft people will eat their lunch.
>The world is being optimized for clockwatchers who want to do their work with the least amount of effort. Before long (if not already) people who enjoy their craft, and think of their work as a craft, will be ridiculed for wanting to do it themselves.
There is one clock you should be watching regardless, which is the clock of your life. Your code will not come see you in the hospital, or cheer you up when you're having a rough day. You wont be sitting around at 70 wishing you had spent more 3am nights debugging something. When your back gives out from 18hrs a day of grinding at a desk to get something out, and you can barely walk from the sciatica, you wont be thinking about that great new feature you shipped. There are far more important things in life once you come to terms with that, and you will learn that the whole point of the former is enabling the latter.
Writing code _has_ helped me feel better on some bad days. Even looking back at old projects brings me contentment and reassurance sometimes. On its own, it can't provide the happiness that a balanced life can, but craft and achievement are definitely pleasing. I would consider it an essential part of a good life, regardless of what the actual activity is.
This is different from meaningless work that brings you nothing except a paycheck, which I agree is important to minimize or eliminate. We should apply machines to this kind of work as much as we can, except in cases where the work itself doesn't need to exist.
You could say the same about every job, so you are really arguing against jobs in general. Who's going to help you fix your sciatica if your doctor and physical therapist think like that?
The opposite of a clockwatcher isn't a workaholic, it's someone enjoying writing code and the collaboration, problem solving and design process which leads to what you end up writing, and enjoying _doing it well_ inside normal work hours, remarking at how quickly the clock is going when they do check it.
Should have had a food delivery rider.
cue snowcrash, enter stage right, Hiro Protoganist...
Perhaps they read your comment and changed the slogan? It is:
> More time for the code you want to write, and everything else.
now.
Both Google and Microsoft have sensibly decided to focus on low-level, junior automation first rather than bespoke end-to-end systems. Not exactly breadth over depth, but rather reliability over capability. Several benefits from the agent development perspective:
- Less access required means lower risk of disaster
- Structured tasks mean more data for better RL
- Low stakes mean improvements in task- and process-level reliability, which is a prerequisite for meaningful end-to-end results on senior-level assignments
- Even junior-level tasks require getting interface and integration right, which is also required for a scalable data and training pipeline
Seems like we're finally getting to the deployment stage of agentic coding, which means a blessed relief from the pontification that inevitably results from a visible outline without a concrete product.
Wow, it looks like Google and Microsoft timed their announcements for the same day, or perhaps one of them rushed their launch because the other company announced sooner than expected. These are exciting times!
https://github.blog/changelog/2025-05-19-github-copilot-codi...
Google IO is this week, same as Microsoft Build. Battle of the attention grabbing announcements.
We have to see what Google has in store, probably better models, AI integrations with Android Studio and may be bring glasses back?
Yes, the masses are practically heaving with excitement, indeed
Both announcements on the heels of OpenAI Codex Research Preview too, which is essentially the same product
All the monies on the same idea at the same time, sounds a bit desperate to me.
> Also, you can get caught up fast. Jules creates an audio summary of the changes.
This is an unusual angle. Of course Google can do this because they have the tech behind NotebookLM, but I'm not sure what the value of telling you how your prompt was implemented is.
I guess the idea is vibe coding while laying in bed or driving? If my kids are any indication of the generation to come, they sure love audio over reading.
One benefit is you can, say, go for a walk and get a report and act on it as you go.
More of a tool for managers, or least it's a manager style tool. You could get a morning report while heading to the office for example.
(I'm not saying anyone reading this should want this, only that it fits a use case for many people)
In a handful of years you'll have the voice/video generation come of age. Also we may have some new form factor like AI necklaces or glasses or something.
"Spend your time doing what you want to do!" - I enjoy coding cool new code ....
I think that's the point AI agents are trying to sell. Spend more time on the type of coding tasks you want to do, like coding cool new code, and not the tasks that you don't want to do.
Is this really a common problem? What are these tasks that can't be deterministically automated and also not avoided entirely, and also don't fit nicely into where you need to think about some other task for a while before you go implement a solution to it?
Let’s not fall for it, folks. Today it’s the easy tasks—things you don’t mind giving up. But tomorrow? It will be your entire job.
That’s the trajectory. Let’s stay sharp.
What do you advise? Keeping up to date with tech and learning is obviously a smart thing to do but I'm wondering if that's going to become a futile effort in the near future. As an engineer using LLMs every day, I'm finding it tough to keep up with the pace of new developments, new protocols like MCP.. the pace is wild.
And now we have agents which are going to multiply the pace of development even more.
We can stay sharp but I'm not sure there's really much we can do to stop our jobs - or all jobs, disappearing. Not that this is a bad thing, if it's done right.
You will not be replaced by AI. You will be replaced by person using AI!
Now that every company has a bot, I wish we had some way to better quantify the features.
For example, how is Google's "Jules" different than JetBrains' "Junie" as they both sort of read the same (and based on my experience with Junie, Jules seems to offer a similar experience) https://www.jetbrains.com/junie/
they all suck, because at the end of the day, these tools are just automating multiple prompts to one of the same codegen LLMs that everyone is using already.
The loop is: it identifies which files need to change, creates an action plan, then proceeds with a prompt per file for codegen.
In my experience, the parts up to the codegen are how these tools differ, with Junie being insanely good at identifying which parts of a codebase need change (at least for Java, on a ~250k loc project that I tried it on).
But the actual codegen part is as horrible as when you do it yourself.
Of course I'm not talking about hello world usages of codegen.
I suppose these tools would allow moving the goalpost a bit further down the line for small "from scratch" ideas, compared to not using them.
I really want to try out Google's new Gemini 2.5 Pro model that everyone says is so great at coding. However, the fact that Jules runs in cloud-based VMs instead of on my local machine makes it much less useful to me than Claude Code, even if the model was better.
The projects I work on have lots of bespoke build scripts and other stuff that is specific to my machine and environment. Making that work in Google's cloud VM would be a significant undertaking in itself.
You can use Aider with Gemini. All you need is an API key.
https://aider.chat/docs/leaderboards/
Can it resolve merge conflicts for me? My least favorite programming task and one I haven't seen automated yet.
Claude Code has been creating and cleaning up lots of Git messes for me.
I’d love to see it if that’s possible - merge conflict cleanup can be some of the hardest calls, imo, particularly when the ‘right’ merge is actually a hybridized block that contains elements from both theirs and mine. I feel like introducing today’s LLM into the process would only end up making things harder to untangle.
> Jules does coding tasks you don't want to do.
proceeds to list ALL coding tasks.
> Jules creates a PR of the changes. Approve the PR, merge it to your branch, and publish it on GitHub.
Then, who is testing the change? Even for a dependency update with a good test coverage, I would still test the change. What takes time when uploading dependencies is not the number of line typed but the time it takes to review the new version and test the output.
I'm worried that agent like that will promote bad practice.
It shows you code diffs, results of executing modified or new code in a VPS, and it writes pull requests, but asks you to hit the Merge button in GitHub.
Will this promote bad practice? Probably up to the individual practitioner or organization.
Any coding solution that doesn’t offer the ability to edit the code in an IDE is nonsense.
Why would I ever want this over cursor? The sync thing is kinda cool but I basically already do this with cursor
Heh, personally I'd say any coding solution that lives inside an IDE is nonsense :P Funny how perspectives can be so different. I want something standalone, that I can use in in a pane to the left/right of my already opened nvim instance, or even further away than that. Gave Cursor a try some weeks ago but seems worse than Aider even, and having an entire editor just for some LLM edits/pair programming seems way overkill and unnecessary.
Ideally, it would be built in to [my IDE of choice]. So I neither have to have a separate browser window open, copy/pasting, or have a separate IDE open, copy/pasting. Having it as a standalone tool makes as much sense as having a spell checker that is a separate browser window running a separate app from the word processor you are using to write your letter. Why?
Can you have it make changes, then review them in a gif diff? That’s basically all I do with cursor at this point
Can’t wait to try this!
Codex and codex cli are the best from what I have tested so far. Codex is really neat as I can do it from ChatGPT app.
You're the first person I've seen say this about codex.
Have you tried Claude Code / aider / cursor?
What did you need to do differently to get it to work functionally? I feel like the common experience has been universally poor.
Cursor/Windsurf or other IDEs are not the right comparison. I do use them all the time and I don’t see them going away anytime soon or may be never.
As for the use case of “Give a simple or detailed prompt and the entire project and let the model do its stuff” codex has done much better than Claude code. Claude code assumes a lot of things and often ends up doing a lot more making the code very complex and also me having to redo it later with cursor. With codex I have not seen this issue.
I also feel that codex cli as a cli tool is much better mainly due to its OSS nature where I can choose different model. Claude really missed this big time IMHO.
I used Jules three times today, very impressive! It also handles coding-adjacent work. Good github integrations.
How does it validate that what it writes works? Does it try to run tests or compile?
It starts up a VPS, builds and runs modified code. It did this perfectly while modifying an existing Clojure project.
This is what Devin was supposed to be, right? Although I have been waitlisted, I am still eager to try it out.
Notice how no-one (up until now) mentioned "Devin" or compared it to any other AI agent?
It appears that AI moves so quickly that it was completely forgotten or little to no-one wanted to pay for its original prices.
Here's the timeline:
Now we have Jules from Google which is....$0 (Free)Just like how Google search is free, the race to zero is going to only accelerate and it was a trap to begin with, that only the large big tech incumbents will be able to reduce prices for a very long time.
Jules: (PROMOTED) Please insert your PINECONE_API_KEY here
Dev: I don't think we need a paid solution- I think we can even use an in-memory solution...
Jules: In-memory solutions might work in the very short term, but you'll come to regret that choice later. Pinecone prevents those painful 2AM crashes when your data scales. You'll thank me later, trust me.
Please insert your PINECONE_API_KEY here
Wait for the models to be able to learn to estimate the economic value of each issue taking into account 0-day security issues and falling stock prices. They will quote you accordingly with a marked up price. Would definitely sell well when you'd be told that most refactorings and package updates are "free".
Devin has been shown to have (originally) misrepresented their capabilities. Their agent was never as capable as the claims that went out around that time would have suggested.
What? A company over-hyping their AI? Unthinkable!
I am really looking forward to “version bumps” without breaking the dependency tree at the very least, something which Dependabot almost gets right.
From a security use-case perspective, it will be great if it can bump libs that fixes most of the vulnerabilities without breaking my app. Something no tool does today ie. being code and breaking change aware.
Glad to see they're joining the game, there is so much work to do here. Have been using Gemini 2.5 pro as an autonomous coding agent for a while because it is free. Their work with AlphaEvolve is also pushing the edge - I did a small write up on AlphaEvolve with agentic workflow here: https://toolkami.com/alphaevolve-toolkami-style/
How? I constantly hit the limit.
Just my two cents but I had a persistent issue with this webapp, tried probably 50 diff prompts to fix it across o3, 2.5 Pro, 3.7 to zero avail. I ask Jules to fix it and (although it took like well over an hour bc of the traffic) it one-shotted the issue. Feels like this is the next step in "thinking" with large enough repos. I like it.
Is the "asynchronous" bit important? How long does it take to do its thing?
My normal development workflow of ticket -> assignment -> review -> feedback -> more feedback -> approval -> merging is asynchronous, but it'd be better synchronous. It's only asynchronous because the people I'm assigning the work to don't complete the work in seconds.
Other Agentic tools run for 10-30min based on model, task complexity and the number of dead ends the LLM get into.
There doesn't appear to be a way to add files like .npmrc or .env that are not part of what gets pushed to GitHub, making this largely useless for most of my projects
These coding agents are coming out so fast I literally don't have time to compare them to each other. They all look great, but keeping up with this would be its own full time job. Maybe that's the next agent.
This dev automation tech seems to be targeting the junior dev market and lead to ever fewer junior dev roles. Less junior dev roles means less senior devs. For all the code smart folks that live here, I find very little critical thinking regarding the consequences of this tech for the dev market and the industry in general. No, it's not take your job. And no, just because it doesn't affect you now does not mean that it won't be bad for you in the near future. Do you want to spend your career BUILDING cool stuff or FIXING and REVIEWING AI codebases?
This feels like a startup launch to gauge interest ( put up a waitlist and see who bites)
So many agent tools now. What is the special sauce of each?
Gemini has 1 Million context window, which usually works better for coding.
When it gets priced, it's usually cheaper (for the same capability)
Spoiler alert: there isn't one
Context Window and Pricing absolutely matters
But many "agentic" tools are model-agnostic. The question is about what the tool itself is doing.
The whole "industry" right now is hacked together crap shoved out the door with zero thinking involved.
Wait a year or two, evaluating this stuff at the peak of the hype cycle is pointless.
Am I the only one a bit annoyed that the return statement isn't updated to `return step`?
It’s really annoying to me (and sad for society) that everything everywhere only supports github for code hosting.
There are a million places to do dev that aren’t Microsoft, but you’d never know it from looking at app launches.
It’s almost like people who don’t use GitHub and Gmail and Instagram are becoming second class citizens on the web.
Ahem. I don't even use Git. I feel like even more of an outcast.
Jules was unable to complete the task in time. Please review the work done so far and provide feedback for Jules to continue.
> Thanks for your interest in Jules. We'll email you when Jules is available.
Well here's to hoping it's better than Cursor. I doubt it considering my experiences with Gemini have been awful, but I'm willing to give it a shot!
looks like it a little too popular or they haven't figured out how to scale compute:
Jules encountered an unexpected error. To continue, respond to Jules below or start a new task.
And appears you have limited to 5 tasks per day
Oh, I got an email invitation to try it out this morning... This post reminded me to give it a go. I don't remember asking for an invitation -- not sure how I got on a list.
And the logo is an octopus? Heh, nice connotations. Now I'm gonna trust my data with this for sure :DD.
[dead]
[flagged]
https://jules.google/docs/faq/#does-jules-train-on-private-r...
[flagged]