Astro/Solid - Hacker News

$et-al an hour ago

Given the timing, this is very likely a submarine article. Or as the kids call it these days: sponcon.

https://www.paulgraham.com/submarine.html

$astrange an hour ago

It is not a sponsored article, and he writes one of these every time a new model releases. Why would a professor at Wharton need to write sponsored Substack articles?

$0x1ceb00da an hour ago

"I don't care who the IRS sends I am not paying taxes!"

$gopalv an hour ago

> It worked for nine and a half hours.

> Again, it wasn’t perfect. As an expert, I was able to spot some errors and omissions (some as a result of the design I had asked for) that I had the AI correct

That's the bit that stuck out to me: nine and a half hours is longer than I would expect to work on a problem in a day, let alone go back and fix the output of something that has a core reward loop of hours.

My customers are currently clamoring to push my agent response times down from 85 seconds to below the 20-second mark.

At the same time, it is very dissonant to see the industry heading towards hour+ long workflows with an agent.

$hedgehog an hour ago

Work duration is also not that valuable a measure; you're usually better off defining the process yourself in code and having that code delegate chunks of work to the models. The only real issue is that it's harder to take advantage of the providers' subscription discounts, but on the other hand it's easier to do your own model routing, and I've seen no way for the normal chatbots to maintain coherence on streams of work measured in days and weeks.

$matneyx an hour ago

In Claude's defense (and I cannot believe I'm defending it), I know no single dev who could create what it did (Concord), from a 19-page design document, in 9.5 working hours.

We're gonna go back to the days where our bosses ask why we're just sitting around, but instead of saying "compiling," we'll just say, "waiting for Claude."

$neogodless an hour ago

For the rare uninitiated:

https://xkcd.com/303/

$PeterStuer an hour ago

My Opus 4.8 regularly works for 10+ minutes on a single non-trivial coding request.

$ASalazarMX 12 minutes ago

Your Opus 4.8? Is it now usual to refer to LLMs like that?

$root_axis an hour ago

I just can't stand this type of fawning language.

$asdK120 an hour ago

Mollick runs the Generative AI Lab at Wharton, with all the corporate sponsors.

He is a professor but sadly also an AI shill. He should switch to advertising washing powder.

$MostlyStable an hour ago

So...no engagement with the substance? Not even to explain why it is that this is not a useful description or test of capabilities? Ok.

$dthread3 an hour ago

I would like to see it do something useful, like converting pytorch to golang.

$cadamsdotcom 17 minutes ago

Why not get a plan from Anthropic and get that done yourself? It will probably cost you about as much as a coffee.

$lijok an hour ago

Hot damn - is that the floor of what you consider useful?

$fdsdfsdfzxczxc an hour ago

This newfangled car thing is useless. It can't even properly shoe a horse.

$whyenot an hour ago

Instead of attacking the author, please respond to the content of the article. That is the HN way, and it leads to more substantive and interesting discussions.

$recursivedoubts 43 minutes ago

Would it be possible for Mythos to make the space bar scroll the pages on your website properly?

$382hi an hour ago

I think Qwen 3.7-Plus is better at reasoning than Mythos, and I've used both for quite a while.

$the_doctah an hour ago

More Mythos Marketing.

What it feels like to work with Mythos