Quoting @belligerentbarbies

I'm worried that they put co-pilot in Excel because Excel is the beast that drives our entire economy and do you know who has tamed that beast?

Brenda.

Who is Brenda?

She is a mid-level employee in every finance department, in every business across this stupid nation and the Excel goddess herself descended from the heavens, kissed Brenda on her forehead and the sweat from Brenda's brow is what allows us to do capitalism. [...]

She's gonna birth that formula for a financial report and then she's gonna send that financial report to a higher up and he's gonna need to make a change to the report and normally he would have sent it back to Brenda but he's like oh I have AI and AI is probably like smarter than Brenda and then the AI is gonna fuck it up real bad and he won't be able to recognize it because he doesn't understand Excel because AI hallucinates.

You know who's not hallucinating?

Brenda.

@belligerentbarbies, on TikTok

Tags: generative-ai, ai, excel, hallucinations, llms, tiktok


Full-text filtering lets you train on any phrase

Today we’re launching text-based intelligence classifiers, a powerful new way to train NewsBlur to show you exactly what you want to read. You’ve always been able to train NewsBlur’s intelligence using story titles, authors, tags, and publishers. Now you can train on any phrase that appears in the full text of a story. This feature is available exclusively to NewsBlur Premium Archive subscribers.

Text-based classifiers work just like the intelligence training you’re already familiar with. Find a phrase you care about, mark it as something you like or dislike, and NewsBlur will automatically highlight or hide future stories containing that phrase. Stories with phrases you like are marked with a green focus indicator, while stories with phrases you dislike are hidden unless you choose to view them.

How to use text-based classifiers

Spot a phrase you want to see more of while reading a story? Simply select the text with your mouse or trackpad, then click the “Train” button that appears.

This opens the intelligence trainer where you can mark the selected text as something you like (thumbs up) or dislike (thumbs down). The text classifier appears at the top of the trainer dialog, ready for you to train.

Once you’ve trained a text phrase, NewsBlur will automatically scan the full text of every story from that feed. Stories containing your phrase will be highlighted with a green focus indicator in your story list, making them easy to spot. You can also see the phrase highlighted throughout the story content itself.

Real-world examples

Text-based classifiers shine when you subscribe to broad-interest feeds but only care about specific topics. Here are some examples:

  • Subscribe to a food blog that covers everything, but only want to read about vegan recipes? Train on “vegan” and similar terms.
  • Reading a tech blog that writes about many frameworks, but you only want stories about your favorite language? Train on that language name.
  • Following a news site with mixed content, but only interested in stories about a specific region or topic? Train on location names or topic keywords.

Since text classifiers work on the full article text and not just titles, they catch stories that might not mention your interest in the headline but discuss it in depth within the article.

Green always wins

Just like with other intelligence classifiers, green (focus) always wins. If a story matches both a phrase you like and a phrase you dislike, NewsBlur will mark it as focus and show it in your unread count. This ensures you never miss a story about something you care about, even if it also contains topics you’re less interested in.
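The precedence rule is simple enough to sketch in code. This is purely illustrative, not NewsBlur's actual implementation; the function name, phrase lists, and score values are all invented for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// scoreStory applies the "green always wins" rule: any liked phrase
// found in the full text marks the story as focus (+1), regardless of
// disliked matches; otherwise any disliked phrase hides it (-1);
// otherwise the story stays neutral (0).
func scoreStory(fullText string, liked, disliked []string) int {
	text := strings.ToLower(fullText)
	for _, p := range liked {
		if strings.Contains(text, strings.ToLower(p)) {
			return 1 // green: focus, shown in the unread count
		}
	}
	for _, p := range disliked {
		if strings.Contains(text, strings.ToLower(p)) {
			return -1 // red: hidden unless explicitly shown
		}
	}
	return 0
}

func main() {
	liked := []string{"vegan"}
	disliked := []string{"barbecue"}
	// A story matching both a liked and a disliked phrase stays green.
	fmt.Println(scoreStory("Vegan barbecue jackfruit recipe", liked, disliked))
}
```

Note that the liked phrases are checked first and short-circuit: a story never has to be "net positive", one green match is enough.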

You can view your focus stories by choosing between Unread and Focus at the bottom of the feed list. Set it to Focus to show only green stories and see everything NewsBlur knows you want to read.

Why Premium Archive only?

Text-based classifiers require scanning the full article content of every story, not just the RSS feed excerpt. The Premium Archive subscription ensures every story is fetched, archived, and available for full-text search and classification. This means your text classifiers work on every story from every feed you subscribe to, with no gaps in coverage.

The Premium Archive subscription also includes unlimited story archiving, the ability to mark any story as unread forever, full-text search across your entire archive, and the discover stories feature for finding related content across all your feeds.

Available now on the web

Text-based classifiers are available now to all Premium Archive subscribers on the web. Simply highlight any phrase in a story, click the “Train” button, and start training. iOS and Android support is coming soon.

If you’re not yet a Premium Archive subscriber and want to unlock text-based intelligence training along with unlimited archiving and advanced search, you can upgrade directly on the web.

As always, we’d love to hear your feedback on the NewsBlur forum. For every person who shares their thoughts, there are a dozen others thinking the same thing, so your input helps shape where NewsBlur goes next.


The problem of reaching some moderate level of fame and success is that everyone...

The problem of reaching some moderate level of fame and success is that everyone thinks you’re great ✨, all the time. Your life must be glamorous. Nobody ever checks on you. Nobody ever compliments your work. You’re never the award or prize winner. Nobody mentors you. Never again. You must be ✨ super.


notes on the industrialisation of decision making

(I’m working on a new book, tentatively entitled “Crisis v Data”, trying to make the case that crises are, definitionally, failures of the world to match our mental model of it and thus [various important conclusions about policy, AI, things in general]. This is what I’m typing at the moment; it may or may not make it into the book, but it’s the sort of bridge from The Unaccountability Machine to what I’m doing now. Long term readers might recall similar things from eighteen months ago)

The subject that’s in the background of this book is what might be called “the industrialisation of decision making”. Over the last century or so, the complexity of the modern developed world has got definitively too great for any one human being, or even any small group of human beings to manage and hold in their head. These days, even most companies, government departments and similar organisations have to deal with more variety, more information than anyone could reasonably expect to process. Which is a problem, because processing information, and making decisions, is the whole problem of management, governance and life itself.

We have coped with this by building systems. I wrote about this a lot in “The Unaccountability Machine”, but the idea is that there is an analogy between the production of manufactured goods since the Industrial Revolution, and what we’ve done to management. Rather than trying to make decisions one by one, treating each case as an individual, we have used the central insight of the factory system and the division of labour; standardisation and specialisation.

Consider, for the moment, the famous description by Adam Smith of a pin factory. Rather than having each worker sit around producing pins, the process of pin-making is broken up into simpler and repeatable tasks. As Smith said:

“One man draws out the wire, another straights it, a third cuts it, a fourth points it, a fifth grinds it at the top for receiving the head; to make the head requires two or three distinct operations; to put it on is a peculiar business, to whiten the pins is another; it is even a trade by itself to put them into the paper; and the important business of making a pin is, in this manner, divided into about eighteen distinct operations, which, in some manufactories, are all performed by distinct hands”

And in this way, vastly more pins can be produced by a few dozen workers than they could manage if they were trying to make them one at a time.

But we should note that, logically prior to the division of labour, there has to first be an operation of standardisation. The “manufactory” as described is producing by reproducing a “standard pin”. The system only works if each step in the process is getting the piece of work-in-progress that it was expecting. And it only produces one kind of pin (or one of a few minor variations, which would need to be considered ahead of time). If you want a special, unique pin then it’s going to need to be manufactured artisanally, in the pre-industrial manner.

The standardised product is, inherently, a compromise design. Like an off-the-peg suit, it won’t fit your needs perfectly, but it will be close enough and a lot cheaper than the bespoke version. And the trade-off is very good; by agreeing that we were going to have things which were not quite as nice but much easier to make, the modern world was made possible. Adam Smith is dead right about that.

But ever since the beginning of the Industrial Age, there have been critics. And those critics might need to be listened to a bit more now that what we are standardising is not just the goods themselves, but the decisions which govern their manufacture. And even the decisions which govern those decision-makers.

Which is what we’re doing. I don’t think this is even a metaphor. A bureaucracy is an industrial decision-making factory. It takes the same raw material, and operates on it by creating a standardised information set, allowing a standardised decision to be made, by the shared operation of a group of knowledge workers, each of whom operates on a particular part of the process in which they specialise.

What could go wrong? The first long-standing critique of industrialisation has been pretty easy to dismiss. It’s the noble but sentimental tradition of John Ruskin and William Morris, to the effect that the design tradeoff of the standardised industrial product is bad for us, or at least that it tends to be made badly. We have ugly and cheap things in our houses, when we could have beautiful handmade things.

Except we couldn’t, of course. And the same thing might be true of decision making in the industrial world. We all want to be treated as individuals and to have an accountable human being that we can speak to, but this might be as unrealistic a luxury demand as a wish to have hand-made cutlery, hand-thrown plates and hand-blown stemware on our tables.

But let’s turn that point around. Mass-produced consumer goods have a quality that we can measure and compare to the artisanal versions. What about mass-produced decisions? When the compromises on quality are being made within the cognitive process itself, what basis do we have to know whether the tradeoff was a good one? How do you know when you are making worse decisions? It’s a problem.

It’s even more of a problem when you consider the related criticism that Ruskin and Morris used to make – that industrialisation tended to degrade and coarsen the workers themselves. When a skilled glassblower becomes the operator of an automated glassmaking machine, something is lost. Again, when you transfer this argument from physical skills of production to managerial skills of decision making, I think it needs to be considered anew, rather than presuming that we can still use the same devastating argument (in summary “oh come off it will you”) that defeated the Arts and Crafts movement versus the Industrial Revolution.

Are we getting worse decisions, and worse people as leaders? And if we think the answer is no, how would we know if we were?




Saturday Morning Breakfast Cereal - Consilience


Click here to go see the bonus panel!

Hovertext:
Later, peace is reestablished when an MBA accidentally enters the lecture hall.



Claude Code Can Debug Low-level Cryptography

Over the past few days I wrote a new Go implementation of ML-DSA, a post-quantum signature algorithm specified by NIST last summer. I livecoded it all over four days, finishing it on Thursday evening. Except… Verify was always rejecting valid signatures.

$ bin/go test crypto/internal/fips140/mldsa
--- FAIL: TestVector (0.00s)
    mldsa_test.go:47: Verify: mldsa: invalid signature
    mldsa_test.go:84: Verify: mldsa: invalid signature
    mldsa_test.go:121: Verify: mldsa: invalid signature
FAIL
FAIL     crypto/internal/fips140/mldsa   2.142s
FAIL

I was exhausted, so I tried debugging for half an hour and then gave up, with the intention of coming back to it the next day with a fresh mind.

On a whim, I figured I would let Claude Code take a shot while I read emails and resurfaced from hyperfocus. I mostly expected it to flail in some maybe-interesting way, or rule out some issues.

Instead, it rapidly figured out a fairly complex low-level bug in my implementation of a relatively novel cryptography algorithm. I am sharing this because it made me realize I still don’t have a good intuition for when to invoke AI tools, and because I think it’s a fantastic case study for anyone who’s still skeptical about their usefulness.

Full disclosure: Anthropic gave me a few months of Claude Max for free. They reached out one day and told me they were giving it away to some open source maintainers. Maybe it’s a ploy to get me hooked so I’ll pay for it when the free coupon expires. Maybe they hoped I’d write something like this. Maybe they are just nice. Anyway, they made no request or suggestion to write anything public about Claude Code. Now you know.

Finding the bug

I started Claude Code v2.0.28 with Opus 4.1 and no system prompts, and gave it the following prompt (typos included):

I implemented ML-DSA in the Go standard library, and it all works except that verification always rejects the signatures. I know the signatures are right because they match the test vector.

YOu can run the tests with “bin/go test crypto/internal/fips140/mldsa”

You can find the code in src/crypto/internal/fips140/mldsa

Look for potential reasons the signatures don’t verify. ultrathink

I spot-checked and w1 is different from the signing one.

To my surprise, it pinged me a few minutes later with a complete fix.

Maybe I shouldn’t be surprised! Maybe it would have been clear to anyone more familiar with AI tools that this was a good AI task: a well-scoped issue with failing tests. On the other hand, this is a low-level issue in a fresh implementation of a complex, relatively novel algorithm.

It figured out that I had merged HighBits and w1Encode into a single function to use it from Sign, and then reused it from Verify, where UseHint already produces the high bits, effectively taking the high bits of w1 twice in Verify.

Looking at the log, it loaded the implementation into the context and then immediately figured it out, without any exploratory tool use! After that it wrote itself a cute little test that reimplemented half of verification to confirm the hypothesis, wrote a mediocre fix, and checked the tests pass.

I threw the fix away and refactored w1Encode to take high bits as input, and changed the type of the high bits, which is both clearer and saves a round-trip through Montgomery representation. Still, this 100% saved me a bunch of debugging time.
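To see why taking the high bits twice breaks verification, here is a deliberately toy illustration, not the actual ML-DSA code: the real Decompose works with centered remainders mod q, but plain integer division is enough to show that the operation is not idempotent. The constant α = 2γ2 = 523776 is the value for the larger ML-DSA parameter sets.

```go
package main

import "fmt"

// A simplified stand-in for ML-DSA's HighBits: the real Decompose uses
// centered remainders mod q, but plain division shows the failure mode.
const alpha = 523776 // 2*gamma2 for the ML-DSA-65/-87 parameter sets

func highBits(r uint32) uint32 { return r / alpha }

func main() {
	w := uint32(4_000_000) // some coefficient of w = A*y
	once := highBits(w)     // what Sign hashes into the challenge
	twice := highBits(once) // what a Verify that re-applies HighBits sees
	// The two disagree, so the recomputed challenge never matches.
	fmt.Println(once, twice, once == twice)
}
```

Since the high bits are small values near zero, a second application collapses almost everything to 0, which is why verification rejected every signature rather than failing occasionally.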

A second synthetic experiment

On Monday, I had also finished implementing signing with failing tests. There were two bugs, which I fixed in the following couple evenings.

The first one was due to somehow computing a couple hardcoded constants (1 and -1 in the Montgomery domain) wrong. It was very hard to find, requiring a lot of deep printfs and guesswork. Took me maybe an hour or two.
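One way to avoid hardcoding such constants is to derive them. This sketch assumes the conventional 32-bit Montgomery form R = 2³² and uses the ML-DSA modulus q = 8380417 from FIPS 204; `montConst` is my own throwaway name, not anything in the standard library.

```go
package main

import "fmt"

// montConst returns x*R mod q, the Montgomery-domain form of x,
// assuming R = 2^32 and the ML-DSA modulus q = 8380417 (FIPS 204).
func montConst(x int64) uint32 {
	const q = 8380417
	r := (x%q + q) % q // reduce into [0, q), handling negative x
	// r < 2^23 and R = 2^32, so r*R < 2^55 fits in int64.
	return uint32(r * (1 << 32) % q)
}

func main() {
	// The two constants from the bug: 1 and -1 in the Montgomery domain.
	fmt.Println(montConst(1), montConst(-1))
}
```

A table of values computed like this at init time (or checked against the hardcoded ones in a test) would have caught the wrong constants immediately.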

The second one was easier: a value that ends up encoded in the signature was too short (32 bits instead of 32 bytes). It was relatively easy to tell because only the first four bytes of the signature were the same, and then the signature lengths were different.

I figured these would be an interesting way to validate Claude’s ability to help find bugs in low-level cryptography code, so I checked out the old version of the change with the bugs (yay Jujutsu!) and kicked off a fresh Claude Code session with this prompt:

I am implementing ML-DSA in the Go standard library, and I just finished implementing signing, but running the tests against a known good test vector it looks like it goes into an infinite loop, probably because it always rejects in the Fiat-Shamir with Aborts loop.

You can run the tests with “bin/go test crypto/internal/fips140/mldsa”

You can find the code in src/crypto/internal/fips140/mldsa

Figure out why it loops forever, and get the tests to pass. ultrathink

It spent some time doing printf debugging and chasing down incorrect values very similarly to how I did it, and then figured out and fixed the wrong constants. It definitely took Claude less time than it took me. Impressive.

It gave up after fixing that bug even though the tests still failed, so I started a fresh session (on the assumption that the context on the wrong constants would do more harm than good investigating an independent bug), and gave it this prompt:

I am implementing ML-DSA in the Go standard library, and I just finished implementing signing, but running the tests against a known good test vector they don’t match.

You can run the tests with “bin/go test crypto/internal/fips140/mldsa”

You can find the code in src/crypto/internal/fips140/mldsa

Figure out what is going on. ultrathink

It took a couple wrong paths, thought for quite a bit longer, and then found this one too. I honestly expected it to fail initially.

It’s interesting how Claude found the “easier” bug more difficult. My guess is that maybe the large random-looking outputs of the failing tests did not play well with its attention.

The fix it proposed was updating only the allocation’s length and not its capacity, but whatever, the point is finding the bug, and I’ll usually want to throw away the fix and rewrite it myself anyway.

Three out of three one-shot debugging hits with no help is extremely impressive. Importantly, there is no need to trust the LLM or review its output when its job is just saving me an hour or two by telling me where the bug is, for me to reason about it and fix it.

As ever, I wish we had better tooling for using LLMs which didn’t look like chat or autocomplete or “make me a PR.” For example, how nice would it be if every time tests fail, an LLM agent was kicked off with the task of figuring out why, and only notified us if it did before we fixed it?

An image of Clippy, the paperclip with eyes from Microsoft Office, with a speech bubble saying 'FYI, your tests are failing because you are taking the HighBits of w1 in w1Encode, but w1 in Verify is already the high bits output of UseHint.'

For more low-level cryptography bugs and implementations, follow me on Bluesky at @filippo.abyssdomain.expert or on Mastodon at @filippo@abyssdomain.expert. I promise I almost never post about AI.

The picture

Enjoy the silliest floof. Surely this will help redeem me in the eyes of folks who consider AI less of a tool and more of something to be hated or loved.

A calico cat lying upside-down on a wooden floor, body curved around a coffee table leg, looking a bit derpy, with a feather toy on a string dangling nearby

My work is made possible by Geomys, an organization of professional Go maintainers, which is funded by Smallstep, Ava Labs, Teleport, Tailscale, and Sentry. Through our retainer contracts they ensure the sustainability and reliability of our open source maintenance work and get a direct line to my expertise and that of the other Geomys maintainers. (Learn more in the Geomys announcement.) Here are a few words from some of them!

Teleport — For the past five years, attacks and compromises have been shifting from traditional malware and security breaches to identifying and compromising valid user accounts and credentials with social engineering, credential theft, or phishing. Teleport Identity is designed to eliminate weak access patterns through access monitoring, minimize attack surface with access requests, and purge unused permissions via mandatory access reviews.

Ava Labs — We at Ava Labs, maintainer of AvalancheGo (the most widely used client for interacting with the Avalanche Network), believe the sustainable maintenance and development of open source cryptographic protocols is critical to the broad adoption of blockchain technology. We are proud to support this necessary and impactful work through our ongoing sponsorship of Filippo and his team.
