Notes from my appearance on the Software Misadventures Podcast

I was a guest on Ronak Nathani and Guang Yang's Software Misadventures Podcast, which interviews seasoned software engineers about their careers so far and their misadventures along the way. Here's the episode: LLMs are like your weird, over-confident intern | Simon Willison (Datasette).

You can get the audio version on Overcast, on Apple Podcasts or on Spotify - or you can watch the video version on YouTube.

I ran the video through MacWhisper to get a transcript, then spent some time editing out my own favourite quotes, trying to focus on things I haven't written about previously on this blog.

Having a blog

23:15

There's something wholesome about having a little corner of the internet just for you.

It feels a little bit subversive as well in this day and age, with all of these giant walled platforms, and you're like, "Yeah, no, I've got a domain name and I'm running a web app."

It used to be that 10, 15 years ago, everyone's intro to web development was building your own blog system. I don't think people do that anymore.

That's really sad because it's such a good project - you get to learn databases and HTML and URL design and SEO and all of these different skills.

Aligning LLMs with your own expertise

37:10

As an experienced software engineer, I can get great code from LLMs because I've got that expertise in what kind of questions to ask. I can spot when it makes mistakes very quickly. I know how to test the things it's giving me.

Occasionally I'll ask it legal questions - I'll paste in terms of service and ask, "Is there anything in here that looks a bit dodgy?"

I know for a fact that this is a terrible idea because I have no legal knowledge! I'm sort of play-acting with it and nodding along, but I would never make a life-altering decision based on legal advice I got from an LLM, because I'm not a lawyer.

If I was a lawyer, I'd use them all the time because I'd be able to fall back on my actual expertise to make sure that I'm using them responsibly.

The usability of LLM chat interfaces

40:30

It's like taking a brand new computer user and dumping them in front of a Linux machine with a terminal prompt and saying, "There you go, figure it out."

It's an absolute joke that we've got this incredibly sophisticated software and we've given it a command line interface and launched it to a hundred million people.

Benefits for people with English as a second language

41:53

For people who don't speak English or have English as a second language, this stuff is incredible.

We live in a society where having really good spoken and written English puts you at a huge advantage.

The street light outside your house is broken and you need to write a letter to the council to get it fixed? That used to be a significant barrier.

It's not anymore. ChatGPT will write a formal letter to the council complaining about a broken street light that is absolutely flawless.

And you can prompt it in any language. I'm so excited about that.

Interestingly, it sort of breaks aspects of society as well - because we've been using written English skills as a filter for so many different things.

If you want to get into university, you have to write formal letters and all of that kind of stuff, which used to keep people out.

Now it doesn't anymore, which I think is thrilling... but at the same time, if you've got institutions that are designed around the idea that you can evaluate everyone and filter them based on written essays, and now you can't, we've got to redesign those institutions.

That's going to take a while. What does that even look like? It's so disruptive to society in all of these different ways.

Are we all going to lose our jobs?

46:39

As a professional programmer, there's an aspect where you ask, OK, does this mean that our jobs are all gonna dry up?

I don't think the jobs dry up. I think more companies start commissioning custom software because the cost of developing custom software goes down, which I think increases the demand for engineers who know what they're doing.

But I'm not an economist. Maybe this is the death knell for six figure programmer salaries and we're gonna end up working for peanuts?

[... later 1:32:12 ...]

Every now and then you hear a story of a company who got software built for them, and it turns out it was the boss's cousin, who's like a 15-year-old who's good with computers, and they built software, and it's garbage.

Maybe we've just given everyone in the world the overconfident 15-year-old cousin who's gonna claim to be able to build something, and build them something that maybe kind of works.

And maybe society's okay with that?

This is why I don't feel threatened as a senior engineer, because I know that if you sit down somebody who doesn't know how to program with an LLM, and you sit me with an LLM, and ask us to build the same thing, I will build better software than they will.

Hopefully market forces come into play, and the demand is there for software that actually works, and is fast and reliable.

And so people who can build software that's fast and reliable, often with LLM assistance, used responsibly, benefit from that.

Prompt engineering and evals

54:08

For me, prompt engineering is about figuring out things like - for a SQL query - we need to send the full schema and we need to send these three example responses.

That's engineering. It's complicated.

The hardest part of prompt engineering is evaluating. Figuring out, of these two prompts, which one is better?

I still don't have a great way of doing that myself.

The people who are doing the most sophisticated development on top of LLMs are all about evals. They've got really sophisticated ways of evaluating their prompts.
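
To make that concrete, here's a minimal sketch of the kind of prompt assembly described above - sending the full schema plus a few worked examples before the actual question. The schema, example pairs and wording are illustrative assumptions, not Simon's actual prompts or anything from Datasette:

```python
# Minimal, illustrative sketch: build a text-to-SQL prompt that includes
# the full schema and a few example question/SQL pairs. The schema and
# examples below are made up for demonstration purposes.

def build_sql_prompt(schema: str, examples: list[tuple[str, str]], question: str) -> str:
    parts = [
        "You translate questions into SQLite SQL.",
        "Database schema:",
        schema,
    ]
    # Few-shot examples: each shows a question and the SQL we expect back
    for q, sql in examples:
        parts.append(f"Question: {q}\nSQL: {sql}")
    # Finally the real question, leaving the SQL for the model to complete
    parts.append(f"Question: {question}\nSQL:")
    return "\n\n".join(parts)


prompt = build_sql_prompt(
    schema="CREATE TABLE plants (id INTEGER PRIMARY KEY, name TEXT, height_cm REAL);",
    examples=[
        ("How many plants are there?", "SELECT count(*) FROM plants;"),
        ("What is the tallest plant?",
         "SELECT name FROM plants ORDER BY height_cm DESC LIMIT 1;"),
    ],
    question="What is the average plant height?",
)
print(prompt)
```

A crude way to compare two prompt variants is to run each against a held-out set of question/SQL pairs and count how often the generated query returns the same rows as the reference query - a very basic version of the evals mentioned above.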

Letting skills atrophy

1:26:12

We talked about the risk of learned helplessness, and letting our skills atrophy by outsourcing so much of our work to LLMs.

The other day I reported a bug against GitHub Actions complaining that the windows-latest version of Python couldn't load SQLite extensions.

Then after I'd filed the bug, I realized that I'd got Claude to write my test code and it had hallucinated the wrong SQLite code for loading an extension!

I had to close that bug and say, no, sorry, this was my fault.

That was a bit embarrassing. I should know better than most people that you have to check everything these things do, and it had caught me out. Python and SQLite are my bread and butter. I really should have caught that one!
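
For what it's worth, the standard-library pattern for loading a SQLite extension from Python is only a couple of calls. This is a minimal sketch with a placeholder extension path, and whether it works at all depends on the interpreter having been compiled with extension loading support:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# enable_load_extension() only exists if this Python build was compiled
# with SQLite extension loading support - worth checking before blaming
# the platform.
conn.enable_load_extension(True)

# "./spellfix" is a placeholder path to a compiled extension module,
# not the extension from the original bug report.
conn.load_extension("./spellfix")

# Turn extension loading back off once the extension is in place.
conn.enable_load_extension(False)
```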

But my counter to this is that I feel like my overall capabilities are expanding so quickly. I can get so much more stuff done that I'm willing to pay with a little bit of my soul.

I'm willing to accept a little bit of atrophying in some of my abilities in exchange for, honestly, a two to five X productivity boost on the time that I spend typing code into a computer.

That's like 10% of my job, so it's not like I'm two to five times more productive overall. But it's still a material improvement.

It's making me more ambitious. I'm writing software I would never have even dared to write before. So I think that's worth the risk.

Imitation intelligence

1:53:35

I feel like artificial intelligence has all of these science fiction ideas around it. People will get into heated debates about whether this is artificial intelligence at all.

I've been thinking about it in terms of imitation intelligence, because everything these models do is effectively imitating something that they saw in their training data.

And that actually really helps you form a mental model of what they can do and why they're useful. It means that you can think, "Okay, if the training data has shown it how to do this thing, it can probably help me with this thing."

If you want to cure cancer, the training data doesn't know how to cure cancer. It's not gonna come up with a novel cure for cancer just out of nothing.

The weird intern

I've used the weird intern analogy a few times before. Here's the version Ronak and Guang extracted as the trailer for our episode:

1:18:00

I call it my weird intern. I'll say to my wife, Natalie, sometimes, "Hey, so I got my weird intern to do this." And that works, right?

It's a good mental model for these things as well, because it's like having an intern who has read all of the documentation and memorized the documentation for every programming language, and is a wild conspiracy theorist, and sometimes comes up with absurd ideas, and they're massively overconfident.

It's the intern that always believes that they're right. But it's an intern who you can, I hate to say it, you can kind of bully them.

You can be like, "Do it again, do that again." "No, that's wrong." And you don't have to feel guilty about it, which is great!

Or one of my favorite prompts is you just say, "Do better." And it works. It's the craziest thing. It'll write some code, you say, "Do better." And it goes, "Oh, I'm sorry, I should..."

And then it will churn out better code, which is so stupid that that's how this technology works. But it's kind of fun.

Tags: blogging, podcasts, ai, prompt-engineering, generative-ai, llms

My podcast with Dan Faggella

Dan Faggella recorded an unusual podcast with me that’s now online. He introduces me as a “quantum physicist,” which is something that I never call myself (I’m a theoretical computer scientist) but have sort of given up on not being called by others. But the ensuing 85-minute conversation has virtually nothing to do with physics, or anything technical at all.

Instead, Dan pretty much exclusively wants to talk about moral philosophy: my views about what kind of AI, if any, would be a “worthy successor to humanity,” and how AIs should treat humans and vice versa, and whether there’s any objective morality at all, and (at the very end) what principles ought to guide government regulation of AI.

So, I inveigh against “meat chauvinism,” and expand on the view that locates human specialness (such as it is) in what might be the unclonability, unpredictability, and unrewindability of our minds, and plead for comity among the warring camps of AI safetyists.

The central point of disagreement between me and Dan ended up centering around moral realism: Dan kept wanting to say that a future AGI’s moral values would probably be as incomprehensible to us as are ours to a sea snail, and that we need to make peace with that. I replied that, firstly, things like the Golden Rule strike me as plausible candidates for moral universals, which all thriving civilizations (however primitive or advanced) will agree about in the same way they agree about 5 being a prime number. And secondly, that if that isn’t true—if the morality of our AI or cyborg descendants really will be utterly alien to us—then I find it hard to have any preferences at all about the future they’ll inhabit, and just want to enjoy life while I can! That which (by assumption) I can’t understand, I’m not going to issue moral judgments about either.

Anyway, rewatching the episode, I was unpleasantly surprised by my many verbal infelicities, my constant rocking side-to-side in my chair, my sometimes talking over Dan in my enthusiasm, etc. etc., but also pleasantly surprised by the content of what I said, all of which I still stand by despite the terrifying moral minefields into which Dan invited me. I strongly recommend watching at 2x speed, which will minimize the infelicities and make me sound smarter. Thanks so much to Dan for making this happen, and let me know what you think!

Added: See here for other podcasts in the same series and on the same set of questions, including with Nick Bostrom, Ben Goertzel, Dan Hendrycks, Anders Sandberg, and Richard Sutton.

Speed matters

Jamie Brandon in 2021, talking about the importance of optimizing for the speed at which you can work as a developer:

Being 10x faster also changes the kinds of projects that are worth doing.

Last year I spent something like 100 hours writing a text editor. […] If I was 10x slower it would have been 20-50 weeks. Suddenly that doesn't seem like such a good deal any more - what a waste of a year!

It’s not just about speed of writing code:

When I think about speed I think about the whole process - researching, planning, designing, arguing, coding, testing, debugging, documenting etc.

Often when I try to convince someone to get faster at one of those steps, they'll argue that the others are more important so it's not worthwhile trying to be faster. Eg choosing the right idea is more important than coding the wrong idea really quickly.

But that's totally conditional on the speed of everything else! If you could code 10x as fast then you could try out 10 different ideas in the time it would previously have taken to try out 1 idea. Or you could just try out 1 idea, but have 90% of your previous coding time available as extra idea time.

Jamie’s model here helps explain the effect I described in AI-enhanced development makes me more ambitious with my projects. Prompting an LLM to write portions of my code for me gives me that 5-10x boost in the time I spend typing code into a computer, which has a big effect on my ambitions despite being only about 10% of the activities I perform when building software.

I also increasingly lean on LLMs as assistants in the research phase - exploring library options, building experimental prototypes - and for activities like writing tests and even a little bit of documentation.

Via Reilly Wood

Tags: ai-assisted-programming, llms, ai, generative-ai

We or They ?

Like most academics these days, I spend a lot of time filling in online forms. Mostly this is just an annoyance, but occasionally I get something out of it. A recent survey, in which the higher-ups tried to get an idea of how the workforce was feeling, asked the question “Do you think of the University as We or They?”

Unsurprisingly given my reference to “higher-ups”, my answer was “They”. But giving the answer reminded me that, not so long ago, it would have been “We”. In its idealized form, a university was a self-governing community, with a well-understood teaching and research mission (which did not require a Mission Statement). All but the most senior management jobs were done by academics taking turns before returning to their real jobs. Administrative staff did essential work, largely independently, but didn’t conceive themselves as part of management.

The reality was inevitably less egalitarian and communitarian than this picture suggests, in all sorts of ways. Senior professors had too much power and inevitably, some of them abused it. And, given the times, lots of bad behaviour was tolerated that would not be now.

For good and ill, this has all been swept away, at least in Australia. Multiple layers of management are filled by people who have either left the academic life behind them or were never part of it. The university, in this view, is not a community but a business enterprise, even if its ownership structure is rather opaque.

The reality is that of an ordinary workplace in which, most of the time, the interests of bosses and workers are in conflict (though, as in any workplace, there is a shared interest in the survival of the business). Senior managers see themselves as such and compare themselves to their corporate peers. Administrative job titles are those of the corporate sector (Chief Financial Officer and so on).

Yet, as the question implies, there is still a feeling that the university should be a We, and not merely in the sense of workers being willing to sing the company song. My own version of this is to think of the current regime as temporary occupiers, from whom We will be liberated in due course. But others may take a more positive view – I’d be interested in comments.

Quoting Andrej Karpathy

It's a bit sad and confusing that LLMs ("Large Language Models") have little to do with language; it's just historical. They are highly general purpose technology for statistical modeling of token streams. A better name would be Autoregressive Transformers or something.

They don't care if the tokens happen to represent little text chunks. It could just as well be little image patches, audio chunks, action choices, molecules, or whatever. If you can reduce your problem to that of modeling token streams (for any arbitrary vocabulary of some set of discrete tokens), you can "throw an LLM at it".

Andrej Karpathy

Tags: andrej-karpathy, llms, ai, generative-ai

Quoting Pamela McCorduck, in 1979

There is superstition about creativity, and for that matter, about thinking in every sense, and it's part of the history of the field of artificial intelligence that every time somebody figured out how to make a computer do something - play good checkers, solve simple but relatively informal problems - there was a chorus of critics to say, but that's not thinking.

Pamela McCorduck, in 1979

Tags: ai
