Wanted: Advice from CS teachers
-
@aredridel @EricLawton @david_chisnall @maco
I've had so many people say "it knows how to write code now" as if this is somehow ... new and different from generating text. As if there has been some foundational advancement and not just the same tool applied again.
@futurebird @aredridel @EricLawton @david_chisnall @maco They have been improving the models' ability to write code, probably faster than almost any other ability. They can do this through what's called reinforcement learning with verifiable rewards (RLVR), since with code it's possible to verify whether the result is correct or not (whether it compiles, whether it passes a particular test or test suite, etc.)
So while the pre-training is based on just predicting the next token in existing code bases, they can then make it better and better at coding by giving it problems to solve (get this code to compile, fix this bug, implement this feature, etc.), checking whether it succeeded, and applying positive or negative reinforcement based on the result.
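The "verifiable" part of that loop can be sketched roughly like this (a toy illustration, not any lab's actual pipeline; `verifiable_reward` and the candidate snippets are made-up names):

```python
import os
import subprocess
import sys
import tempfile

def verifiable_reward(candidate_code: str, test_code: str) -> float:
    """Run a candidate solution against a test suite and turn the
    outcome into a reward. The reward comes from actually executing
    the code, not from human judgment -- that's what makes it
    'verifiable' and cheap to scale."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "solution.py")
        with open(path, "w") as f:
            f.write(candidate_code + "\n" + test_code + "\n")
        result = subprocess.run([sys.executable, path],
                                capture_output=True, timeout=30)
    # +1 if every assertion passed, -1 if anything failed or crashed
    return 1.0 if result.returncode == 0 else -1.0

# Two model-generated candidates against one fixed, reusable test suite:
good = "def add(a, b):\n    return a + b"
bad = "def add(a, b):\n    return a - b"
tests = "assert add(2, 3) == 5\nassert add(-1, 1) == 0"
print(verifiable_reward(good, tests))  # 1.0
print(verifiable_reward(bad, tests))   # -1.0
```

The same test suite can grade any number of candidates in any language, which is why this kind of training scales so easily.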
And this can scale fairly easily; you can come up with whole classes of problems, like "implement this feature in <language X>" and vary the language while using the same test suite, and now you can train it to write all of those languages better.
So while there are also improvements in the tooling, the models themselves have been getting quite a bit better at both writing correct code on the first try, and also figuring out what went wrong and fixing it when it doesn't work on the first try.
In fact, there are now open weights models (models that you can download and run on your own hardware, though for the biggest ones you really need thousands to tens of thousands of dollars of hardware to run the full model) which are competitive with the top tier closed models from just 6 months ago or so on coding tasks, in large part because of how effective RLVR is.
-
When #teaching a group of students new to coding I've noticed that my students who are normally very good about not calling out during class will shout "it's not working!" the moment their code hits an error and fails to run. They want me to fix it right away. This makes for too many interruptions since I'm easy to nerd snipe in this way.
I think I need to let them know that fixing errors that keep the code from running is literally what I'm trying to teach.
@futurebird I'd respond with a few key questions:
- In what way is it not working?
- Why do you think that is?
- If you can see errors, what do they tell you?
- How can you find out more about what is or is not happening?
And there's the all-important "What are your assumptions, and are they correct?"
-
"Now I'm curious about whether LLMs' code compiles and executes error-free on their first attempt."
At first it did not, but they have added a routine that runs it through a compiler until it at least runs without syntax errors, and probably until it produces output that seems like what you asked for, for a limited set of example inputs.
This is a bolted on extra check, not some improvement in the base LLM.
But some people are acting like it does represent advances in the LLM.
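The kind of bolted-on wrapper being described could look roughly like this (a hypothetical sketch; `generate_code` stands in for the LLM call, and real agent harnesses vary a lot):

```python
import subprocess
import sys
import tempfile

def generate_code(prompt: str) -> str:
    """Stand-in for an LLM call (hypothetical)."""
    return 'print("hello")'

def compile_check_loop(prompt: str, max_attempts: int = 3) -> str:
    """Generate code, try to run it, and feed errors back until it runs.

    This loop lives entirely outside the model: the model just sees
    the error text appended to its next prompt."""
    for _ in range(max_attempts):
        code = generate_code(prompt)
        with tempfile.NamedTemporaryFile("w", suffix=".py",
                                         delete=False) as f:
            f.write(code)
            path = f.name
        result = subprocess.run([sys.executable, path],
                                capture_output=True, text=True)
        if result.returncode == 0:
            return code  # runs without errors; return it
        # failed: show the model its own error and try again
        prompt += "\nThe previous attempt failed with:\n" + result.stderr
    raise RuntimeError(f"no runnable code after {max_attempts} attempts")
```

Note that "runs without errors" is a much weaker property than "is correct," which is the point being made above.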
@futurebird @EricLawton @david_chisnall
there are certain languages (such as C) in which that would be a cruel trick; lots of code containing subtle undefined-behavior bugs that don't show up easily will compile without errors, and in many cases without warnings either. Not all undefined behavior is detectable at compile time. -
Once there are few enough expert human programmers left, the price will go up.
And, if I read you correctly, they don't guarantee output accuracy with respect to input tokens but charge extra to try again.
And if they charge per output token, that is incentive to generate filler, certainly not to optimize.
@EricLawton @maco @aredridel @futurebird @david_chisnall we don't know exactly how much it costs for the closed models; they may be selling at a loss, break even, or a slight profit on inference. But you can tell exactly how much inference costs with open weights models: you can run them on your own hardware and measure the cost of the hardware and power. And there's a competitive landscape of providers offering to run them. And open weights models are only lagging behind the closed models by a few months by now.
If the market consolidates down to only one or two leading players, then yes, it's possible for them to put a squeeze on the market and jack up prices. But right now it's a highly competitive market with very little stickiness; it's very easy to move to a different provider if the one you're using jacks up prices. Right now OpenAI, Anthropic, Google, and xAI are each regularly releasing frontier models that leapfrog each other on various benchmarks, and the Chinese labs are only a few months behind, and generally release open weight models which are much easier to measure and build on top of. There's very little moat right now other than sheer capacity for training and inference.
And I would expect, if we do get a consolidation and squeeze, it would just be by jacking up prices, not by generating too many tokens. Right now inference is highly constrained; those people I work with who use these models regularly hit capacity limitations all the time. These companies can't build out capacity fast enough to meet demand, so if anything they're motivated to make things more efficient right now.
I have a lot of problems with the whole LLM industry, and I feel like in many ways it's being rushed out before we're truly ready for all of the consequences, but it is actually quite in demand right now.
-
@EricLawton There have been 500,000 tech layoffs in the last few years. We've got no shortage of skilled tech knowledge for hire. At the pace we're going, there's no chance of a dwindling supply of programmers in my lifetime.
If you haven't been coding for a few years, you won't be a skilled programmer. It won't take a lifetime to run out of them.
-
@raganwald
The best, most succinct, explanation of the difference here came from @pluralistic:
Coding makes things run well, software engineering makes things fail well.
All meaningful software fails over time as it interacts with the real world and the real world changes, so handling failure cases well is important.
Handling these cases involves expanding one's context window to take into account a lot of different factors.
For LLMs, a linear increase in the context window results in a quadratic increase in processing. And the unit economics of LLMs sucks already without squaring the costs.
Which is why AI, in its current incarnation, is fundamentally not capable of creating good software. (I've heavily paraphrased, so apologies if he reads this.)
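The quadratic-scaling claim can be illustrated with a toy calculation, assuming standard self-attention where every token attends to every other token:

```python
def attention_pairs(context_length: int) -> int:
    """Token-to-token attention scores computed per layer: in standard
    self-attention every token attends to every token, so the work
    grows with the square of the context length."""
    return context_length * context_length

# Doubling the context quadruples the attention work:
print(attention_pairs(4_000))                            # 16000000
print(attention_pairs(8_000))                            # 64000000
print(attention_pairs(8_000) // attention_pairs(4_000))  # 4
```

(There are techniques that reduce this in practice, but the naive cost really does grow with the square of the context.)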
-
Example of the problem:
Me: "OK everyone. Next we'll make this into a function so we can simply call it each time-"
Student 1: "It won't work." (student who wouldn't interrupt like this normally)
Student 2: "Mine's broken too!"
Student 3: "It says error. I have the EXACT same thing as you but it's not working."
This makes me feel overloaded and grouchy. Too many questions at once. What I want them to do is wait until the explanation is done and ask when I'm walking around. #CSEdu
@futurebird Wait until you teach them the "let it crash" philosophy of software engineering.
-
@futurebird one recommendation - one rule that worked when I was learning programming and my teacher didn't like when I interrupted her - if you've got an issue because you're ahead or behind others, wait till the teacher is available. Till then, muck around, debug, try random things.
-
So Your Code Won't Run
1. There *is* an error in your code. It's probably just a typo. You can find it by looking for it in a calm, systematic way.
2. The error will make sense. It's not random. The computer does not "just hate you"
3. Read the error message. The error message *tries* to help you, but it's just a computer so YOUR HUMAN INTELLIGENCE may be needed to find the real source of error.
4. Every programmer makes errors. Great programmers can find and fix them.
1/
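A toy Python example of point 3 (names made up for illustration): the error message points at the line where the failure *surfaced*, but a human has to trace it back to the real mistake.

```python
def average(numbers):
    """Average a list of numbers."""
    total = 0
    for n in numbers:
        total += n
    return total / len(numbers)   # <- the traceback points HERE...

print(average([80, 90, 100]))     # 90.0 -- works fine

scores = []                       # <- ...but the real mistake is HERE:
try:
    average(scores)               # passing an empty list
except ZeroDivisionError as e:
    print("ZeroDivisionError:", e)  # division by zero
```

The message "division by zero" is accurate and helpful, but the fix isn't on the line it names; it's in the data that was passed in.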
@futurebird
> 2. The error will make sense. It's not random. The computer does not "just hate you"
learning to have a constant faith in this has gotten me through so much shit that might otherwise have caused me to physically break something and give up forever
psychologically it's like "if you keep the spear pointed at the horse you will be safer than if you broke rank and ran" - you know logically that is what it is but every second of it is screaming at you to ignore that understanding and in the end what you train for will win out -
Yeah...
what I'm trying to convey is that there is a *reason* why the code isn't working and it will make sense in the context of the rules the got dang computer is trying to follow.
It might be annoying or silly, but it will "make sense"
@futurebird @mansr constantly grumbling the whole time you're fixing the problem about the idiots who design $THING like that can be a helpful coping mechanism for some -
@futurebird @mansr ...this just goes back to my whole thing about if maybe younger people have more learned helplessness about everything because more of their lives is dictated by arbitrary rules imposed on them by [EDIT: the invisible, untouchable people in some office somewhere who dictate] their cultural environment rather than the non-arbitrary rules of the physical world
no matter how dumb the rules of a sportball game get, the ball *must* move in certain ways in response to certain actions
that's not the case in a video game -
@raganwald @futurebird @EricLawton @david_chisnall I suppose there's something to be said for figuring out which parts of the received wisdom (built up by years of collective experience) are still valid....but there are better ways to do that than throwing it all out! (And I doubt that's their motivation anyway.)
-
@futurebird assigning code broken in specific ways & having a rubric for teaching the troubleshooting sounds like it should be SOP for coding courses, is this not normally part of the curriculum?
(def not dumping on you, asking as an Old who is a self-taught potato coder who never did a CS degree & feels like the way I learned basically anything that I do know was: type it in from a magazine or other source / modify working code that’s similar to what I need -> make mistakes in transcription / tweaks -> code doesn’t run or runs with errors -> troubleshoot the mistakes -> learn stuff
)
@itgrrl @futurebird i never saw that in high school in the 90s... -
At the university we had this maybe once.
But then, to quote a professor: "You are learning 'computer science' here. 'Programming' is something that you should either already know or learn in your free time."
@wakame @voltagex @itgrrl @futurebird [vague memory of a passage in solzhenitsyn about "engineers" and people who've never had to lay a brick] -
@futurebird
I know this from people I taught programming. And I think the main problem is that the computer is judging you. In a way.
This can come in two forms:
a) The program fails to run, shows you an error, etc.
b) The IDE adds an error or warning to a line saying: This is wrong.
So there is "objective proof" right there on the screen that you "are a failure". This is not some other person saying it, this is a piece of technology.
This is also something I hate from a usability/user experience perspective.
The computer doesn't say: "Sorry, I don't understand what you mean with that line."
It says: "This line can not be processed because the user is dumb." (Not quite; I'm overemphasizing.)
When talking about critique or blame, there is this typical antipattern: "Everybody uses a fork."
No, they don't. I use a fork, I want you to use a fork, but instead of saying that, I invoke a mystical "everybody".
@wakame @futurebird my immediate instinct is to object that these error messages are about the input, not the person sending the input, but making it not personal / not making it personal is also one of those important skills that everyone used to assume everyone had and no one taught and now no one has -
@futurebird
I totally cried when I was 14 and I thought in my naivety that I knew almost everything and then a simple program failed.
[Edit: And seriously: I think it is hard to understand, when the voice from god tells you that there is an error in line 32, that this could itself be somehow wrong.
I mean, this is a computer, right? It doesn't make mistakes.
Maybe emphasizing that the IDE and the compiler and everything else was written by humans and that they discover bugs in those programs all the time could help.]
@wakame @futurebird
> the voice from god
i rarely had this problem and i also could never understand what people at church and elsewhere were talking about when they talked about feeling the presence of god or whatever
i just thought of it as pure cause and effect, like
you're rolling a toy car down a track
the track has a snag in it you can't see
the toy gets derailed and hits the floor
you don't look at the floor for the snag -
@wakame @futurebird (not that i don't make the mistake of checking everything from lines 8 through 64 after an error on line 32 without looking up to line 4, but that's more just lazily assuming that past me must've gotten "the basic stuff" right and any error must've been further down) -
@flipper @raganwald @pluralistic @futurebird @EricLawton @david_chisnall I really hope it's a) true and b) stays like that
-
@wakame @futurebird so far in this thread it seems that to teach someone how to program a computer they must first learn
- conflict management and de-escalation skills
- theory of mind
- rationalist epistemology
- emotional self-discipline
- scientific method (controlled testing)
- the art of doing things one thing at a time (and figuring out what "one" "thing" is when it might not be self-evident)
... -
@futurebird @wakame conclusion: programming is a martial art