  • Unlike the dotcom bubble, another big aspect of this one is the unit cost of running the models.

    Traditional web applications scale really well. The incremental cost of adding a new user to your app is basically nothing. Fractions of a cent. With LLMs, scaling is linear. Each machine can only handle a few hundred users and they’re expensive to run:

    Big beefy GPUs are needed for inference as well as training, and they need a lot of VRAM. Your typical home gaming GPU might have 16 GB of VRAM, or 32 GB if you go high end and spend $2,500 on it (just the GPU, not the whole PC). Frontier models need something like 128 GB of VRAM to run, and GPUs manufactured for data centre use cost far more: a state-of-the-art Nvidia H200 is around $32k. The servers that can host one of these big frontier models cost, at best, $20 an hour to run and can only handle a handful of concurrent user requests, so you have to scale linearly as your subscriber count grows. If you're charging $20 a month for access to your model, you are burning a user's entire monthly subscription every hour for each of these monster servers you have switched on. And that's generous, because it assumes you're not paying the "on-demand" price of $60/hr.
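    To put rough numbers on it, here's a back-of-the-envelope sketch in Python. The prices are the ones quoted above; the concurrency figure is just an assumption standing in for "a handful of requests":

    ```python
    # Back-of-the-envelope unit economics, using the illustrative numbers above.
    server_cost_per_hour = 20.0       # reserved price for a frontier-model server ($/hr)
    on_demand_per_hour = 60.0         # on-demand price for the same server ($/hr)
    concurrent_users_per_server = 10  # assumed: "a handful" of simultaneous requests
    subscription_per_month = 20.0     # typical consumer subscription ($/mo)

    # Cost of keeping one server switched on around the clock (~730 hours/month)
    monthly_server_cost = server_cost_per_hour * 730
    print(f"Monthly cost per server:    ${monthly_server_cost:,.0f}")   # ~$14,600

    # Revenue from the users that one server can actually serve at once
    monthly_revenue_per_server = concurrent_users_per_server * subscription_per_month
    print(f"Monthly revenue per server: ${monthly_revenue_per_server:,.0f}")  # $200

    # Subscriptions burned per server-hour (1.0 at reserved prices, 3.0 on-demand)
    print(f"Subscriptions burned per hour (reserved):  {server_cost_per_hour / subscription_per_month:.1f}")
    print(f"Subscriptions burned per hour (on-demand): {on_demand_per_hour / subscription_per_month:.1f}")
    ```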

    Sam Altman famously said OpenAI are losing money on their $200/mo subscriptions.

    If/when there is a market correction, a huge factor in how much interest continues (like with the internet after dotcom) will be whether the quality of output from these models justifies the true, unsubsidised price of running them. I do think local models powered by things like llama.cpp and Ollama, which can run on high-end gaming rigs and MacBooks, might be a possible direction. At the moment, though, you can't get the same quality out of these small, local LLMs as you can from the state-of-the-art models.
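    For a sense of what "local" looks like in practice, here's a minimal sketch using the Ollama Python client. It assumes the Ollama server is running on your machine and that the model tag used here (just an example) has already been pulled:

    ```python
    # Minimal sketch: querying a locally hosted model via the ollama Python client.
    # Assumes the Ollama server is running and an example model tag "llama3" has
    # already been downloaded (e.g. via `ollama pull llama3`).
    import ollama

    response = ollama.chat(
        model="llama3",  # swap in whatever model tag you have pulled locally
        messages=[{"role": "user", "content": "Why is LLM inference so expensive to run?"}],
    )
    print(response["message"]["content"])
    ```

    No per-token API bill, but the quality gap with the hosted frontier models is still very real.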






  • 100% agreed. We seriously need to normalise compassion for colleagues and push back on overwork and entitled clients. I've been a senior manager in IT for a decade, and whether one of my reports' kids is sick or someone just needs half a day out due to a migraine, I'll let them do what they need to do. Turns out if you treat your colleagues like actual fucking human beings, they appreciate you and often return the favour by working late or going above and beyond somewhere else once the crisis has been dealt with.

    Some of the horror stories in this thread of people phoning into Zoom calls from the doctor's office are insane. The world is not going to end just because Jerry couldn't join this morning's scrum or you had to move a call with a client.

    “Oh, but if underling #148 takes the afternoon off sick we won't hit our weekly sales goal.” You know what, Karen: firstly, we sell receipt-processing software, we are not saving any lives here, get a fucking grip. Secondly, if you deny them that time off they will be doing the bare minimum for the next quarter while they job hunt, and inevitably leave.

    I truly resent being forced to take part in this ridiculous system, but for as long as I have to, I'm going to have the decency to try to protect my team as best I can.