Fascinating article, if only because, AI or not, for the first time in a long time I'm seeing a 4-year-old machine wheezing under the relentless weight of nonstop Teams/Zoom calls, screen sharing, and various pieces of software pulling tens of thousands of price updates every second. Can't recall the last time I saw a solid [$3.5k] machine struggle after only 4 years!!
Great read on the PC's comeback tour - who knew those old boxes had an encore in them? Thanks for the tech-savvy crystal ball! Keep those insights coming!
I want to see a side-by-side video of a PC and an "AI PC" running, with someone explaining and demoing the difference, not just jargon. Searching Best Buy does not turn up anything significant. So far it just looks like a hype term. Where are the actual products?
I think it's still something of an open question whether or not there's a material advantage to running LLMs locally, but my initial thought here would be no for a few reasons.
First, it's a computationally and data-intensive workload with utterly trivial network requirements. The query/response is simple plain text, with the data requirements being the model itself and whatever supporting data may be desirable to flesh out the results or enable further computation. Why shell out top dollar to compete with tech-company wallets for constrained hardware supply when you can pay a subscription fee? Other than the privacy of your query/response, you're not gaining much here and paying a lot of money for it.
Second, running LLM computations on the edge means moving the models themselves to the edge. We're still in the early days, so the models/weights are quite valuable and expensive to produce. If there's very little practical incentive to run the models locally to begin with (see above), why risk someone taking the model and running off with it when you can instead just host the model in the cloud?
Third, we're still largely in the "hope" phase of AI/LLMs. There's a lot of hope that this will generate real economic return commensurate with the cost, but I'm not sure to what extent this return has been realized. In my experience it's a net-productivity improvement, but for newer or more junior people I can almost see that reversing since they're not in as good of a position to judge when they're getting a bad answer.
TL;DR: I think we're still in the "hope" phase of the technology itself, with a lot of hype and still a mixed bag on real-world return. Added to that, the incentives to push this compute to the edge seem weak at best, and there's likely more incentive to keep it off the edge for a while from a data-value perspective. There might be a time for this trade, but I think you're early.
Well, it hasn't even been a year yet and I'm wrong on all points. I was surprised to see NPUs that could run smaller versions of these models get to market so quickly, but Microsoft has announced hardware that can handle it and software to utilize it. It looks like inference, at least with some trimmed-down models, may start moving to the edge. It remains to be seen what the customer uptake will be, but given that the chips they're using are good all-round hardware even without the NPUs, customers may start getting them into their hands even if they weren't interested in the LLM features themselves.
On the usage front, from what I understand there's genuine buying going on here. Companies seem willing to shell out the cash to give it a shot. While it's still early to say how much of that will stick, there's a lot more reason now to think that this is going to drive real economic value and not just be a new toy.
Well, there's Llama - the open-source AI models that are catching up to the closed LLMs and being used for highly specific training cases. Would that be a reason to ramp compute at the edge?
I'm an AI skeptic, but my 4-year-old workstation's Xeon chips and once-overabundant DRAM [32GB] are starting to groan under the weight of ever heavier software and higher interactive loads.
Could the replacement cycle boom - AI or no? THANK YOU FOR ANY THOUGHTS!
Well, arguably AI is a better marketing spin than "gaming"...
To support the "marketing spin" angle, let me share some personal understanding here. It appears that even a currently available consumer-grade pc like the M1 16GB MacBook Air is not bad at running the Mistral 7B model locally (e.g., using @LMStudioAI https://lmstudio.ai). See the experiences shared in these tweets: https://twitter.com/chakkaradeep/status/1727574808240824455 and https://twitter.com/skirano/status/1727453513670709721 .
Mistral 7B is an open-source model of similar scale to a trimmed-down Llama 2 (an alternative to ChatGPT) that can fit on an ordinary PC for inferencing (i.e., answering users' questions, which is far, far less demanding than training an LLM). IMAO, the current development of such open-source LLMs is progressing at lightning speed. Those who know how this works don't need to upgrade their current PCs to do fantastic things.
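To make that concrete, here's a minimal sketch of local inference through LM Studio's OpenAI-compatible local server - a rough illustration only, assuming the server is running on its default localhost port; the model identifier and prompt below are placeholders for whatever you actually load:

from openai import OpenAI

# Point the standard OpenAI client at LM Studio's local server instead of the cloud.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

# Ask the locally loaded model a question; no data leaves the machine.
response = client.chat.completions.create(
    model="mistral-7b-instruct",  # placeholder: use whatever model is loaded in LM Studio
    messages=[{"role": "user", "content": "Explain why local inference is far cheaper than training."}],
    temperature=0.7,
)
print(response.choices[0].message.content)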
But for ordinary consumers, there might be FOMO if they don't follow others in upgrading their PCs when everyone is talking about AI. At the end of the day, perhaps behaviors move stock prices more than fundamentals do.
Just out now: the open-source "new Mistral 8x7B model outperforming gpt-3.5 and llama2 70B."
https://twitter.com/jerryjliu0/status/1734388929003139175
Great note, thanks!
Totally agree. Was this shift in perception to the edge just based on AMD's comments? What will a $3,000 AI PC even provide?
AI is like the "oh boy" moment when I could crunch a spreadsheet on my PC. Did nothing for productivity. And costs are a lot lower, considering that first computer might have cost you a grand in 1990s dollars. AI is going to start a data stampede. Ask Bing a question and it goes to the same sources you already knew. It's faster (and less critical). Ultimately, are you better at picking stocks (or race horses - James Quinn wrote a book on that)? So if you thought information overload was a problem, get ready. Bing will help you sort through the data? (See the lack of critical parsing of sources.) Same old garbage-in, garbage-out admonitions. And Bill Gates will get rich all over again.
One day during our conversation, my daughter said, "What will someone do with software without hardware? The cloud is a concept that involves data servers constructed with chips - even for AI, virtualization, etc."
This is huge, something I’ve written about before over at Seeking Alpha. I’d recommend checking out some of Dell’s investor calls where executives spell out in detail a lot of the hardware greenfield in front of the company. Cheers!
I went to Dell.com and searched on "AI PC" and got a lot of ordinary PCs. Ask Dell to show you one instead of just talking about it.
Um, well, yes, you probably won’t get far searching for an AI PC given that no PC maker has marketed an AI PC (as if that would be a thing). An AI capable PC (big difference) would be equipped to handle the computing workload required for on-device processing (aka, edge computing). And please, don’t bother visiting the Dell website and searching for ‘AI capable PC.’
Ah, but you can at Intel! ;-) https://www.intel.com/content/www/us/en/products/docs/processors/core-ultra/ai-pc.html
Very interesting!
"would be" - future tense. Just pointing out that they do not exist. No point at getting excited until they exist.
"local large language models",
aka "small large language models"
aka anyone else see the contradiction?
Could narrower data sets eliminate the hallucination effects and improve on the corrosion of large public data training?
Perhaps retrieval-augmented generation (RAG) is a way to mitigate hallucination.
See some offerings by solution providers like these:
https://twitter.com/Gradient_AI_/status/1729189923809554723
https://twitter.com/ecardenas300/status/1713577279975051429
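For the curious, here's a minimal sketch of the RAG idea in plain Python, with TF-IDF retrieval standing in for a real vector store (the documents and query are made up; the retrieved snippets would be prepended to whatever LLM you're using, local or hosted):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A toy "knowledge base"; in practice this would be your own curated documents.
documents = [
    "Mistral 7B can run inference on a consumer laptop when quantized.",
    "Training a frontier LLM requires thousands of GPUs running for weeks.",
    "NPUs accelerate on-device inference for smaller, trimmed-down models.",
]

def retrieve(query, k=2):
    """Return the k documents most similar to the query (TF-IDF cosine similarity)."""
    matrix = TfidfVectorizer().fit_transform(documents + [query])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    return [documents[i] for i in scores.argsort()[::-1][:k]]

query = "What does it take to run an LLM locally?"
context = "\n".join(retrieve(query))

# Grounding the prompt in retrieved text is what curbs hallucination:
# the model is asked to answer from supplied snippets, not from memory alone.
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)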
Quite unlikely, for a few reasons. One is that the *size* of the training sets is what gives it the illusion of intelligence. For another, if we already know how to reduce the data set to "eliminate the hallucination effect," then what the heck do we need the LLM for?
Yes, it appears contradictory. To me:
- large language models = hundreds of billions to trillions of parameters
- small or local large language models = billions to tens of billions of parameters
I.e., much smaller but still large.
Alas, you *really* don't understand the scale of resources needed for these LLMs
Care to elaborate? Resources in terms of RAM, GPU, and memory bandwidth?
All of the above by many orders of magnitude.
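A rough back-of-envelope comparison (illustrative numbers only) of the memory needed just to hold the weights, which is why a quantized 7B model fits on a laptop while frontier-scale models do not:

def weight_memory_gb(params_billions, bytes_per_param):
    """Memory (GB) to hold the weights alone, ignoring KV cache and activations."""
    return params_billions * bytes_per_param  # billions of params x bytes each = GB

print(weight_memory_gb(7, 0.5))    # 7B model, 4-bit quantized  -> ~3.5 GB (laptop territory)
print(weight_memory_gb(70, 2))     # 70B model at fp16          -> ~140 GB (multi-GPU server)
print(weight_memory_gb(1000, 2))   # ~1T-parameter model, fp16  -> ~2,000 GB (data-center cluster)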