I’m looking at people running DeepSeek’s 671B R1 locally off NVMe drives at about 2 tokens a second. So why not skip the flash and give me a 1TB drive with an NVMe controller and only DRAM behind it? The total transistor count should be lower on the silicon side. It would be slower than typical system memory, but as fast as any NVMe drive serving reads out of its DRAM cache. The controller logic for safe page writes in flash, plus the internal boost circuitry for pulsing each page, is around the same complexity as the memory management and constant refresh that DRAM needs, along with its power stability requirements. Heck, DDR5 and up already puts power regulation on the memory module itself, IIRC. Anyone know why this is not a thing, or where to get this thing?
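
For a rough sense of where a number like 2 tokens a second comes from, here is a back-of-envelope sketch. It assumes the worst case, where every active weight has to be streamed from the drive once per token, with nothing cached in system RAM. The function name and all example figures are my own assumptions for illustration (roughly 37B active parameters per token for a mixture-of-experts model like R1, 4-bit quantization, and about 7 GB/s of sequential read bandwidth), not measurements.

```python
def tokens_per_second(bandwidth_gb_s: float,
                      active_params_billions: float,
                      bytes_per_param: float) -> float:
    """Upper bound on tokens/s if all active weights are read from the drive per token.

    Ignores compute time, latency, and any caching in system RAM.
    """
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    return (bandwidth_gb_s * 1e9) / bytes_per_token


if __name__ == "__main__":
    # Assumed example figures, not measurements:
    # 7 GB/s drive, 37B active parameters, 0.5 bytes per parameter (4-bit).
    estimate = tokens_per_second(7.0, 37.0, 0.5)
    print(f"{estimate:.2f} tokens/s")
```

Under this model throughput scales linearly with read bandwidth, so a DRAM-backed drive would only help to the extent that the NVMe link itself is not already the bottleneck.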


I don’t know. In the past I’ve been vaguely interested in getting a physical RAM drive, to put more memory on a system than the motherboard supports and use it as a read cache for slower storage, but all I ran into were products from years back.
My guess is that there just aren’t enough workloads that benefit much from that much more memory, so there isn’t a huge amount of demand. It might also be that the idea is to segregate desktop computing from other environments, and that Intel and similar companies want to cap how much memory desktop systems can access so that they can charge higher prices for specialized systems. I don’t know how they’d prevent someone from throwing a ton of DRAM chips in a drive, but I can imagine them having an incentive to.