I’m looking at people running DeepSeek’s 671B R1 locally off NVMe drives at around 2 tokens a second. So why not skip the flash entirely and give me a 1TB drive with an NVMe controller and only DRAM behind it? The total transistor count on the silicon side is lower. It would be slower than regular system memory, but no slower than any NVMe drive with a DRAM cache. The controller architecture for safe page writes in flash, plus the internal charge-pump circuitry for pulsing each page, is roughly as complex as DRAM’s memory management, constant refresh, and power-stability requirements. Heck, DDR5 and up already puts power regulation on the memory module, IIRC. Anyone know why this isn’t a thing, or where to get one?
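For scale, here’s a back-of-envelope in Python. The numbers are assumptions on my part – R1 is a mixture-of-experts model with roughly 37B of its 671B parameters active per token, and I’m guessing ~0.5 bytes/param for a 4-bit quant – but it shows how the link bandwidth caps the token rate either way:

```python
def tokens_per_second(bandwidth_gbps: float,
                      active_params_b: float,
                      bytes_per_param: float) -> float:
    """Rough tokens/s: link bandwidth / weight bytes streamed per token."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return bandwidth_gbps * 1e9 / bytes_per_token

# Assumed figures: ~37B active params/token, 0.5 bytes/param (4-bit quant).
for name, bw in [("PCIe 4.0 x4 NVMe (~7 GB/s)", 7),
                 ("dual-channel DDR5-5600 (~90 GB/s)", 90)]:
    print(f"{name}: {tokens_per_second(bw, 37, 0.5):.2f} tok/s")
```

Note that putting DRAM behind the controller wouldn’t change this particular math much, since the PCIe link itself caps the streaming rate – it would mostly help latency and endurance.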

  • tal@lemmy.today · 1 year ago

    > because ram drives are easy to accomplish the same thing in software for applications that need it

    I don’t know if this is what OP is going for, but I’ve wanted to do something similar myself: exceed the maximum amount of memory that a motherboard supports. Basically, I wanted to stick more memory in a system – and I was fine with access to it being slower than the on-motherboard memory – to act as a very large read cache.

    A RAM drive will let you use memory that your motherboard supports as a drive. But it won’t let you stick even more physical DRAM into a system, above-and-beyond what the motherboard can handle.

      • tal@lemmy.today · 1 year ago

        No, I’m not. I’m talking about providing access to more physical DRAM than a motherboard is capable of handling.

        Swap would involve taking infrequently-used stuff in a still-bounded amount of DRAM and sticking it in slower, cheaper nonvolatile storage.

    • 𞋴𝛂𝛋𝛆@lemmy.worldOP · 1 year ago

      Yes, it is like a swap, if a person can only think in terms of the software side. DRAM can be changed at any individual location in memory. For flash, there is a minimum page size, typically several kilobytes. Bits in a flash page can only be programmed in one direction – cleared away from the erased state, a page at a time. Changing any bit back to the erased value requires erasing an entire erase block, which spans many pages. In practice the controller has to shuffle pages around to ensure data integrity if power is lost. And every erase cycle wears the block down, reducing its ability to reset to the default state. If I recall correctly, each flash bit is a single floating-gate transistor that traps charge on an insulated gate; it’s the page and block structure around it that adds the complexity.
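      A toy sketch of that asymmetry, with made-up sizes (real pages are KB-scale and erase blocks span dozens to hundreds of pages):

```python
# Toy model of NAND constraints: programming can only clear bits (1 -> 0)
# page by page; setting any bit back to 1 requires erasing the whole
# block, which costs one wear cycle. Sizes here are illustrative only.
PAGE_BITS = 1024
PAGES_PER_BLOCK = 64

class FlashBlock:
    def __init__(self):
        self.pages = [[1] * PAGE_BITS for _ in range(PAGES_PER_BLOCK)]
        self.erase_count = 0  # wear accumulates here

    def program(self, page: int, data: list) -> None:
        """Program a page: bits may go 1 -> 0 but never 0 -> 1."""
        for i, bit in enumerate(data):
            if bit == 1 and self.pages[page][i] == 0:
                raise ValueError("cannot set a bit without erasing the block")
            self.pages[page][i] &= bit

    def erase(self) -> None:
        """Erase resets every page in the block and wears it slightly."""
        self.pages = [[1] * PAGE_BITS for _ in range(PAGES_PER_BLOCK)]
        self.erase_count += 1
```

      Clearing bits works per page; getting any bit back to 1 costs a whole-block erase, and that erase counter is exactly the wear the drive’s firmware has to level.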

      DRAM is a single access transistor paired with a tiny capacitor per bit. The main problem with DRAM is that each cell’s charge only lasts tens of milliseconds, so every row is constantly getting rewritten – the JEDEC specs call for refreshing the whole device every 64 ms, so each row gets refreshed many times a second. The volatile memory type that makes sense in simple terms is static RAM. This is what CPU cache is made from. It is 6 transistors per bit, and each bit is like an on/off switch with no caveat nonsense so long as the thing is powered.
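      The refresh arithmetic is simple enough to sketch. Numbers here are the standard DDR4 figures (64 ms refresh window, 8192 rows per bank), which is where the familiar ~7.8 µs average refresh interval comes from:

```python
# Why DRAM needs constant refresh: each cell's capacitor leaks, so every
# row must be rewritten within the retention window. JEDEC specifies a
# 64 ms refresh period for standard DDR parts.
RETENTION_MS = 64.0
ROWS = 8192  # rows covered by the refresh counter in a typical DDR4 bank

# Spreading 8192 refreshes evenly over 64 ms gives the average refresh
# interval the datasheets call tREFI.
trefi_us = RETENTION_MS * 1000 / ROWS
print(f"average refresh interval (tREFI): {trefi_us:.4f} us")
```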

      The addressing scheme is simply serial versus parallel. So loading 512-bit AVX words over a serial PCIe link is going to be slower than over the parallel bus of the primary memory controller. Still, no one here is complaining about their NVMe speed. The microcontroller on an NVMe drive would be nearly capable of handling DRAM. The main difference is that flash programming and erasing take tens of volts (generated by on-chip charge pumps) plus a small negative rail, while DRAM needs very stable power because there is very little voltage margin between a one and a zero. The harder part is managing all of the memory address spaces and any bad blocks of bits – which is exactly what an NVMe controller already does.
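      Back-of-envelope on that serial-versus-parallel gap, using nominal peak rates (my figures, not measured, and ignoring latency, which hurts the NVMe path far more in practice):

```python
# Time to move one 512-bit (64-byte) AVX word over each interface,
# from peak bandwidth alone. Rates are nominal assumptions:
# PCIe 4.0 x4 ~= 7.9 GB/s usable; one DDR5-5600 channel = 5600 MT/s * 8 B.
WORD_BYTES = 64
links = {
    "PCIe 4.0 x4 (NVMe)": 7.9e9,
    "one DDR5-5600 channel": 44.8e9,
}
for name, bw in links.items():
    print(f"{name}: {WORD_BYTES / bw * 1e9:.3f} ns per 64-byte word")
```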