I'd be surprised if it was significantly less. A comparable 70 billion parameter model from Llama requires about 140GB to store at 16-bit precision (2 bytes per parameter). Supposedly the largest current ChatGPT model goes up to around 175 billion parameters, which would take a few hundred GB to store. There are ways to trade off some accuracy in order to save a bunch of space, but you're not going to get it under tens of GB.
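If you want to sanity-check sizes like these yourself, storage is basically parameter count times bytes per parameter. Here's a rough sketch; the function name is made up for illustration, and it assumes 1 GB = 1e9 bytes and ignores file-format overhead and anything besides the weights.

```python
def model_size_gb(num_params, bits_per_param=16):
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Assumption: only the weights are counted, with no file-format
    overhead and no runtime memory for activations or caches.
    """
    return num_params * bits_per_param / 8 / 1e9


# 70 billion parameters at a few common precisions
for bits in (16, 8, 4):
    print(f"{bits}-bit: {model_size_gb(70e9, bits):.0f} GB")
```

Dropping from 16-bit to 4-bit weights is the "trade accuracy for space" option, and even then a 70 billion parameter model still lands in the tens of GB.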
These models really are going through that many GB of parameters once for every word in the output. GPUs and tensor processors are crazy fast. For comparison, think about how much data a GPU generates for a 4K60 video display. It's roughly 1.5GB per second uncompressed. And the recommended memory bandwidth needed to render that image is like 400GB per second. Crazy fast.
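To put rough numbers on that, here's a back-of-envelope sketch. It assumes uncompressed 24-bit color for the display figure, and it assumes generation speed is limited only by reading every parameter from memory once per output word, which is a simplification. The function names and the 40GB model size are just for illustration.

```python
def display_rate_gb_per_s(width=3840, height=2160, bytes_per_pixel=3, fps=60):
    """Uncompressed data rate of a video signal in GB/s (1 GB = 1e9 bytes)."""
    return width * height * bytes_per_pixel * fps / 1e9


def max_words_per_second(memory_bandwidth_gb_per_s, model_size_gb):
    """Upper bound on output speed if all weights are read once per word.

    Assumption: memory bandwidth is the only bottleneck.
    """
    return memory_bandwidth_gb_per_s / model_size_gb


print(round(display_rate_gb_per_s(), 2))        # 1.49
print(round(max_words_per_second(400, 40), 1))  # 10.0
```

So with something like 400GB per second of memory bandwidth, a 40GB model tops out at around 10 words per second under these assumptions, and real throughput will usually be lower.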
I feel like this really depends on what hardware you have access to. What are you interested in doing? How long are you willing to wait for it to generate, and how good do you want it to be?
You can pull off like 0.5 words per second from one of the Mistral models on a CPU with 32GB of RAM. The Stable Diffusion image models work okay with like 8-16GB of VRAM.