For some context, the idea is to build a text-to-SQL interface. The interface lets you select certain tables from the data warehouse and injects their definitions into the prompt, so the 4096-token context limit is useful here.
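To make that concrete, here is a rough sketch of how the prompt could be assembled. Everything in it is an assumption for illustration: the `WAREHOUSE_DDL` registry, the table definitions, the prompt wording, the 512 tokens reserved for the answer, and the 4-characters-per-token estimate (a real tokenizer would be more accurate).

```python
# Hypothetical schema registry: table name -> CREATE TABLE statement.
WAREHOUSE_DDL = {
    "orders": "CREATE TABLE orders (id INT, customer_id INT, total NUMERIC, created_at DATE);",
    "customers": "CREATE TABLE customers (id INT, name TEXT, country TEXT);",
}

CONTEXT_LIMIT = 4096        # tokens, the context limit mentioned above
RESERVED_FOR_ANSWER = 512   # assumed headroom for the generated SQL


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token), not a real tokenizer.
    return len(text) // 4 + 1


def build_prompt(question: str, selected_tables: list[str]) -> str:
    # Inject only the definitions of the tables the user selected.
    schema = "\n".join(WAREHOUSE_DDL[name] for name in selected_tables)
    prompt = (
        "Given the following table definitions:\n"
        f"{schema}\n\n"
        f"Write a SQL query that answers: {question}\n"
        "SQL:"
    )
    # Refuse to build a prompt that would not leave room for the answer.
    budget = CONTEXT_LIMIT - RESERVED_FOR_ANSWER
    if estimate_tokens(prompt) > budget:
        raise ValueError("Selected tables do not fit in the context window")
    return prompt
```

Calling `build_prompt("total revenue per country", ["orders", "customers"])` returns a prompt with both table definitions in it, and selecting too many wide tables raises an error instead of silently truncating the schema.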
Check this PR out. The chart there shows that even the best 13B quantisation would be a far cry from the 30B with 2-bit quantisation: https://github.com/ggerganov/llama.cpp/pull/1684