For example, here is the distribution of q-gram counts in 120 MB of DBLP paper
titles (a fairly large dataset).
q   count
2   7218
3   115107
4   589428
5   1648453
6   3336685
The number of 5-grams is about 14x larger than the number of 3-grams. But most
of the index space will be occupied by links to the rows (about 120 million
links), while the size of the q-grams themselves will be almost negligible in
comparison.
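
To make the arithmetic concrete, here is a minimal Python sketch. It assumes the
counts in the table are numbers of distinct q-grams, and it uses the rough
figure of 120 million links from this message. The helper name distinct_qgrams
is hypothetical and only illustrates how such counts could be gathered; it is
not the code used to produce the table.

```python
def distinct_qgrams(strings, q):
    """Return the set of distinct q-grams (substrings of length q)."""
    grams = set()
    for s in strings:
        # A string of length n contains n - q + 1 substrings of length q.
        for i in range(len(s) - q + 1):
            grams.add(s[i:i + q])
    return grams


# Counts copied from the table (assumed to be distinct q-grams).
counts = {2: 7218, 3: 115107, 4: 589428, 5: 1648453, 6: 3336685}

# How many times more 5-grams than 3-grams there are.
ratio_5_to_3 = counts[5] / counts[3]

# Approximate number of row links from the message (an estimate, not measured here).
total_links = 120_000_000

# Average number of row links stored per distinct 5-gram.
links_per_5gram = total_links / counts[5]

print(f"5-grams / 3-grams: {ratio_5_to_3:.1f}")
print(f"links per distinct 5-gram: {links_per_5gram:.1f}")
```

Under these assumptions each distinct 5-gram has several dozen row links on
average, which is why the links, and not the q-grams themselves, should
dominate the index size.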
----
With best regards,
Alexander Korotkov.