Support for "Random()"?


#1

Does MapD’s SQL support the use of the Random() function?


#2

Hi,

No, MapD does not support the Random() function.

Can you tell us a little about your use case, and what you would expect the behaviour of this function to be?

It is very easy to add a function via the ExtensionFunctions framework, but we would want to check that the implementation was efficient on the GPU.

regards


#3

Well, behaviour-wise, basically the way that random works in other SQL variants: it returns a number between 0 and 1 every time it is called.

Our use case at this particular moment was to sub-sample some data.
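
To make the expected behaviour concrete, here is a minimal Python sketch (not MapD code) of sub-sampling with a per-row random draw. The function name `random_subsample` and the `seed` parameter are made up for illustration; the idea is simply to keep a row whenever a uniform draw in [0, 1) falls below the desired fraction.

```python
import random


def random_subsample(rows, fraction, seed=None):
    """Keep each row independently with probability `fraction`."""
    rng = random.Random(seed)  # seeded generator so a run can be repeated
    # rng.random() returns a float in [0.0, 1.0), one fresh draw per row
    return [row for row in rows if rng.random() < fraction]


rows = list(range(100_000))
sample = random_subsample(rows, 0.01, seed=42)  # roughly a 1% sample
```

The sample size varies from run to run around 1% of the input unless the seed is fixed.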


#4

Hi,

If your use case is just sub-sampling, you could go about that without needing random(). Depending on how managed the sample needs to be, you could use a MOD-style operation on the ROWID of the loaded rows.

Basically, you can use something like WHERE MOD(column.rowid, 100) = 1 to give you a one percent sample.
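
As a rough illustration of that filter, the sketch below runs the same idea against an in-memory SQLite table from Python. SQLite is only a stand-in here because it also has an implicit rowid; it is not MapD, its rowid starts at 1, and it uses the `%` operator where MapD uses MOD(). The table name `readings` and the helper `mod_sample_count` are hypothetical.

```python
import sqlite3


def mod_sample_count(total_rows, modulus=100, remainder=1):
    """Count rows whose implicit rowid satisfies rowid % modulus = remainder."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (value INTEGER)")
    conn.executemany(
        "INSERT INTO readings (value) VALUES (?)",
        ((i,) for i in range(total_rows)),
    )
    # Stand-in for: WHERE MOD(rowid, 100) = 1
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM readings WHERE rowid % ? = ?",
        (modulus, remainder),
    ).fetchone()
    conn.close()
    return count


sampled = mod_sample_count(10_000)  # one row in every hundred
```

Unlike a random draw, this picks the same rows every time, and which rows they are depends on the order in which the data was loaded.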

You can get more sophisticated with the operation to get different behaviour. We use this in Immerse occasionally for sampling.
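
One possible variation, offered as an assumption about what "more sophisticated" could mean rather than a description of what Immerse does: change the modulus to change the sampling rate, or pick different remainders to get samples that do not overlap. A plain Python sketch over hypothetical row ids:

```python
def mod_sample(row_ids, modulus, remainders):
    """Select the row ids whose remainder modulo `modulus` is in `remainders`."""
    wanted = set(remainders)
    return [rid for rid in row_ids if rid % modulus in wanted]


row_ids = range(10_000)  # hypothetical ids, numbered from 0

one_percent = mod_sample(row_ids, 100, {1})     # 1 in 100
two_percent = mod_sample(row_ids, 100, {2, 3})  # 2 in 100, disjoint from the first
ten_percent = mod_sample(row_ids, 10, {0})      # smaller modulus, larger sample
```

Because the remainders differ, `one_percent` and `two_percent` never share a row, which can be handy when two independent subsets are needed.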

regards


#5

Interesting, do all tables have a hidden/default rowid column?
(an id column is not something I’ve tended to bother having in most big datasets)


#6

Hi,

Yes, they do currently. We do not expose this feature in the docs, as we reserve the right to change the behaviour and potentially remove it in the future. So it is currently a "use it, but be aware it might change in the future" kind of feature.

regards


#7

Note that rowid is a virtual column, so there is no storage/memory overhead for it (it is computed on-the-fly when needed).


#8

awesome!
Thanks guys! :slight_smile: