bug-datamash

New program: rand(1)


From: Tim Rice
Subject: New program: rand(1)
Date: Tue, 16 Aug 2022 22:45:11 +0000

Hey all,

I have made a preliminary commit for a new program to simulate random 
variables, similar to functions like runif, rnorm and rexp in R.

This was partly motivated by wanting a convenient way to benchmark GNU Datamash 
on large numbers of unpredictable inputs.

Currently implemented are unif (continuous Uniform distribution), exp 
(Exponential distribution), and norm (Normal distribution). I expect to 
implement additional distributions in the coming weeks.
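
For readers who want a feel for what these three generators involve, here is a minimal Python sketch of textbook methods: scaling for the Uniform, inverse-transform sampling for the Exponential, and Box-Muller for the Normal. This is an illustration only and not necessarily how rand(1) does it. The function names and parameters (sample_unif, sample_exp, sample_norm, lo, hi, rate, mean, sd) are made up for the example.

```python
import math
import random


def sample_unif(rng, lo=0.0, hi=1.0):
    """Continuous Uniform on [lo, hi): rescale a draw from [0, 1)."""
    return lo + (hi - lo) * rng.random()


def sample_exp(rng, rate=1.0):
    """Exponential with the given rate, via inverse-transform sampling.

    rng.random() lies in [0, 1), so 1 - U lies in (0, 1] and the log is defined.
    """
    return -math.log(1.0 - rng.random()) / rate


def sample_norm(rng, mean=0.0, sd=1.0):
    """Normal via the Box-Muller transform (one of several possible methods)."""
    u1 = 1.0 - rng.random()  # in (0, 1], keeps log(u1) finite
    u2 = rng.random()
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return mean + sd * z


if __name__ == "__main__":
    rng = random.Random(2022)  # fixed seed so runs are repeatable
    # Rate 0.2 gives a mean of 1/0.2 = 5.
    for _ in range(4):
        print(f"{sample_exp(rng, rate=0.2):.6f}")
```

With a rate of 0.2 the Exponential has mean 5, which is in line with the spread of values in the sample output below.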


Sample usage:

$ rand --rate 0.2 exp 16 | column -c80
6.052907        5.259284        5.180190        1.108741
0.123777        3.394979        0.948344        0.313611
1.244067        5.443215        4.219600        14.698509
3.775806        13.298394       2.857540        8.236070

$ time rand norm 1000000 | datamash jarque 1
0.43937664067692

real    0m0.319s
user    0m0.541s
sys     0m0.017s
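
As context for the second example: as far as I know, datamash's jarque operation reports the p-value of the Jarque-Bera normality test, so a value that is not small (0.44 here) is what one would hope to see from a normal generator. The sketch below shows the usual textbook form of that test in Python. It is an assumption that this matches datamash's exact conventions, and the function name jarque_bera_pvalue is made up for the example.

```python
import math
import random


def jarque_bera_pvalue(xs):
    """p-value of the Jarque-Bera test, using biased (divide-by-n) moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # JB is asymptotically chi-square with 2 degrees of freedom,
    # whose survival function is exactly exp(-x / 2).
    return math.exp(-jb / 2.0)


if __name__ == "__main__":
    rng = random.Random(2022)
    normal = [rng.gauss(0.0, 1.0) for _ in range(100000)]
    skewed = [rng.expovariate(0.2) for _ in range(100000)]
    print(jarque_bera_pvalue(normal))  # varies with the seed
    print(jarque_bera_pvalue(skewed))  # effectively zero: clearly non-normal
```

A single p-value only says so much; repeating the check over many seeds and looking at how the p-values are distributed is a more convincing test of the generator.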


Let me know if you notice any issues, whether that be possible bugs, 
statistical biases I might have overlooked, or easy ways to improve performance.

When it comes to performance, there could be baroque algorithms that result in 
mild speedups. I'd prefer to keep code simple where possible, so if you have 
some favorite algorithm, the onus will be on you to convince me to include it. 
You should demonstrate that it provides a significant benefit and/or isn't all 
that hard to implement/maintain. Patches will be considered :)

~ Tim


