Grieve (grieve) wrote,

Non-uniform Random Distribution

At my old job I often interviewed candidates for computer programming positions. One of the most clever questions, or so I thought, was the following:

There is a function rand(). The funtion rand() will return the numbers between 0 and 1. The function rand() can return 0, but it will not return 1. All numbers returned by the rand() function have the same probability of being returned. We will call this uniform random distribution.

Given rand() write an algorithm which can return non-uniformly distributed numbers. Specifically as input you will be given an array of floats. Each float in the array determines the percentage chance of the index of that array element being returned. For example if given the array [0.1, 0.2, 0.3, 0.5] 0 will be returned 10% of the time, 1 will be returned 20% of the time, and so on. The values of the array are guaranteed to sum to 1. We will call this non-uniform random distribution.

We discovered the question when one of my co-worker's wife needed to solve this for a program she was working on.

This was an interesting problem and had two solutions of which I knew. One could solve the problem in O(1) time, but was potentially memory intensive. The other could solve the problem with a much smaller amount of memory but was O(logn). Both used an array as an internal memory structure to map the values as appropriate. I still believe my old co-workers may ask this question, so I will leave the details of the answers as an excercise to the reader.

At my new job I was discussing non-uniform random distribution with a co-worker, he needed a bell curve distribution, and I realized he was discussing the problem this interview question was all about. I explained the two answers to him, thinking that I was being clever. He then proceeded to tell me the solution he used, and I was highly impressed. Since he had a specific distribution in mind, he could actually write a function which took the output of rand() as the input and directly calculated an output using no interal memory structures (other than variables on the stack).

It made me realize that in essence any solution of this problem was just a mapping from X (the output of rand()) to Y the number you are looking for. The simplest case is uniform random distribution in which you have Y = X. While it seems obvious in retrospect I had never thought of the problem in those terms.

With this new perspective on the problem I realized the solutions we were looking for in our interviews were completely generic, and could solve any distribution you cared to make an array for. However, it was probably never going to be the most efficient for a specific distribution, and it makes me wonder if you can programatically come up with a function for a non-uniform random distribution given a set of points, where the function uses no internal memory structure other than variables on the stack. Maybe some of my mathier friends can shed some light on it.
Tags: programming

  • Post a new comment


    default userpic

    Your reply will be screened

    Your IP address will be recorded 

    When you submit the form an invisible reCAPTCHA check will be performed.
    You must follow the Privacy Policy and Google Terms of use.