Discussion:
Types of Data Distributions
(too old to reply)
c***@gmail.com
2014-12-30 13:52:07 UTC
Permalink
Hello All,

This is a combination of an Excel and a Statistics question.

I'm given various types of distributions and asked to group them into like "buckets" or "tiers". So for example, (and this will vary every time) if I'm given a list of sales data for 300+ locations, I'm trying to get them into groups.

Here's my formulas so far:

J4 = # of Locations: 377
J5 = # of Bins: =ROUNDUP(LOG(J4,2)+1,0)
J6 = Bin Size: = (J7-J8)/J5
J7 = Max of Data Distribution = max(sales data) = 663
J8 = Min of Data Distribution = min(sales data) = 176

My question is:

Does anyone know any other methods for estimating the number of bins? The methodology I'm using above is called Sturges' formula.

How would other methods fit into my variables above?


Thanks to All and let me know if I can clarify anything further.
Gordon Sande
2014-12-30 14:48:04 UTC
Permalink
Post by c***@gmail.com
Hello All,
This is a combination of an Excel and a Statistics question.
I'm given various types of distributions and asked to group them into
like "buckets" or "tiers". So for example, (and this will vary every
time) if I'm given a list of sales data for 300+ locations, I'm trying
to get them into groups.
J4 = # of Locations: 377
J5 = # of Bins: =ROUNDUP(LOG(J4,2)+1,0)
J6 = Bin Size: = (J7-J8)/J5
J7 = Max of Data Distribution = max(sales data) = 663
J8 = Min of Data Distribution = min(sales data) = 176
Does anyone know any other methods for estimating the number of bins?
The methodology I'm using above is called Sturges' formula.
How would other methods fit into my variables above?
Thanks to All and let me know if I can clarify anything further.
You may want to make the bin size a "pretty" number for ease of reading.
A power of 10 times 1, 2 or 5 is a common choice for a "pretty" number.
Then the min and max can be integer multiples of the bin size.

If you are going to do comparisons over time you will want the bin size
to remain constant to make life easier.

For many types of economic data, like size of firms etc, it may be
sensible to take logarithms first. Then you bin boundaries could become
a sequence of "pretty" numbers.

Of course, a good graphics package would do these things for you.
c***@gmail.com
2014-12-30 15:03:11 UTC
Permalink
Post by Gordon Sande
Post by c***@gmail.com
Hello All,
This is a combination of an Excel and a Statistics question.
I'm given various types of distributions and asked to group them into
like "buckets" or "tiers". So for example, (and this will vary every
time) if I'm given a list of sales data for 300+ locations, I'm trying
to get them into groups.
J4 = # of Locations: 377
J5 = # of Bins: =ROUNDUP(LOG(J4,2)+1,0)
J6 = Bin Size: = (J7-J8)/J5
J7 = Max of Data Distribution = max(sales data) = 663
J8 = Min of Data Distribution = min(sales data) = 176
Does anyone know any other methods for estimating the number of bins?
The methodology I'm using above is called Sturges' formula.
How would other methods fit into my variables above?
Thanks to All and let me know if I can clarify anything further.
You may want to make the bin size a "pretty" number for ease of reading.
A power of 10 times 1, 2 or 5 is a common choice for a "pretty" number.
Then the min and max can be integer multiples of the bin size.
If you are going to do comparisons over time you will want the bin size
to remain constant to make life easier.
For many types of economic data, like size of firms etc, it may be
sensible to take logarithms first. Then you bin boundaries could become
a sequence of "pretty" numbers.
Of course, a good graphics package would do these things for you.
Thank you!

Loading...