Kodama's home / tips.
This class produces a character-based (text) histogram.
To get the top n data, use the Ruby Sized Priority Queue class.
require "histogram"
h = Histogram.new(0, 99, 5)  # lower bound / upper bound / class width
readlines.each { |l| h.push(l.size) }  # scan lines
h.report  # print figure
$ ruby sample.rb < cgi-lib.rb
0-.:----+----+----+----+----+----+----+----+----+----+----+----+----+- 66
5-.:----+----+----+----+----+----+ 30
10-.:----+----+ 10
15-.:----+----+----+--- 18
20-.:----+----+----+----+----+-- 27
25-.:----+----+----+----+-- 22
30-.:----+----+----+----+- 21
35-.:----+----+ 10
40-.:----+---- 9
45-.:----+-- 7
50-.:----+ 5
55-.:----+----+-- 12
60-.:----+ 5
65-.:----+----+---- 14
70-.:----+- 6
75-.:---- 4
80-.:-- 2
85-.: 0
90-.: 0
95-.:-- 2
number: 270, average: 25.733333
standard deviation: 23.516488, coefficient of variation: 0.913853
range: 103.000000, range/average: 4.002591
variance: 553.025185, skewness: 0.838520, kurtosis: 2.856878
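Each row of the figure appears to be the lower bound of the class, a bar with one character per counted item ("+" at every fifth position), and the count. The following is a minimal sketch that mimics this format. The helper name bar_line is hypothetical and the code is inferred from the sample output, not taken from the Histogram class.

```ruby
# Hypothetical helper that mimics the bar format seen in the sample output;
# it is not the Histogram class's own code.
def bar_line(lower, count)
  # One character per item: "+" at every 5th position, "-" elsewhere.
  bar = (1..count).map { |i| (i % 5).zero? ? "+" : "-" }.join
  "#{lower}-.:#{bar} #{count}"
end

puts bar_line(15, 18)
puts bar_line(85, 0)
```

Because the bar length equals the count, a row can be read either from the bar or from the trailing number.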
h.push(200)   # pushed into the class "95-".
h.push(-200)  # pushed into the class "0-".
Values outside the bounds are pushed into the nearest end class, as the comments above show. This behavior can be changed; see the "push" method in the source code for details.
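One way to picture how out-of-range values end up in the first or last class is to compute a class index and clamp it. This is only a sketch under assumptions: the helper clamped_class_index is hypothetical, and the real "push" method may work differently.

```ruby
# Hypothetical helper, not part of the Histogram class from this page.
# Maps a value to a class index, clamping out-of-range values
# into the first or last class.
def clamped_class_index(value, lower, upper, width)
  classes = ((upper - lower + 1).to_f / width).ceil  # number of classes
  index = ((value - lower) / width.to_f).floor       # unclamped index
  index.clamp(0, classes - 1)
end

lower, upper, width = 0, 99, 5
[0, 4, 5, 99, 200, -200].each do |v|
  i = clamped_class_index(v, lower, upper, width)
  puts "#{v} -> class \"#{lower + i * width}-\""
end
```

With bounds 0 to 99 and width 5 there are 20 classes, so 200 maps to the class "95-" and -200 to the class "0-", matching the comments above.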
While at it, the class also computes the mean, variance, higher-order moments, and so on.
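The statistics printed under the figure can be sketched from central moments. The helper summary_stats below is hypothetical, and it assumes population (divide-by-n) moments and non-excess kurtosis; the page does not say which conventions the Histogram class uses.

```ruby
# Hypothetical helper, not the Histogram class's own code.
# Assumes population (divide-by-n) central moments.
def summary_stats(data)
  n = data.size.to_f
  mean = data.sum / n
  # k-th central moment: average of (x - mean)**k
  central = ->(k) { data.sum { |x| (x - mean)**k } / n }
  variance = central.(2)
  sd = Math.sqrt(variance)
  {
    number: data.size,
    average: mean,
    standard_deviation: sd,
    coefficient_of_variation: sd / mean,
    range: (data.max - data.min).to_f,
    variance: variance,
    skewness: central.(3) / sd**3,
    kurtosis: central.(4) / sd**4  # non-excess: 3.0 for a normal distribution
  }
end

stats = summary_stats([2, 4, 4, 4, 5, 5, 7, 9])
stats.each { |name, value| puts "#{name}: #{value}" }
```

In the sample output above, the standard deviation is the square root of the variance and the coefficient of variation is the standard deviation divided by the average, which this sketch follows.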