Awk 95th Percentile

The 95th percentile tends to be a more representative figure than the mean for capacity planning because the largest 5% of values are discarded. A straight average would include those values, so a few extreme readings can skew the result.

sort -n dataset.txt | awk '{total[NR-1]=$1} END{print total[int(NR*0.95-0.5)]}'

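The -0.5 combines two adjustments. awk has no round function, so 0.5 is added before int() truncates, which rounds to the nearest integer. Then 1 is subtracted because the array starts at index 0. Together: +0.5 - 1 = -0.5.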
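As a rough illustration of why the two measures differ, the sketch below wraps the same index arithmetic in a shell function and compares it with the mean on a made-up sample. The names p95, mean and sample are hypothetical helpers introduced here, and the sample data is invented for the example; the NR > 0 guard for empty input is an addition, not part of the one-liner above.

```shell
# p95 and mean are hypothetical helper names; both read one number per line on stdin.
p95() {
    # Same index arithmetic as the one-liner: +0.5 to round, -1 for the 0-based array.
    sort -n | awk '{ v[NR-1] = $1 } END { if (NR > 0) print v[int(NR*0.95 - 0.5)] }'
}

mean() {
    awk '{ s += $1 } END { if (NR > 0) print s / NR }'
}

# Made-up sample: 99 readings of 10 plus a single spike of 10000.
sample() {
    awk 'BEGIN { for (i = 1; i <= 99; i++) print 10; print 10000 }'
}

sample | p95    # the spike falls in the discarded top 5%
sample | mean   # the spike is included and pulls the mean up
```

With 100 values the selected index is int(100*0.95-0.5) = 94, the 95th smallest value, so the single spike never reaches the result, while the mean is dragged well above the typical reading.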

awk95thpercentile.txt · Last modified: 2020/02/13 22:55 (external edit)
