The 95th percentile tends to be a more accurate representation for capacity planning because the largest 5% of values are discarded, unlike a straight average (mean), which would include those values and skew the result.
sort -n dataset.txt | awk 'BEGIN{c=0} {total[c]=$1; c++;} END{print total[int(NR*0.95-0.5)]}'
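The -0.5 in the index combines two adjustments: awk has no round function, so I add 0.5 before int() truncates to get the nearest integer, and I subtract 1 because the array starts at index 0. Together, +0.5 - 1 = -0.5.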
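As a rough illustration of why the percentile resists outliers, here is a minimal sketch that wraps the same sort and awk pipeline in a function and compares it with the mean. The names p95 and sample, and the sample values themselves, are made up for this example, and it assumes seq is available and the input has one number per line.

```shell
# p95 is a hypothetical helper name: reads one number per line on stdin
# and prints the value at the 95th percentile using the same index rule.
p95() {
  sort -n | awk '{ v[NR-1] = $1 } END { if (NR > 0) print v[int(NR*0.95 - 0.5)] }'
}

# Hypothetical sample data: 19 typical readings of 10 and one spike of 1000.
sample() {
  for i in $(seq 1 19); do echo 10; done
  echo 1000
}

sample | p95                                      # prints 10
sample | awk '{ s += $1 } END { print s / NR }'   # prints 59.5
```

With 20 values the index is int(20*0.95 - 0.5) = 18, the 19th sorted value, so the single spike is ignored while the mean is pulled up to 59.5.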