Calculate relative frequency, cumulative frequency, percentages, and Shannon entropy from ungrouped data. Includes bar chart, ogive curve, and frequency distribution table.
The relative frequency calculator converts raw frequency counts into proportions, percentages, and cumulative frequencies for discrete data. Relative frequency (count / total) is the empirical probability of each value — the foundation for statistical inference and the bridge between descriptive statistics and probability.
This tool builds a complete frequency distribution table with absolute frequency, relative frequency, percentage, cumulative frequency, and cumulative percentage. It includes a bar chart, cumulative curve (ogive), Shannon entropy, and mode identification.
Enter discrete data values and get a professional frequency analysis with visualizations and information-theoretic metrics. Before reporting results, check the example with realistic values, use the steps shown to verify rounding and units, cross-check the output against a known reference case, and validate that outputs match your chosen standards. The example pattern is also useful when troubleshooting unexpected results.
This calculator provides a complete frequency analysis beyond simple counting — with relative and cumulative frequencies, a bar chart, cumulative curve, Shannon entropy, and mode identification all in one place. The entropy metric is particularly valuable for understanding how evenly or unevenly your data is distributed.
Whether you're building a frequency table for a statistics class, analyzing survey responses, or computing empirical probabilities from data, this tool produces the full analysis instantly.
Relative frequency = fᵢ / n, where fᵢ is the frequency of value i and n is the total number of observations. Cumulative frequency = the sum of all frequencies up to and including the current value. Shannon entropy H = −Σ pᵢ log₂(pᵢ). Normalized entropy = H / log₂(k), where k is the number of unique values.
Example: 15 observations of the values 1–5 with frequencies 1, 2, 5, 4, 3. Result: Mode = 3 (freq 5), Entropy = 2.15 bits
Value 1: 1/15 = 6.67%. Value 2: 2/15 = 13.33%. Value 3: 5/15 = 33.33% (mode). Value 4: 4/15 = 26.67%. Value 5: 3/15 = 20.00%. Shannon entropy = 2.15 bits out of max 2.32 bits, giving 92.6% normalized entropy — a fairly even distribution.
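The formulas and the worked example above can be reproduced in a few lines. This is a minimal Python sketch (the calculator itself is a web tool; the function names here are illustrative), using the example dataset with frequencies 1, 2, 5, 4, 3:

```python
from collections import Counter
import math

def frequency_table(data):
    """Map each value to (frequency, relative frequency, cumulative relative frequency)."""
    counts = Counter(data)
    n = len(data)
    table, cum = {}, 0.0
    for value in sorted(counts):
        rel = counts[value] / n
        cum += rel
        table[value] = (counts[value], rel, cum)
    return table

def shannon_entropy(data):
    """Shannon entropy in bits: H = -sum(p_i * log2(p_i))."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Dataset matching the worked example: values 1..5 with frequencies 1, 2, 5, 4, 3
data = [1] + [2] * 2 + [3] * 5 + [4] * 4 + [5] * 3
table = frequency_table(data)
H = shannon_entropy(data)
H_max = math.log2(len(set(data)))  # maximum entropy for 5 unique values

print(table[3])                # mode: value 3 with frequency 5, rel. freq 1/3
print(round(H, 2))             # 2.15 bits
print(round(H / H_max, 3))     # 0.926 normalized entropy
```

The relative frequencies in the table sum to 1.0, and the final cumulative entry is always 1.0, which is a quick sanity check on any frequency table.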
The foundation of statistical inference is the connection between relative frequency and probability. If you flip a fair coin 1000 times, the relative frequency of heads will be close to 0.5 — the true probability. This convergence (the law of large numbers) is why we can estimate probabilities from data.
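The convergence described above is easy to see in simulation. A short Python sketch (the seed choice is arbitrary, for reproducibility only):

```python
import random

random.seed(0)  # illustrative seed for reproducibility

def rel_freq_heads(n):
    """Relative frequency of heads in n simulated fair-coin flips."""
    return sum(random.random() < 0.5 for _ in range(n)) / n

# As n grows, the relative frequency drifts toward the true probability 0.5
for n in (10, 100, 1000, 100_000):
    print(n, rel_freq_heads(n))
```

With 10 flips the estimate can easily be 0.3 or 0.7; with 100,000 it is rarely more than half a percentage point from 0.5.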
In Bayesian statistics, relative frequencies from data update prior beliefs to produce posterior probabilities. The likelihood function in Bayes' theorem is essentially built from relative frequencies. When you have a lot of data, the posterior is dominated by the relative frequencies (the data overwhelms the prior).
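The "data overwhelms the prior" behavior can be sketched with the standard Beta-Binomial conjugate update (this is a textbook illustration, not a feature of the calculator):

```python
# Beta(a, b) prior for a probability of heads; after observing h heads
# in n flips, the posterior is Beta(a + h, b + n - h), whose mean is
# (a + h) / (a + b + n) -- converging to the relative frequency h/n.
def posterior_mean(a, b, heads, n):
    return (a + heads) / (a + b + n)

print(posterior_mean(2, 2, 0, 0))        # prior mean: 0.5
print(posterior_mean(2, 2, 7, 10))       # small sample: pulled toward 7/10, tempered by the prior
print(posterior_mean(2, 2, 700, 1000))   # large sample: ~0.699, dominated by the data
```

With 10 observations the prior still matters (the posterior mean is 9/14 ≈ 0.64, not 0.70); with 1000 observations the posterior mean is essentially the relative frequency.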
Shannon entropy was originally developed for communication theory — how many bits do you need to transmit a message? In statistics, it measures how "spread out" a discrete distribution is. Maximum entropy distributions (uniform for discrete, normal for continuous with fixed mean/variance) are the most "uncertain" and are used as default models when you have minimal information.
Relative frequency is the proportion of observations that have a specific value: frequency / total count. It ranges from 0 (value never appears) to 1 (all observations are this value). Multiplied by 100, it gives the percentage. Relative frequencies always sum to 1.0 (or 100%) across all values.
Frequency (absolute frequency) is the raw count of how many times a value appears. Relative frequency is that count divided by the total number of observations. Frequency depends on sample size; relative frequency is normalized to [0,1], making it comparable across different-sized datasets.
Cumulative relative frequency at value x is the sum of all relative frequencies for values ≤ x. It tells you what proportion of the data falls at or below x. Starting at 0 and ending at 1.0, it creates the empirical cumulative distribution function (ECDF) — also called the ogive when plotted.
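The ECDF construction described above is a running sum over sorted values. A minimal Python sketch (function name is illustrative):

```python
def ecdf(data):
    """Empirical CDF as sorted (value, cumulative relative frequency) pairs."""
    n = len(data)
    counts = {}
    for x in data:
        counts[x] = counts.get(x, 0) + 1
    pairs, cum = [], 0
    for value in sorted(counts):
        cum += counts[value]
        pairs.append((value, cum / n))
    return pairs

# Cumulative relative frequency rises monotonically and ends at 1.0
print(ecdf([1, 2, 2, 3, 3, 3]))
```

For this input the ECDF reaches 0.5 at value 2 (half the data is ≤ 2) and 1.0 at value 3; plotting these pairs gives the ogive.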
Shannon entropy H = −Σ pᵢ log₂(pᵢ) measures the "uncertainty" or "information content" of a distribution. If one value dominates (low entropy), there's little surprise. If all values are equally likely (max entropy = log₂k), each observation is maximally informative. It's measured in bits (log₂) or nats (ln).
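The bits-versus-nats distinction is just a choice of logarithm base. A small sketch showing both, plus the low-entropy case where one value dominates:

```python
import math

def entropy(probs, base=2):
    """Shannon entropy of a probability vector; base=2 gives bits, base=math.e gives nats."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

uniform4 = [0.25] * 4
print(entropy(uniform4))                  # 2.0 bits = log2(4), the maximum for k=4
print(entropy(uniform4, math.e))          # the same entropy in nats: 2 * ln(2) ~ 1.386
print(entropy([0.97, 0.01, 0.01, 0.01])) # low entropy: little surprise per observation
```

The `if p > 0` guard handles the convention 0·log(0) = 0, so values with zero probability contribute nothing.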
Relative frequency is an empirical (observed) measure from data. Probability is a theoretical concept. The law of large numbers guarantees that relative frequency converges to true probability as sample size grows. For small samples, relative frequency is a rough estimate; for large samples, it's a reliable estimate of probability.
Normalized entropy (H / Hmax) ranges from 0 to 1. Values near 1 mean the distribution is close to uniform (all values equally frequent). Values near 0 mean one value dominates. For a Likert scale survey: 95%+ normalized entropy suggests no clear preference; below 70% suggests strong preference for certain responses.