The Newsvendor Problem

The “Newsvendor Problem” is a classic problem in inventory and supply-chain management: how much product to carry in stock in the face of uncertain demand?

The problem is obviously of interest in its own right, but it is also an archetypical problem, meaning that variations of it arise frequently and in different contexts. It is therefore valuable to know “how to think about” this kind of problem; in particular, since in its simplest form, it has a closed-form, analytic solution.

The basic problem is often presented in the following terms. Imagine you run a newsstand. In the morning, you buy a certain number $n$ of newspapers at price $c_0$. Over the course of the day, you try to sell this inventory at price $c_1$; anything that isn’t sold in the evening is discarded (no salvage value).

If you knew how many papers you would actually sell during the course of the day (the demand $m$), then it would be easy: you would buy exactly $m$ papers in the morning. However, the demand is not known exactly, but we know the probability $p(k)$ of selling exactly $k$ copies. The question is: how many papers should you buy in the morning in order to maximize your net earnings (the revenue)?

A first guess might be to use the average number of papers that we expect to sell—that is, the mean of $p(k)$. However, this approach may not be good enough: suppose that $c_1$ is much larger than $c_0$ (so that your markup is high). In that case it makes sense to purchase more papers in the hope of selling them, because the gain from selling an additional paper outweighs the loss from having purchased too many. Of course, the converse also holds: if the markup is small, then each unsold paper significantly reduces our overall revenue.

To solve this problem analytically, we want to find the optimum of the expected revenue. The revenue is given by the sale price $c_1$, multiplied by the number of units sold, less the purchase price $c_0$, multiplied by the number of units purchased, $n$. Because we can only sell as many units as we originally had in inventory, the number of units sold is the lesser of the demand $m$ and the units in inventory $n$. (Notice that the demand can in principle exceed our inventory: in this case, we miss out on sales.) For a given demand $m$, the revenue is therefore:

$$ r(m) = c_1 \min( n, m ) - c_0 n $$

The revenue depends on the demand $m$. However, the demand is a random quantity: all that we know is that it is distributed according to some distribution $p(m)$. The expected revenue $E[r(m)]$ is the average of the revenue over all possible values of $m$, where each value is weighted by the appropriate probability factor:

$$ E[ r(m) ] = \int_0^\infty \! r(m) \, p(m) \,\text{d}m $$

We can now plug in the previous expression for $r(m)$, using the lesser of $n$ and $m$ in the integral:

$$ \begin{align*} E[ r(m) ] & = c_1 \int_0^n \! m \, p(m) \,\text{d}m + c_1 \int_n^\infty \! n \, p(m) \,\text{d}m - c_0 n \int_0^\infty \! p(m) \,\text{d}m \\ & = c_1 \int_0^n \! m \, p(m) \,\text{d}m + c_1 n \left( 1 - \int_0^n \! p(m) \,\text{d}m \right) - c_0 n \end{align*} $$

where we have made use of the fact that $\int_0^\infty p(m) \,\text{d}m = 1$ and that $\int_0^n p(m) \,\text{d}m + \int_n^\infty p(m) \,\text{d}m = \int_0^\infty p(m) \,\text{d}m$.

We now want to find the maximum of the expected revenue with respect to the initial inventory level $n$. To locate the maximum, we first take the derivative with respect to $n$:

$$ \begin{align*} \frac{\text{d}}{\text{d}n} E[ r(m) ] & = c_1 \, n \, p(n) + c_1 \left( 1 - \int_0^n \! p(m) \,\text{d}m \right) - c_1 \, n \, p(n) - c_0 \\ & = c_1 - c_0 - c_1 \int_0^n \! p(m) \,\text{d}m \end{align*} $$

where we have used the product rule and the fundamental theorem of calculus: $\frac{\text{d}}{\text{d}x} \int^x f(s) \,\text{d}m = f(x)$.

Next we equate the derivative to zero (that is the condition for the maximum) and rearrange terms to find

$$ \int_0^n \! p(m) \,\text{d}m = 1 - \frac{c_0}{c_1} $$

This is the final result. The lefthand side is the cumulative distribution function of the demand, and the righthand side is a simple expression involving the ratio of the purchase price and the sales price. Given the cumulative distribution function for the demand, we can now find the value of $n$ for which the cumulative distribution function equals $1-c_0/c_1$. This value of $n$ is the optimal initial inventory level.

The result is interesting and useful in its own right, but the primary takeaway from this post is the explicit expression for the expected revenue: because this expression can be formulated exactly, we could find a analytical, closed-form solution to the problem.

This post has been adapted from a section in my book on Data Analysis. Look there for more details and further reading.