# The Newsvendor Problem

The “Newsvendor Problem” is a classic problem in inventory and supply-chain management: how much product to carry in stock in the face of uncertain demand?

The problem is obviously of interest in its own right, but it is also an archetypical problem, meaning that variations of it arise frequently and in different contexts. It is therefore valuable to know “how to think about” this kind of problem; in particular, since in its simplest form, it has a closed-form, analytic solution.

The basic problem is often presented in the following terms. Imagine you run a newsstand. In the morning, you buy a certain number $n$ of newspapers at price $c_0$. Over the course of the day, you try to sell this inventory at price $c_1$; anything that isn’t sold in the evening is discarded (no salvage value).

If you knew how many papers you would actually sell during the course
of the day (the *demand* $m$), then it would be easy: you would
buy exactly $m$ papers in the morning. However, the demand is not known
exactly, but we know the probability $p(k)$ of selling exactly $k$ copies.
The question is: how many papers should you buy in the morning in
order to maximize your net earnings (the *revenue*)?

A first guess might be to use the average number of papers that we expect to sell—that is, the mean of $p(k)$. However, this approach may not be good enough: suppose that $c_1$ is much larger than $c_0$ (so that your markup is high). In that case it makes sense to purchase more papers in the hope of selling them, because the gain from selling an additional paper outweighs the loss from having purchased too many. Of course, the converse also holds: if the markup is small, then each unsold paper significantly reduces our overall revenue.

To solve this problem analytically, we want to find the optimum of the
expected revenue. The revenue is given by the sale price $c_1$, multiplied
by the number of units sold, less the purchase price $c_0$, multiplied by
the number of units purchased, $n$. Because we can only sell as many units
as we originally had in inventory, the number of units sold is the lesser
of the demand $m$ and the units in inventory $n$. (Notice that the demand
can in principle *exceed* our inventory: in this case, we miss out on sales.)
For a given demand $m$, the revenue is therefore:

$$ r(m) = c_1 \min( n, m ) - c_0 n $$

The revenue depends on the demand $m$. However, the demand is a random
quantity: all that we know is that it is distributed according to some
distribution $p(m)$. The *expected revenue* $E[r(m)]$ is the
average of the revenue over all possible values of $m$, where each
value is weighted by the appropriate probability factor:

$$ E[ r(m) ] = \int_0^\infty \! r(m) \, p(m) \,\text{d}m $$

We can now plug in the previous expression for $r(m)$, using the lesser of $n$ and $m$ in the integral:

$$ \begin{align*} E[ r(m) ] & = c_1 \int_0^n \! m \, p(m) \,\text{d}m + c_1 \int_n^\infty \! n \, p(m) \,\text{d}m - c_0 n \int_0^\infty \! p(m) \,\text{d}m \\ & = c_1 \int_0^n \! m \, p(m) \,\text{d}m + c_1 n \left( 1 - \int_0^n \! p(m) \,\text{d}m \right) - c_0 n \end{align*} $$

where we have made use of the fact that $\int_0^\infty p(m) \,\text{d}m = 1$ and that $\int_0^n p(m) \,\text{d}m + \int_n^\infty p(m) \,\text{d}m = \int_0^\infty p(m) \,\text{d}m$.

We now want to find the maximum of the expected revenue with respect to the initial inventory level $n$. To locate the maximum, we first take the derivative with respect to $n$:

$$ \begin{align*} \frac{\text{d}}{\text{d}n} E[ r(m) ] & = c_1 \, n \, p(n) + c_1 \left( 1 - \int_0^n \! p(m) \,\text{d}m \right) - c_1 \, n \, p(n) - c_0 \\ & = c_1 - c_0 - c_1 \int_0^n \! p(m) \,\text{d}m \end{align*} $$

where we have used the product rule and the fundamental theorem of calculus: $\frac{\text{d}}{\text{d}x} \int^x f(s) \,\text{d}m = f(x)$.

Next we equate the derivative to zero (that is the condition for the maximum) and rearrange terms to find

$$ \int_0^n \! p(m) \,\text{d}m = 1 - \frac{c_0}{c_1} $$

This is the final result. The lefthand side is the *cumulative
distribution function* of the demand, and the righthand side is a
simple expression involving the ratio of the purchase price and the
sales price. Given the cumulative distribution function for the
demand, we can now find the value of $n$ for which the cumulative
distribution function equals $1-c_0/c_1$. This value of $n$ is the
optimal initial inventory level.

The result is interesting and useful in its own right, but the primary
takeaway from this post is the explicit expression for the *expected
revenue*: because this expression can be formulated exactly, we could
find a analytical, closed-form solution to the problem.

This post has been adapted from a section in my book on Data Analysis. Look there for more details and further reading.