Probability density function
Our statistical model is based on three basic assumptions: (1) for each measurement scenario, the ΔE values are independent and identically distributed (i.i.d.); (2) samples of the color chart with 345 different patches reflect the overall population (derived from all printable colors) sufficiently well; and (3) samples of the color control strip, in turn, are representative of the color chart. Moreover, we impose the following requirements on the probability density function (pdf) sought for ΔE. First, it has to be a continuous probability distribution defined on the positive half-line because ΔE can only take positive values. Second, the pdf should contain only a small number of parameters to control its shape and scale. This restriction ensures a fast numerical treatment in the “Results and discussion” section. Despite its small number of control parameters, the pdf sought ought to describe the distribution of the ΔE values to a satisfactory extent.
We found that the log-logistic distribution is particularly well suited for our purpose. The log-logistic distribution is often applied in survival analysis [15] and in economics to model the distribution of income [16]. Its pdf is given by
$$ f\left(\Delta E|\alpha,\beta\right) = \frac{\beta\alpha^{\beta}\Delta E^{\beta-1}}{\left(\alpha^{\beta}+\Delta E^{\beta}\right)^{2}}\, $$
(1)
where α>0 denotes the scale parameter and β>0 the shape parameter. Figure 2 shows log-logistic pdfs (dashed lines) obtained from a maximum likelihood estimation (MLE) [17] of their parameters for the measurement situations considered. In each case, parameter estimation is based on the assigned ΔE sample with 345 values. The corresponding histograms are also given. We observe in Fig. 2 that the log-logistic pdfs are in good agreement with the histograms. In addition, we performed a Kolmogorov-Smirnov goodness-of-fit test [18] to examine whether the measured data conform to the log-logistic distribution. To this end, the full sample of 345 ΔE values was randomly divided into two subsamples, one with 173 and the other with 172 values. The subsample with 173 values was used to estimate the parameters of the log-logistic distribution by means of MLE, while the second subsample served as input data for the Kolmogorov-Smirnov test, where the pre-estimated parameters were used. We repeated this procedure $10^{4}$ times and calculated the mean p values for scenarios 1–4, which are 0.410, 0.301, 0.292, and 0.262, respectively. These comparatively large p values do not refute our assumption that the data are consistent with the chosen distribution.
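The split-sample procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses SciPy's `fisk` distribution (SciPy's name for the log-logistic distribution, with shape `c` playing the role of β and `scale` the role of α), the function name `split_sample_ks` is hypothetical, and the data are synthetic stand-ins rather than the measured ΔE values.

```python
import numpy as np
from scipy import stats

def split_sample_ks(delta_e, n_repeats=100, seed=0):
    """Mean KS p value over random fit/test splits of a Delta E sample."""
    rng = np.random.default_rng(seed)
    delta_e = np.asarray(delta_e, dtype=float)
    n_fit = (delta_e.size + 1) // 2  # 173 of 345 values, as in the text
    p_values = []
    for _ in range(n_repeats):
        perm = rng.permutation(delta_e)
        fit_part, test_part = perm[:n_fit], perm[n_fit:]
        # MLE of shape (beta) and scale (alpha); the location is fixed at 0
        beta_hat, _, alpha_hat = stats.fisk.fit(fit_part, floc=0)
        # KS test of the held-out half against the pre-estimated distribution
        result = stats.kstest(test_part, "fisk", args=(beta_hat, 0, alpha_hat))
        p_values.append(result.pvalue)
    return float(np.mean(p_values))

# Synthetic stand-in data (assumption: NOT the measurements from the study)
demo = stats.fisk.rvs(c=3.0, scale=0.5, size=345, random_state=1)
mean_p = split_sample_ks(demo, n_repeats=50)
print(f"mean p value over 50 splits: {mean_p:.3f}")
```

The number of repetitions is kept small here for speed; it can be raised through `n_repeats`.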
Bayesian analysis
During the printing process, camera response data often arise from only a small fraction of the entire color space of the applied printer, making probabilistic statements about the accuracy of the applied camera system difficult. In the following, we develop a Bayesian treatment based on the log-logistic distribution that makes such an accuracy assessment possible by using only a small number of colors. In our case, the likelihood function for the sample of the 18 ΔE values $\Delta E_{1},\ldots,\Delta E_{18}$ from the color control strip can be written (under the i.i.d. assumption) as a product of the individual log-logistic pdfs as follows:
$$ l\left(\alpha,\beta;\Delta E_{1},\ldots,\Delta E_{18}\right) = \prod_{i=1}^{18}f\left(\Delta E_{i}|\alpha,\beta\right)\,. $$
(2)
For the prior distribution of the parameters α and β, we apply the corresponding (non-informative) Jeffreys prior [19, 20]
$$ \pi\left(\alpha,\beta\right) \propto \frac{1}{\alpha}\,. $$
(3)
According to Bayes’ theorem, the likelihood function, together with the prior, leads to the posterior distribution
$$ \pi\left(\alpha,\beta|\Delta E_{1},\ldots,\Delta E_{18}\right) = C^{-1}\,l\left(\alpha,\beta;\Delta E_{1},\ldots,\Delta E_{18}\right)\,\pi\left(\alpha,\beta\right) $$
(4)
of α and β given $\Delta E_{1},\ldots,\Delta E_{18}$, where the normalization constant is
$$ C = \int_{0}^{\infty}\int_{0}^{\infty} l\left(\alpha,\beta;\Delta E_{1},\ldots,\Delta E_{18}\right)\,\pi\left(\alpha,\beta\right)\, \mathrm{d}\alpha\, \mathrm{d}\beta\,. $$
(5)
The posterior (4) expresses the probability density that the distribution of ΔE obeys a log-logistic pdf with parameters α and β, given the measured data $\Delta E_{1},\ldots,\Delta E_{18}$ from the color control strip.
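As a minimal sketch of Eqs. (2)–(4), the unnormalized log posterior can be evaluated pointwise as shown below. Working in log space is a choice made here for numerical stability and is not taken from the text. The function name `log_posterior_unnorm` and the synthetic 18-value sample are assumptions for illustration, and SciPy's `fisk` distribution is used for the log-logistic pdf (shape `c` = β, `scale` = α).

```python
import numpy as np
from scipy import stats

def log_posterior_unnorm(alpha, beta, delta_e):
    """log[l(alpha, beta; data) * pi(alpha, beta)], i.e. Eq. (4) without C."""
    if alpha <= 0 or beta <= 0:
        return -np.inf  # outside the parameter space
    # Eq. (2): the product of pdfs becomes a sum of log pdfs
    log_lik = np.sum(stats.fisk.logpdf(delta_e, c=beta, scale=alpha))
    # Eq. (3): Jeffreys prior proportional to 1/alpha
    log_prior = -np.log(alpha)
    return log_lik + log_prior

# Synthetic stand-in for the 18 control-strip values (assumption, not real data)
demo = stats.fisk.rvs(c=4.0, scale=0.3, size=18, random_state=0)
print(log_posterior_unnorm(0.3, 4.0, demo))
```

The constant C of Eq. (5) is omitted because it does not depend on α and β; it only has to be restored when the posterior is integrated.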
An advantage of the Bayesian approach is that probabilistic statements can be made conditional on the observed data. An important quantity in our case is the probability P that at least a proportion $P^{*}$ of the ΔE population represented by the 345 patches of the color chart has values of ΔE below or equal to a certain threshold $\Delta E_{\text{th}}$, mathematically expressed by
$$ \text{Pr}\left(\Delta E\leq \Delta E_{\text{th}}\right)\geq P^{*}\,. $$
(6)
A larger $P^{*}$ leads to a stricter requirement on the quality of the multispectral measurements. In the following, we restrict ourselves to $\Delta E_{\text{th}}=1$. This choice is motivated by the fact that ΔE values smaller than 1 are often required in real applications. For given data $\Delta E_{1},\ldots,\Delta E_{18}$, the probability P can be calculated by
$$ P = \iint_{D_{P^{*}}} \pi\left(\alpha,\beta|\Delta E_{1},\ldots,\Delta E_{18}\right) \, \mathrm{d}\alpha\, \mathrm{d}\beta\,. $$
(7)
The integration is performed over the region $D_{P^{*}}$ in the (α,β) parameter space, in which the underlying pdfs for ΔE defined by the corresponding (α,β) pairs fulfill constraint (6). This shows the merit of the Bayesian concept. Our Bayesian approach takes into account all pdfs constructed from the 18 ΔE values that are compatible with the desired measurement accuracy. The green area in Fig. 3 indicates the domain of integration for the example $P^{*}=0.9$.
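For the log-logistic pdf (1), the region of integration can also be written in closed form. The following short derivation is an added illustration that follows from integrating Eq. (1); it is not quoted from the original analysis. The cumulative distribution function is

$$ F\left(\Delta E|\alpha,\beta\right) = \int_{0}^{\Delta E} f\left(t|\alpha,\beta\right)\,\mathrm{d}t = \frac{\Delta E^{\beta}}{\alpha^{\beta}+\Delta E^{\beta}}\,, $$

so that, for the threshold $\Delta E_{\text{th}}=1$, constraint (6) becomes

$$ \text{Pr}\left(\Delta E\leq 1\right) = \frac{1}{1+\alpha^{\beta}} \geq P^{*} \quad\Longleftrightarrow\quad \alpha^{\beta} \leq \frac{1-P^{*}}{P^{*}}\,. $$

For $P^{*}=0.9$, this reads $\alpha^{\beta}\leq 1/9$, which requires α<1 and

$$ \beta \geq \frac{\ln 9}{\ln\left(1/\alpha\right)}\,. $$

As a quick check, α=0.5 gives β≥3.17 (rounded).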
For the numerical treatment of the double integral in Eq. (7), we discretize the (α,β) parameter space on a grid with a grid width of $5\times10^{-4}$ for α and $5\times10^{-3}$ for β. This choice makes it possible to achieve both acceptable computing times and reliable integration results in the “Results and discussion” section. Additionally, we generate a matrix that marks grid nodes with 1 if they belong to the region $D_{P^{*}}$ (i.e., the associated pdfs fulfill $\text{Pr}(\Delta E\leq 1)\geq P^{*}$ there) and 0 otherwise. Multiplying this matrix element-wise by the posterior matrix (containing the posterior values for each grid node) restricts the numerical evaluation of Eq. (7) to $D_{P^{*}}$.
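The grid evaluation described above can be sketched with NumPy as follows. This is a minimal illustration, not the implementation used for the results. The function name `prob_requirement_met`, the truncation limits `alpha_max` and `beta_max` (the text does not state the extent of the grid), and the synthetic data are all assumptions, and the demo call uses a coarser grid than the widths quoted above to keep it quick. Posterior mass outside the truncated grid is ignored.

```python
import numpy as np

def prob_requirement_met(delta_e, p_star, alpha_max, beta_max,
                         d_alpha=5e-4, d_beta=5e-3, threshold=1.0):
    """Approximate Eq. (7): posterior mass of the region where
    Pr(Delta E <= threshold) >= p_star, on a uniform (alpha, beta) grid."""
    x = np.asarray(delta_e, dtype=float)
    n = x.size
    alpha = np.arange(d_alpha, alpha_max + d_alpha / 2, d_alpha)
    beta = np.arange(d_beta, beta_max + d_beta / 2, d_beta)
    A, B = np.meshgrid(alpha, beta, indexing="ij")
    log_a = np.log(A)
    # Log-likelihood from Eqs. (1) and (2), accumulated one data point at a time
    log_post = n * np.log(B) + n * B * log_a + (B - 1) * np.sum(np.log(x))
    for xi in x:
        log_post -= 2 * np.logaddexp(B * log_a, B * np.log(xi))
    log_post -= log_a  # Jeffreys prior, Eq. (3)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()  # discrete analogue of dividing by C, Eq. (5)
    # 0/1 matrix of the region: 1 / (1 + (alpha/threshold)^beta) >= p_star
    mask = B * (log_a - np.log(threshold)) <= np.log((1 - p_star) / p_star)
    return float(np.sum(post * mask))

# Synthetic stand-in for 18 control-strip values (assumption, not real data)
rng = np.random.default_rng(0)
u = rng.uniform(size=18)
demo = 0.3 * (u / (1 - u)) ** (1 / 4.0)  # log-logistic via inverse cdf

p_90 = prob_requirement_met(demo, 0.9, alpha_max=2.0, beta_max=20.0,
                            d_alpha=5e-3, d_beta=5e-2)
print(f"P for P* = 0.9: {p_90:.3f}")
```

Because the grid is uniform, the cell area cancels when the posterior matrix is normalized by its sum, so the masked sum approximates the double integral directly.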