Novel accuracy test for multispectral imaging systems based on ΔE measurements

In the printing industry, multispectral line scan cameras are being applied with increasing frequency in print inspection. This field of application requires highly accurate camera systems. In this article, we describe a novel approach to determining the accuracy of multispectral measurements recorded by line scan cameras. The approach is based on Bayesian statistics and paves the way for inline applications. Our approach uses the distribution of color distances, as expressed by ΔE values, that arise when the reference color spectra of a color chart are compared with corresponding spectra reconstructed from the measured camera responses of observed color patches. By means of 18 ΔE values originating from a color control strip, our approach provides an accuracy evaluation of multispectral imaging systems with line scan technology. To demonstrate this, four scenarios are considered in which the multispectral imaging system is used with different measurement accuracies. It is shown that the imaging system in these cases can be reliably characterized with respect to the quality of the multispectral measurements.


Background
The use of camera systems to inspect printed materials is ubiquitous. More and more manufacturers are adopting optical solutions, since such solutions allow seamless monitoring of print jobs at high production speeds and ensure reliable detection of defects. Imaging systems that consist of a camera and a light source, and that provide accurate measurements, are a prerequisite for these purposes. Often, multispectral line scan cameras are used, since they combine spectral color measurements with high-speed inspection of continuous materials (for example, printed web). A simple method to ensure sufficient measurement quality of such imaging systems is to stop the production process from time to time to check the camera and the applied light source. To this end, a suitable color chart can be integrated into the production line and scanned by the camera. For this purpose, the color chart has to be conveyed under the camera. A comparison between the recorded camera responses and reference camera responses indicates whether the imaging system has to be recalibrated. However, this procedure is time-consuming and its implementation is technically complicated when considering line scan applications.
*Correspondence: sascha.eichstaedt@ptb.de, Physikalisch-Technische Bundesanstalt, Abbestraße 2-12, 10587 Berlin, Germany
In this paper, we propose and demonstrate an alternative method that can serve as a basis for evaluating the accuracy of multispectral line scan camera systems inline during the printing process. Camera responses from a few color patches arranged on a small strip are captured in full by a line scan imaging system, without having to move the color strip under the camera by means of a conveyor; these responses are used to derive corresponding reflectance spectra [1,2]. Here, the basic idea is to check whether the differences between these spectra and known reference spectra are acceptable or not. To quantify such deviations, we use the ΔE metric [3], which describes the distance between the colors associated with these spectra within the CIELAB space, a commonly used, device-independent colorimetric coordinate system.
Here, we develop a Bayesian approach which uses only the small number of colors on the control strip and the corresponding ΔE values to assess the accuracy of multispectral imaging systems. Bayesian inference is a well-established tool for data analysis [4-7] and has already found its way into the field of image processing, where Bayesian methods are frequently employed for spectral reflectance recovery and image restoration [2,8-11]. Because of the small number of colors utilized by our Bayesian method, the color control strip is small enough that it can be completely captured by the line scan camera. In this way, transport of the strip on a conveyor belt is avoided. Thus, this procedure does not influence the production process, and there is no need to stop the print job. We demonstrate the performance of our approach by considering four scenarios in which the multispectral imaging system is used with different measurement accuracies.
The remainder of this paper is organized as follows: the basic idea of our approach, together with the experimental setup, is described in "Basic concept and experimental setup" section, while "Statistical model" section is dedicated to the statistical model used. Results that demonstrate the suitability of the method proposed are presented and discussed in "Results and discussion" section. Conclusions follow in "Conclusion" section.

Basic concept and experimental setup
Our method for characterizing the accuracy of multispectral imaging systems is based on the ΔE metric, by means of which color differences can be quantified [3,12,13]. In colorimetry, it is common practice to specify colors in terms of special coordinates within a corresponding color space. Often, the Euclidean distance between two color coordinate points (or modifications of this distance) is used to express the difference between the colors associated with these points. Color coordinates can be calculated from the underlying reflectance spectrum of the considered colors [14]. This spectrum can be directly obtained, for example, by spectrophotometer measurements. Alternatively, multispectral camera measurements also make it possible to obtain such a spectral function by means of sophisticated reconstruction techniques [1,2].
The basic idea of our approach is to calculate a ΔE value for a given color on the color control strip by comparing the color coordinates of the reflectance spectrum previously retrieved from a spectrophotometer with the color coordinates of the reflectance spectrum obtained by means of a suitable multispectral imaging system. ΔE values derived in this way act as a measure of the accuracy of the imaging system; small ΔE values indicate that the camera measurements were recorded with high accuracy, while large values suggest a low measurement quality. Throughout the paper, we use the CIEDE2000 formula [3,12,13] to evaluate color differences; this formula is embedded in the CIELAB color space. Often, the symbol ΔE_00 for color differences is utilized to indicate this choice. For ease of presentation, we suppress the subscript of ΔE_00 in the following. CIELAB coordinates of the reflectance spectra are calculated using the CIE-1931 2° standard observer and assuming the CIE-D50 standard illuminant.
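The color-difference step described above can be illustrated with a short sketch. It assumes that the CIELAB coordinates of the reference and the reconstructed spectrum have already been computed (that conversion is not shown), and that the parametric weighting factors of CIEDE2000 are all set to 1. The function name `ciede2000` is chosen here for illustration only.

```python
import math


def ciede2000(lab1, lab2):
    """CIEDE2000 color difference between two CIELAB triples (L*, a*, b*).

    Assumes the parametric factors k_L = k_C = k_H = 1.
    """
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2

    # Rescale a* depending on the mean chroma of the two colors.
    c_bar = 0.5 * (math.hypot(a1, b1) + math.hypot(a2, b2))
    g = 0.5 * (1.0 - math.sqrt(c_bar**7 / (c_bar**7 + 25.0**7)))
    a1p, a2p = (1.0 + g) * a1, (1.0 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)

    # Hue angles in degrees, mapped to [0, 360).
    h1p = math.degrees(math.atan2(b1, a1p)) % 360.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360.0

    d_l = L2 - L1
    d_c = c2p - c1p
    if c1p * c2p == 0.0:
        # Hue is undefined for an achromatic color.
        d_h = 0.0
        h_bar = h1p + h2p
    else:
        d_h = h2p - h1p
        if d_h > 180.0:
            d_h -= 360.0
        elif d_h < -180.0:
            d_h += 360.0
        h_bar = 0.5 * (h1p + h2p)
        if abs(h1p - h2p) > 180.0:
            h_bar += 180.0 if h1p + h2p < 360.0 else -180.0
    d_big_h = 2.0 * math.sqrt(c1p * c2p) * math.sin(math.radians(0.5 * d_h))

    # Weighting functions for lightness, chroma and hue.
    l_bar = 0.5 * (L1 + L2)
    cp_bar = 0.5 * (c1p + c2p)
    t = (1.0
         - 0.17 * math.cos(math.radians(h_bar - 30.0))
         + 0.24 * math.cos(math.radians(2.0 * h_bar))
         + 0.32 * math.cos(math.radians(3.0 * h_bar + 6.0))
         - 0.20 * math.cos(math.radians(4.0 * h_bar - 63.0)))
    s_l = 1.0 + 0.015 * (l_bar - 50.0) ** 2 / math.sqrt(20.0 + (l_bar - 50.0) ** 2)
    s_c = 1.0 + 0.045 * cp_bar
    s_h = 1.0 + 0.015 * cp_bar * t

    # Rotation term coupling chroma and hue differences in the blue region.
    d_theta = 30.0 * math.exp(-(((h_bar - 275.0) / 25.0) ** 2))
    r_c = 2.0 * math.sqrt(cp_bar**7 / (cp_bar**7 + 25.0**7))
    r_t = -math.sin(math.radians(2.0 * d_theta)) * r_c

    tl, tc, th = d_l / s_l, d_c / s_c, d_big_h / s_h
    return math.sqrt(tl * tl + tc * tc + th * th + r_t * tc * th)
```

Calling `ciede2000(lab_reference, lab_camera)` for each of the 18 patches would yield the small ΔE sample used in the remainder of the paper; the two argument names are placeholders for the CIELAB triples of the spectrophotometer spectrum and the reconstructed camera spectrum.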
In practical terms, the color control strip should be small, especially when considering inline applications. To this end, we have constructed a strip containing only 18 color patches (see Fig. 1), and we propose an accuracy test utilizing these colors. The control strip was constructed in such a way that as many color patches as possible can be completely captured by the imaging system without having to move the strip under the line scan camera. Nonetheless, color patches should be large enough to ensure a sufficient averaging of camera responses over the pixels of the camera channels. This led to 18 patches in our experimental setup. The colors of the strip stem from a color chart designed and used by Chromasens GmbH. As illustrated in Fig. 1, this chart consists of an upper part with 15 color stripes, each composed of 22 identical color patches, and a lower part with 330 different color patches, where the colors of the stripes are not used. For our analysis, we took the 330 patches of the lower part together with 15 patches chosen from the middle of each stripe. This provides 345 different color patches, from which the colors of the control strip were randomly chosen. We assume that this selection can be considered to be a representative sample of the 345 colors. The color chart serves another purpose: it is crucial for the elaboration of our statistical model in "Bayesian analysis" section, since we also assume that the 345 colors reflect the color space of the printer to be inspected to a sufficient extent.
Fig. 1 Illustration of the problem. The task is to infer the ΔE distribution of a color chart (here with 345 different color patches) from a small sample (here with 18 values). ΔE values are obtained by comparing the reference color spectra of the color patches with the corresponding spectra obtained from multispectral measurements
The multispectral imaging system used for our measurements was developed by Chromasens GmbH and consists of a truePIXA line scan camera, which has 12 image channels, and a Corona II LED line scan illumination. The camera is arranged at 0° with respect to the surface normal, the light source at 45°. For measurements on image data such as the color chart considered here, the camera measurement spot is adjusted to approximate the aperture of a typical spectrophotometer, i.e., a circular shape with a diameter of 3 mm, and the pixel-wise camera responses are averaged within this region. The color control strip considered here has a rectangular shape with dimensions of about 10 mm × 0.127 mm. The 10 mm are given by the field of view and the number of patches, and the 0.127 mm correspond to the height of a pixel of the line scan sensor. This is a compromise between the number of patches and the number of pixels within a patch for averaging and outlier removal.
We apply the reconstruction method reported in [1] with a linear kernel in order to compute reflectance spectra from the camera response data. Reference reflectance spectra are recorded with a Konica Minolta FD-7 spectrophotometer. Camera response data are acquired with different measurement accuracies: we gradually reduce the measurement quality of the imaging system by changing the current of the LEDs of the light source from 100% to 70% in steps of 10% (relative to the current feed of the well-calibrated case, called scenario 1 below), thus simulating a potential technical defect of the system. For each current feed, we calculate ΔE values by comparing reference spectra from the color chart and from the control strip with corresponding spectra estimated from the camera data. In this way, four measurement scenarios are obtained, providing a large and a small ΔE sample with 345 and 18 values, respectively, in each case. As will be shown in the following, the large samples are used for the elaboration of our statistical model, while the small samples are used to demonstrate the suitability of our approach.

Statistical model
Probability density function
Our statistical model is based on three basic assumptions: (1) for each measurement scenario, ΔE values are independent and identically distributed (i.i.d.); (2) samples of the color chart with 345 different patches reflect the overall population (derived from all printable colors) sufficiently well; and (3) samples of the color control strip, in turn, are representative of the color chart. Moreover, we impose the following requirements on the probability density function (pdf) sought for ΔE. First, it has to be a continuous probability distribution defined on the positive half-line because ΔE can only take positive values. Second, the pdf should contain only a small number of parameters to control its shape and scale. This restriction ensures a fast numerical treatment in "Results and discussion" section. Despite its small number of control parameters, the pdf sought ought to describe the distribution of the ΔE values to a satisfactory extent.
We found that the log-logistic distribution is particularly well-suited for our purpose. The log-logistic distribution is often applied in survival analysis [15] and in economics to model the distribution of income [16]. Its pdf is given by
$$f(\Delta E \mid \alpha, \beta) = \frac{(\beta/\alpha)\,(\Delta E/\alpha)^{\beta-1}}{\left[1 + (\Delta E/\alpha)^{\beta}\right]^{2}}, \qquad (1)$$
where α > 0 denotes the scale parameter and β > 0 the shape parameter. Figure 2 shows log-logistic pdfs (dashed lines) obtained from a maximum likelihood estimation (MLE) [17] of their parameters for the measurement situations considered. In each case, parameter estimation is based on the assigned ΔE sample with 345 values. Additionally, corresponding histograms are given. We observe in Fig. 2 that the log-logistic pdfs are in good agreement with the histograms. Additionally, we performed a Kolmogorov-Smirnov goodness-of-fit test [18] to examine whether the measured data conform to the log-logistic distribution. To this end, the full sample of 345 ΔE values was randomly divided into two subsamples, one with 173 and the other with 172 values. The subsample with 173 values was used to estimate the parameters of the log-logistic distribution by means of MLE, while the second subsample served as input data for the Kolmogorov-Smirnov test, where the pre-estimated parameters were used. We repeated this procedure 10
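The split-sample procedure (fit on 173 values, test on the remaining 172) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the paper: the data are synthetic draws from a log-logistic distribution with the scenario 1 parameters from Fig. 2, the maximum likelihood fit uses a simple profile-likelihood search written for this example, and the 5 % decision uses the asymptotic critical value 1.358/√n. All function names are hypothetical.

```python
import math
import random


def loglogistic_pdf(x, alpha, beta):
    """Log-logistic density with scale alpha > 0 and shape beta > 0 (x > 0)."""
    r = (x / alpha) ** beta
    return (beta / x) * r / (1.0 + r) ** 2


def loglogistic_cdf(x, alpha, beta):
    """Cumulative distribution function Pr(X <= x)."""
    return 1.0 / (1.0 + (x / alpha) ** (-beta))


def loglogistic_sample(n, alpha, beta, rng):
    """Draw n values by inverting the CDF: x = alpha * (u / (1 - u))**(1 / beta)."""
    out = []
    for _ in range(n):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep u inside (0, 1)
        out.append(alpha * (u / (1.0 - u)) ** (1.0 / beta))
    return out


def fit_loglogistic(data, iters=40):
    """Maximum likelihood estimate (alpha, beta) via a profile likelihood in beta."""
    y = [math.log(x) for x in data]
    n = len(y)

    def best_mu(beta):
        # For fixed beta, mu = ln(alpha) solves sum_i sigmoid(beta * (y_i - mu)) = n / 2.
        # The left-hand side decreases in mu, so bisection is sufficient.
        lo, hi = min(y), max(y)
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            s = sum(0.5 * (1.0 + math.tanh(0.5 * beta * (yi - mid))) for yi in y)
            if s > 0.5 * n:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    def profile_loglik(log_beta):
        beta = math.exp(log_beta)
        mu = best_mu(beta)
        total = n * math.log(beta)
        for yi in y:
            z = beta * (yi - mu)
            # ln f = ln(beta) - y + z - 2 ln(1 + e^z), in a numerically stable form
            total += z - yi - 2.0 * (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
        return total

    # Ternary search over ln(beta); the profile likelihood is assumed unimodal.
    lo, hi = math.log(0.1), math.log(50.0)
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if profile_loglik(m1) < profile_loglik(m2):
            lo = m1
        else:
            hi = m2
    beta = math.exp(0.5 * (lo + hi))
    return math.exp(best_mu(beta)), beta


def ks_statistic(data, alpha, beta):
    """Kolmogorov-Smirnov distance between the empirical CDF and the model CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = loglogistic_cdf(x, alpha, beta)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


rng = random.Random(0)
sample = loglogistic_sample(345, 0.3483, 2.5797, rng)  # synthetic stand-in data
fit_part, test_part = sample[:173], sample[173:]
alpha_hat, beta_hat = fit_loglogistic(fit_part)
d = ks_statistic(test_part, alpha_hat, beta_hat)
critical = 1.358 / math.sqrt(len(test_part))  # asymptotic 5 % critical value
print(f"alpha = {alpha_hat:.4f}, beta = {beta_hat:.4f}, D = {d:.4f}, reject = {d > critical}")
```

Fitting on one half and testing on the other avoids evaluating the test on the same data that determined the parameters. Repeating the split with fresh random partitions, as described above, gives an impression of how often the log-logistic hypothesis would be rejected.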

Bayesian analysis
During the printing process, camera response data often arise from only a small fraction of the entire color space of the applied printer, making probabilistic statements about the accuracy of the applied camera system difficult. In the following, we develop a Bayesian treatment based on the log-logistic distribution that makes such an accuracy assessment possible by using only a small number of colors. In our case, the likelihood function for the sample of the 18 ΔE values ΔE_1, …, ΔE_18 from the color control strip can be written (under the i.i.d. assumption) as a product of the individual log-logistic pdfs f(ΔE_i | α, β) as follows:
$$L(\Delta E_1, \ldots, \Delta E_{18} \mid \alpha, \beta) = \prod_{i=1}^{18} f(\Delta E_i \mid \alpha, \beta). \qquad (2)$$
For the prior distribution of the parameters α and β, we apply the corresponding (non-informative) Jeffreys prior [19,20]
$$\pi(\alpha, \beta) \propto \frac{1}{\alpha}. \qquad (3)$$
According to Bayes' theorem, the likelihood function, together with the prior, leads to the posterior distribution of α and β given ΔE_1, …, ΔE_18,
$$p(\alpha, \beta \mid \Delta E_1, \ldots, \Delta E_{18}) = \frac{1}{C}\, \pi(\alpha, \beta)\, L(\Delta E_1, \ldots, \Delta E_{18} \mid \alpha, \beta), \qquad (4)$$
where the normalization constant is
$$C = \int_0^\infty \!\! \int_0^\infty \pi(\alpha, \beta)\, L(\Delta E_1, \ldots, \Delta E_{18} \mid \alpha, \beta)\, \mathrm{d}\alpha\, \mathrm{d}\beta. \qquad (5)$$
The posterior (4) expresses the probability density that the distribution of ΔE obeys a log-logistic pdf with parameters α and β, given the measured data ΔE_1, …, ΔE_18 from the color control strip.
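The form of the Jeffreys prior for the log-logistic distribution can be made plausible with a short derivation. The sketch below relies on two standard facts that are not spelled out in the text: ln ΔE follows a logistic distribution with location μ = ln α and scale s = 1/β, and for a symmetric location-scale family the Fisher information is diagonal and proportional to 1/s². The constants a and b are placeholders that do not depend on the parameters.

```latex
\begin{align*}
I(\mu, s) &= \frac{1}{s^{2}}
  \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}
  \quad\Longrightarrow\quad
  \pi(\mu, s) \propto \sqrt{\det I(\mu, s)} \propto \frac{1}{s^{2}}, \\[4pt]
\pi(\alpha, \beta) &= \pi(\mu, s)\,
  \left| \frac{\partial(\mu, s)}{\partial(\alpha, \beta)} \right|
  \propto \beta^{2} \cdot \frac{1}{\alpha} \cdot \frac{1}{\beta^{2}}
  = \frac{1}{\alpha}.
\end{align*}
```

Under these assumptions the prior depends only on the scale parameter α, so the shape parameter β enters the posterior solely through the likelihood.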
An advantage of the Bayesian approach is that probabilistic statements can be made conditional on the observed data. An important quantity in our case is the probability P that, for the ΔE distribution of the 345 patches of the color chart, at least a proportion P* of the population has ΔE values below or equal to a certain threshold ΔE_th, mathematically expressed by
$$\Pr(\Delta E \le \Delta E_{\mathrm{th}} \mid \alpha, \beta) \ge P^{*}. \qquad (6)$$
A larger P* leads to a stricter requirement on the quality of the multispectral measurements. In the following, we restrict ourselves to ΔE_th = 1. This choice is motivated by the fact that ΔE values smaller than 1 are often required in real applications. For given data ΔE_1, …, ΔE_18, the probability P can be calculated by
$$P = \iint_{D_{P^{*}}} p(\alpha, \beta \mid \Delta E_1, \ldots, \Delta E_{18})\, \mathrm{d}\alpha\, \mathrm{d}\beta. \qquad (7)$$
The integration is performed over the region D_{P*} in the (α, β) parameter space, in which the underlying pdfs for ΔE defined by the corresponding (α, β) pairs fulfill constraint (6). This shows the merit of the Bayesian concept: our approach takes into account all pdfs constructed from the 18 ΔE values that are compatible with the desired measurement accuracy. The green area in Fig. 3 indicates the domain of integration for the example P* = 0.9.
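Because the log-logistic distribution has a closed-form cumulative distribution function, the boundary of the integration region can be written explicitly. The following worked step is a sketch that assumes ΔE_th = 1 and P* > 1/2, as in the examples considered here.

```latex
\begin{align*}
\Pr(\Delta E \le 1 \mid \alpha, \beta)
  &= \frac{1}{1 + \alpha^{\beta}} \;\ge\; P^{*}
  \quad\Longleftrightarrow\quad
  \alpha^{\beta} \le \frac{1 - P^{*}}{P^{*}}, \\[4pt]
D_{P^{*}} &= \left\{ (\alpha, \beta) \;:\; 0 < \alpha < 1,\;
  \beta \ge \frac{\ln\!\big(P^{*}/(1 - P^{*})\big)}{\ln(1/\alpha)} \right\}.
\end{align*}
```

For P* = 0.9 and α = 0.3483, for example, the condition reads β ≥ ln 9 / ln(1/0.3483) ≈ 2.08. The boundary rises as P* increases, so a stricter requirement shrinks the region from below.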
For the numerical treatment of the double integral in Eq. (7), we divide the (α, β) parameter space into a grid with a grid width of 5 × 10⁻⁴ for α and 5 × 10⁻³ for β. This choice makes it possible to achieve both acceptable computing times and reliable integration results in "Results and discussion" section. Additionally, we generate a matrix indicating grid nodes with 1 if they belong to the region D_{P*} (i.e., the associated pdfs fulfill Pr(ΔE ≤ 1) ≥ P* there) and 0 otherwise. The element-wise multiplication of this matrix by the posterior matrix (containing the posterior values for each grid node) ensures the correct numerical evaluation of Eq. (7).
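The grid evaluation can be sketched in a few lines of plain Python. Several choices below are assumptions made for illustration: the prior is taken to be proportional to 1/α, the grid is much coarser than the one stated above (steps of 0.01 in α and 0.1 in β, for speed), the parameter ranges are truncated, and the 18 ΔE values are invented rather than measured. The function names are hypothetical.

```python
import math


def log_posterior(alpha, beta, log_de):
    """Unnormalized log posterior: log-logistic likelihood times an assumed 1/alpha prior."""
    mu = math.log(alpha)
    total = -mu + len(log_de) * math.log(beta)
    for y in log_de:
        z = beta * (y - mu)
        # ln f = ln(beta) - y + z - 2 ln(1 + e^z), in a numerically stable form
        total += z - y - 2.0 * (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return total


def posterior_grid(delta_e, alphas, betas):
    """Posterior values on the grid, normalized so that all nodes sum to 1."""
    log_de = [math.log(v) for v in delta_e]
    logp = [[log_posterior(a, b, log_de) for b in betas] for a in alphas]
    peak = max(max(row) for row in logp)
    post = [[math.exp(v - peak) for v in row] for row in logp]
    norm = sum(sum(row) for row in post)
    return [[v / norm for v in row] for row in post]


def in_region(alpha, beta, p_star, de_th=1.0):
    """Indicator of the region where Pr(Delta E <= de_th) >= p_star."""
    return 1.0 / (1.0 + (de_th / alpha) ** (-beta)) >= p_star


def probability(post, alphas, betas, p_star):
    """Sum of posterior mass over grid nodes inside the region (masked sum)."""
    return sum(post[i][j]
               for i, a in enumerate(alphas)
               for j, b in enumerate(betas)
               if in_region(a, b, p_star))


# Coarse, truncated grid (an assumption of this sketch).
alphas = [0.01 * k for k in range(1, 201)]  # 0.01 ... 2.00
betas = [0.1 * k for k in range(1, 121)]    # 0.1 ... 12.0

# Invented Delta E values standing in for an accurate and a degraded system.
good = [0.18, 0.22, 0.25, 0.27, 0.29, 0.31, 0.33, 0.35, 0.37,
        0.40, 0.43, 0.46, 0.50, 0.55, 0.61, 0.70, 0.85, 1.10]
bad = [3.0 * v for v in good]

post_good = posterior_grid(good, alphas, betas)
post_bad = posterior_grid(bad, alphas, betas)
p_good = {p: probability(post_good, alphas, betas, p) for p in (0.8, 0.85, 0.9)}
p_bad = {p: probability(post_bad, alphas, betas, p) for p in (0.8, 0.85, 0.9)}
print(p_good)
print(p_bad)
```

On a uniform grid the cell area cancels between the masked sum and the normalization, which is why the sketch can work with plain sums instead of explicit integration weights. The posterior is computed once per data set and reused for every value of P*.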

Results and discussion
We consider four scenarios with different accuracies of multispectral measurements, as described in "Basic concept and experimental setup" section. Since our color control strip consists of 18 patches, we obtain 18 ΔE values in each case, which are visualized in Fig. 4 as a strip plot. As expected, the ΔE values become larger with decreasing accuracy of the imaging system.
We have calculated the posterior given in (4) for the four cases. Figure 5a-d displays the isolines of the posterior distributions in the (α, β) plane. In the first scenario, the posterior is fairly localized in the (α, β) plane, as can be seen in Fig. 5a. With decreasing quality of the multispectral measurements, the posterior drops in height and broadens in width in such a way that its normalization is retained. Moreover, its maximum moves to larger α values.
From the posteriors shown in Fig. 5a-d we are able to calculate, for each measurement scenario, the probabilities P that the corresponding ΔE distribution of the 345 patches of the color chart fulfills the constraints specified by P* [see Eqs. (6) and (7)]. The dashed lines in the figures indicate the area of integration D_{P*} for P* = 0.8, 0.85, and 0.9. Findings for the probability P are given in Fig. 5a-d. As expected, with increasing P*, probability values P become smaller for a fixed measurement scenario. This can be explained by looking at the dashed and dotted lines in the figures, which specify the border of D_{P*}. The border moves upwards and consequently reduces the domain of integration if P* is increased. In other words, with increasing P*, a decreasing number of pdfs fulfills the required accuracy expressed by the P* constraint and contributes to the calculation of P. Very small probabilities are obtained for the last two cases (see Fig. 5c and d), while large probability values are calculated for the first two scenarios (see Fig. 5a and b). This is because of the posterior movement mentioned above. In Fig. 5a and b, the posterior is almost entirely located in the domain of integration (above the dashed and dotted lines), whereas in Fig. 5c most of the posterior, and in Fig. 5d almost the entire posterior, is outside this area (below the dashed and dotted lines). Finally, the conclusion can be drawn that scenarios 1 and 2 are very probably in accordance with the required measurement accuracies specified by P* = 0.8, 0.85, and 0.9. However, scenarios 3 and 4 are highly likely to fall short of the desired measurement quality. Moreover, the calculated P values in Fig. 5, which decrease from scenario 1 to 4, support the conclusion that our approach allows a successful classification of the considered cases with respect to the measurement quality.
Future research may extend these investigations and explore the robustness of the concept with respect to the influence of printing material roughness, print finish, and printing technology, or to situations in which changes are uncorrelated and restricted to single colors only.

Conclusion
Camera systems are being used in increasing numbers to inspect the quality of printed products, as the application of such systems entails many advantages, such as seamless and fast inline monitoring of an entire print job. However, measurements with high accuracy are a prerequisite for this purpose. In this paper, we have developed a Bayesian method that allows the accuracy of multispectral line scan camera measurements to be checked and a malfunction of the imaging system to be detected. A central aspect of our approach is the distribution of color distances, expressed by ΔE; these distances arise when the reference color spectra of a color chart are compared with corresponding spectra obtained from the measured camera responses of the observed color patches. We have shown that, by means of 18 ΔE values derived from a control strip with 18 color patches, our Bayesian treatment furnishes a probabilistic evaluation of the camera system's accuracy.
We have tested our method by considering four scenarios in which the accuracy of multispectral measurements was gradually reduced by changing the current feed of the LEDs of the applied light source in order to simulate a potential technical defect of the system. It was possible to reliably evaluate the four cases with respect to the quality of the multispectral measurements. For our examples, log-logistic probability density functions were used in the Bayesian treatment, but our approach is not restricted to this particular probability density function.
The chosen probability density function assumes that the measurement system operates under repeatable color generation conditions. When this assumption is violated, for example, when single colors are drifting or when the measurement system changes slowly, the proposed concept could still be applied provided that the statistical model is augmented accordingly.

Fig. 2 Representation of the four data sets (containing 345 ΔE values in each case) as histograms together with corresponding log-logistic distributions resulting from a maximum likelihood estimation of their parameters: (a) scenario 1 with α = 0.3483 and β = 2.5797; (b) scenario 2 with α = 0.4660 and β = 3.5907; (c) scenario 3 with α = 0.6546 and β = 4.3827; (d) scenario 4 with α = 0.8712 and β = 4.7626

Fig. 3 Part of the two-dimensional parameter space of the log-logistic distribution. The green-colored domain D_{P*} represents pdfs for ΔE with parameters α and β which fulfill Pr(ΔE ≤ 1) ≥ P*. Here, P* = 0.9 was chosen

Fig. 4 Visualization of the ΔE samples (each with 18 values) for the four measurement scenarios