Stimulated Raman scattering simulation for imaging optimization

Two simulation programs of a stimulated Raman scattering microscopy (SRS) imaging system with lock-in amplifier (LIA) detection were developed. SRS is an imaging technique based on the vibrational Raman cross-section as the contrast mechanism and enables fast, label-free imaging. Most SRS implementations are based on LIA detection of a modulated signal. However, building and operating such SRS set-ups still poses a challenge when selecting the LIA parameter settings for optimized acquisition speed or image quality. Moreover, the type of sample, e.g. a sparse sample vs. a densely packed sample, the required resolution as well as the Raman cross-section and the laser powers affect the parameter choice. A simulation program was used to find these optimal parameters. The focal spot diameters of the individual lasers (pump and Stokes) were used to estimate the effective SRS signal focal spot and the (optical) spatial resolution. By calibrating the signal and noise propagation through an SRS system for a known molecule, we estimated the signal and noise input to the LIA. We used a low pass filter model to simulate the LIA behavior in order to find the optimal parameters (i.e. filter order and time constant). Optimization was done for either image quality (expressed as contrast to noise ratio) or acquisition time. The targeted object size was first determined as a measure for the required resolution. The simulation output consisted of the LIA parameters, pixel dwell time and contrast to noise ratio. In a second simulation we evaluated SRS imaging based on the same principles as the optimal setting simulation, i.e. the signals were propagated through an imaging system and LIA detection. The simulated images were compared to experimental SRS images of polystyrene beads. Finally, the same software was used to simulate multiplexed SRS imaging. In this study we modeled a six-channel frequency-encoded multiplexed SRS system demodulated with six LIA channels. We evaluated the inter-channel crosstalk as a function of chosen LIA parameters, which in multiplex SRS imaging also needs to be considered. These programs to optimize the contrast to noise ratio, acquisition speed, resolution and crosstalk will be useful for operating stimulated Raman scattering imaging setup, as well as for designing novel setups.


Introduction
Stimulated Raman scattering microscopy (SRS) is a nonlinear imaging modality with a spectral response similar to spontaneous Raman scattering. Compared with spontaneous Raman, SRS offers an increase in mapping speed of several orders of magnitude [1,2]. Unlike spontaneous Raman, SRS is not affected by background fluorescence. SRS is realized by two laser beams that directly probe a specific vibrational state of a molecule if their photon energy difference matches the energy of a vibrational transition. Then the SRS signal can be detected as a gain in intensity of the low energy beam (stimulated Raman gain of Stokes) and a loss in the high energy beam intensity (stimulated Raman loss of pump) [3].
SRS, as a fast imaging modality, was first demonstrated by Freudiger et al. [2] and others [4,5] with the introduction of an amplitude modulation of one of the laser beams and demodulation by a lock-in amplifier (LIA) for detection and noise filtering. Modulation at high frequency allowed the rejection of the lasers' relative intensity noise (RIN) which dominates at frequencies below 1 MHz. A combination of a high modulation frequency, where the laser noise is low, and laser powers at which the shot-noise surpasses the thermal noise, results in shot-noise limited detection. After the probe beam is detected (e.g. the pump beam in stimulated Raman loss), demodulation of the signal takes place by means of a LIA or alternatively home built electronics (e.g. tuned amplifier [6]). Throughout this paper, we choose not to use the common convention of Stokes and pump beams to keep the discussion general. Instead, we refer to the amplitude modulated beam as the carrier, as it carries the amplitude modulation, and to the detected beam as probe.
While the modulation frequency is mentioned as an important parameter in most studies, only a few studies [6][7][8] discussed the (low pass) filter bandwidth, which is an important parameter that affects the acquisition speed and noise. Specifically for the LIA in the context of SRS, to the best of our knowledge no in-depth analysis has been reported for the input parameters of the filter bandwidth. While most SRS studies mention the time constant (TC), almost none [9] mention the equally important filter-order (n), which makes it impossible to calculate the bandwidth or compare modalities reported in the literature. Since the conventional imaging manner of non-linear modalities, such as SRS, is scanning the focal spot across the sample, the scanning speed, the filter response, the focal spot size, and the spatial resolution in the scanning direction are related. Thus, for an object of a given size and with the theory of the LIA filter response for a given bandwidth, one can extract the optimal settings needed for imaging this object with SRS, either with regard to the acquisition speed or to an image quality metric. For example, image quality metrics could be signal to noise ratio (SNR), contrast to noise ratio (CNR) [10][11][12], equivalent number of looks, edge preservation index or mean structural similarity index [13,14]. In this paper, we use CNR as the image quality metric, as it is a quantification of the ability to distinguish a signal from the background.
Here, we describe the operation principle of LIAs, and investigate the signal and noise propagation in the time and frequency domain. Then, we link this to the physical properties of an SRS setup, such as laser powers, focal spot size and pixel dwell time (PDT), in order to optimize the SRS imaging. The optimization procedure is compiled into a simulation program in order to address different imaging scenarios: optimization for CNR or imaging time in a sparse or tightly packed sample. A detailed description of the input and output parameters of the software's user interface is given in the optimal settings simulation section and in the supplementary material.
Furthermore, we generalize this result in order to demonstrate extended capabilities of this analysis. An additional simulation with a visual output (called imaging simulation) was created, based on the same LIA theory and input parameters. This simulation provides a qualitative image visualization, either for narrowband SRS, or for a multiplex SRS setup that has multiple modulated bands and multiple detection channels. As a test case, we analyze a simulated frequency channel encoded multiplex SRS setup as demonstrated before [15,16], however in our simulation the signal is demodulated with a multi-channel LIA. Here, crosstalk between channels is also treated, in addition to resolution, CNR and acquisition speed.

Theory: parameters of the lock-in amplifier
The SRS signal is a small modulation carried on a large background. This large DC background is the source for the laser RIN and shot noise. In the LIA this large DC background is filtered out, and the small modulation is mixed with a reference frequency. Then, the signal amplitude is recovered by applying a low-pass filter, here modeled as a discrete time RC filter, which is an exponential running average filter [17]: Where k is the index of the processed sample X out at specified discrete sampling times T s . Thus, every X out is a weighted average between the new input X in and the previous sample X out of k-1. It is possible to cascade these filters such that the signal and noise pass through multiple filters. The number of filters is called the filter order (n). The frequency domain response of the filter with TC and n is described as: Where Δf is the difference between modulation frequency and the detection frequencies; the input frequencies to the LIA. The filter transmission which describes the signal propagation through such a filter is then: Measurements that verify this model in the frequency and time domain are provided in Figure S1 in the supporting information.
The possible noise sources in SRS are the thermal (Johnson-Nyquist) noise from the different electronic components, the shot noise from the detected (probe) laser light and the frequency dependent RIN [18] from the laser source of the probe light. While the thermal noise is independent of the probe power, the shot noise is proportional to the square-root of the probe power. We ignored the RIN in our analysis for two reasons. One was the assumption that thermal or shot noise limited systems cover most scenarios for SRS imaging, and provide a sufficient description for our analysis. The second was that our experimental system is either thermal or shot noise limited. Since the detection frequency is above 1 MHz, the RIN is small compared with the thermal or shot-noise for the power range that was used. With increasing powers, the photo-detector saturates before RIN dominates the noise over thermal or shot noise ( Figure S2d).
For the thermal and shot noise contributions, the propagation of the noise from the detector through the LIA is determined by the bandwidth of the filter. A useful expression for the quantification of the noise bandwidth is the noise equivalent power bandwidth (NEPBW, Figure S2a), which assumes a flat (white) noise spectrum. It is modeled as a rectangular-shaped filter, which lets the same amount of noise power pass as the actual filter. The flat noise spectrum assumption is valid for small bandwidths relative to any large non-flatness in the spectrum caused by electronic components (e.g. photo-diode or trans-impedance amplifier). The NEPBW is derived from the discrete time RC filter, and is therefore described by the same TC and n parameters. Integration of the square of the filter transmission (Eq. 3) gives: Where the filter order n is expressed through the gamma function. The NEPBW has frequency units (Hz).

Calibration of signal and noise
For the calibration of our system we used the Raman signal of a polystyrene (PS) bead as a test sample at 1602 cm − 1 . The signal calibration coefficient C sig is defined as the product of the Raman cross-section of PS and the number of molecules in the focal volume, propagated through the collection optics and the detection electronics (photo detector, trans impedance amplifier and LIA). If we wish to use this result for another analyte than PS, we simply have to determine the ratio between the signals generated by both analytes on our system (i.e. convert the cross-section from one to another). Accordingly, C TN , C SN are the thermal and shot noise calibration coefficients, which are independent of the analyte cross-section in SRS: where S ps is the signal generated by PS and is measured in Volts at the output of the LIA, P c is the average intensity of the modulated carrier and P p is the intensity of the probe in milli-Watts. In addition, N T and N S are the noise contributions of the thermal noise and the shot noise, respectively, which are given in Volts. The calibration coefficients were extracted from a fit to the measurements shown in Figure S2c for the signal and S2d for the noise contributions. In these measurements the ratio between the carrier and the probe power was always set to 2:1 [19], which is reflected in the slope of 2 in the signal graph ( Figure S2c) that it follows from the product of the carrier and probe powers. From the linear part of that slope we extract the coefficient C sig following Eq. (5). The slope of the shot noise region in the noise graph as a function of probe power in Figure S2d follows a square root relation to the probe power, and is an indication that the system is indeed shot noise limited when using these powers as explained in Figure S2b. The coefficient C SN is derived from the slope of Figure S2d, following Eq. (7), and C TN is calculated from Eq. (6), where N T is measured with no light on the detector. These coefficients are used as default input to the simulations explained below, but can be changed to fit a simulation of any SRS system: C sig = 3 × 10 − 6 V/mW 2 , C TN = 1.518 × 10 − 7 V/√Hz and C SN = 1.36 × 10 − 7 V/√(mW Hz).

Focal spot size and optical spatial resolution
In order to relate the LIA performance with different settings to imaging parameters of a given object, the focal spot size of SRS has to be considered. Since the SRS signal is the product of two beam intensities, the effective focal spot intensity profile of SRS is the product of the carrier and probe beams' focal spot profiles. Assuming both have a Gaussian intensity profile and are slightly underfilling the objective's back aperture [20,21], we can approximate both beams to have a Gaussian profile at the focal plane and their product will be a Gaussian as well. The resulting product beam width or SRS beam full width at 1/e 2 , ω SRS , is described by Eq. (8) [22]: The focal spot widths of the carrier and the probe beams were measured by a knife-edge measurement, taking the derivative of the intensity profile and fitting it to a Gaussian. Since sub-micron widths were expected, we used a USAF target to replace the knife. This was repeated at various z depths. An X-Z intensity profile of the 1064 nm beam is shown in Fig. 1a for illustration (with x being the scanning direction and z as the depth); the dashed red line indicates the depth of the steepest measured intensity profile. The derivative of each line at different depths was fitted to a Gaussian from which the beam widths were extracted. The minimum of these widths was found by fitting a parabola. The effective SRS beam width was determined as described by Eq. (8). Figure 1b shows the smallest widths from knife edge measurements of the carrier and the probe with a Gaussian fit. The resulting 1/e 2 width is 0.89 μm for the probe focal spot size and 1.03 μm for the carrier focal spot size. Accordingly, the SRS beam effective spot size width is 0.67 μm. Following the resolution derivation for a Gaussian beam approximation as described in the supplementary information (Eq. S2-S5) [23][24][25] we get the SRS Rayleigh resolution as 1.03•ω SRS = 0.69 μm.

Optimal settings simulation
The optimal settings simulation gives the LIA's filter order and time constant optimal values. In order to properly simulate the SRS imaging behavior, we have to consider first the physical input parameters of the imaging system. The input signal is given by Eq. (5), and only requires the laser powers as input, given that C sig is calibrated. Upon calibration of C TN and C SN , the noise contributions can be simulated as well for each TC and filter order with Eqs. (6,7) and the NEPBW expression in Eq. (4). For the optical point spread function (PSF), the required input is the 1/e 2 radius of the probe and carrier beams and Eq. (8). A detailed description of the input parameters and the setup parameters is given in Figure S3 and the description of the algorithm is shown in Fig. 2. Simulation of the lock-in parameters is done with respect to an object size input. The object is modelled as a top-hat (black lines in Fig. 3) with a width of the object size and a height of the signal S PS (Eq. 5). First, it is convolved with the PSF (blue lines in Fig. 3), which we derived from the SRS signal's effective spot size. The convolution of the top-hat object with the PSF results in softening of the edges of the objects (i.e. blurring). For a small object, it might result in a lower peak than the original object (blue line in Fig. 3a and c), while for bigger objects the edges blurring is small relative to the object size, thus not affecting the peak height. This explains the peak height difference observed between Fig. 3a and b, as well as 3c and 3d. Then, the convolution of the input object with the PSF propagates through the low pass filter described by Eq. (1) (red lines Fig. 3). The signal propagation in time translates into a spatial signal propagation by the scanning speed, which is simply the pixel size divided by the pixel dwell time. The filter response of the object results in further broadening (i.e. blurring) of the object, a spatial (horizontal) offset and max signal reduction. In a dense sample it also results in contrast reduction (red lines in Fig. 3c and d).
The LIA parameters are then optimized by running this simulation multiple times over a range of TCs and filter orders, optimizing for either imaging time (for a given CNR), or CNR (for a given imaging time). The CNR [10][11][12] is calculated as follows: where I peak is the signal value at the apex of the output curve, and I valley is the value at a local minimum between two apexes in a dense sample case (red lines in Fig. 3cd). In the sparse sample case, the I valley is zero (red lines in Fig. 3a-b). The noise, however, is the same everywhere in SRS, as it is independent of the signal, but rather depends on the probe laser power as illustrated in Figure  S2b. The CNR is the difference between two signals with a similar noise level, so there is an additional √2 factor in the dominator compared with the SNR, which is simply dividing the signal by the noise. In both cases, CNR is a single term giving a description of the quality of the signal (like SNR [18]) as well as a measure to distinguish an object from its surrounding (like contrast). Different lock-in parameters are tested to find the combination that yields the shortest imaging time or the maximum CNR, respectively. This can be done either for a sparse sample (single particle) case or a dense sample case (tightly packed particles). In a dense sample, the object size represents both an object size and the gap between two objects, i.e. the spatial resolution is half of the object's reciprocal (in units of line-pairs per millimeters).
The simulation produces plots (Fig. 3) to illustrate the signal propagation of a given object in terms of signal response and delay. The LIA setting filter order and TC are in the title of these plots. The full output parameters, like the CNR, filter order, time constant and PDT are then shown in the user interface (see Figure S3).
As an example, we tested the response for a PDT as short as 10 μs, for an object in a dense sample with 1 μm object size and 1 μm pixel size. In this test case the maximum CNR was 4.6, found for a TC of 0.85 μs and a filter order of n = 8 (see Fig. 3c).
Alternatively, the simulation can also optimize the lock-in parameters to find the minimum imaging time for a specified CNR, which is calculated the same way as described above. In this case, the shortest PDT is chosen for the optimal TC-filter order combination. For the same test case above, first setting CNR = 2, we get PDT = 2.7 μs, TC = 0.23 μs with filter order n = 8 as the (See figure on previous page.) Fig. 2 Optimal settings algorithm description. The following steps describe the algorithm in the block diagram: 1. The interface input includes the choices of sample type (sparse/dense) and optimization parameter (CNR/Imaging time); the imaging parameters like imaging time/PDT or SNR, flyback portion of the time, pixel size, number of pixels, and object size; filter order range for optimization; powers applied on the sample; calibration constants and beam diameters; 2. A range of TCs is set as an input for the next step for each filter order. The TC is relative to an arbitrary (pre-selected) PDT. The input parameters are passed to the next step; 3. The object is constructed (top hat or square wave normalized to 1). The SRS focal spot (from beam radii) is convolved with the object. The convolved object serves as an input for the RC filter response, for the TC parameter range. This range is with a course precision; 4. The filter response over the object is given to represent the signal propagation: For a single particle, the maximum of the response is determined as the signal (in arbitrary units). For a dense sample, the difference of the second local maximum and second local minimum is determined as the signal (in arbitrary units); 5. The noise is calculated with a TC and filter order range, relative to an arbitrary PDT. Together with the signal (from the normalized response), a CNR range is determined and the maximum of the CNR is found. The maximum CNR is then used to determine the optimal TC (relative to the arbitrary PDT). A smaller range with higher precision (smaller intervals) around this optimal TC from the last step is used to repeat steps 2-4. This gives the optimal TC (relative to an arbitrary PDT); 6. From the smaller TC range the noise is calculated relative to an arbitrary PDT, a CNR range is determined and the maximum of the CNR is found. The maximum CNR is then used to determine the optimal TC with a higher precision (relative to the arbitrary PDT) for each filter order; 7. If CNR is optimized, then the PDT is given and the TC is scaled with the given PDT and object size. The CNR is scaled with the calibration parameters, beam powers and TC. If imaging time/PDT is optimized, then the CNR is a given and the TC is extracted from the ratio of the given CNR and the maximum CNR from the previous step. The PDT is extracted from the arbitrary PDT and the TC ratio; 8. The response curve is plotted; 9. The imaging time is calculated (can be done directly from the input in step 1, or after the program runs, depending on the optimization choice) parameters that give the shortest imaging time for a minimum required spatial resolution in the scanning direction.

Imaging simulation compared with SRS imaging
The same LIA's filter response in Eq. 1 in combination with the calibrated signal and noise terms (Eqs. 5-7) and the PSF (Eq. 9) can be used for visualizing SRS imaging. The imaging simulation receives the aforementioned system and imaging parameters as input together with an one-dimensional or two-dimensional object input. The result is a simulated image of an object passing through the SRS imaging system. Possible input objects are one dimensional top hat object, one dimensional square wave, and two-dimensional round objects to simulate beads. For example, beads imaged with SRS are compared with simulated 2D beads images in Fig. 4, both with the same system and imaging parameters. The imaging simulation creates an image using the input objects and input parameters as follows. Firstly, the shape of the object is convolved with the PSF. Secondly, the object is scaled by the beam powers according to Eq. (5) to form the input signal and the noise is added (Eqs. 6 and 7). Finally, the object is passing through the low-pass filter as described by Eq. (1) by virtual horizontal scanning (left to right). A description of the imaging simulation interface, with different input parameters and setup options, is given in the Supplementary Material, Figure S4.
The results in Fig. 4 demonstrate the principles discussed in the 'Optimal setting simulation' section. Fig. 3 The user interface of the SRS optimal parameters simulation is shown in Figure S3. The power input was 24.7 mW for the carrier and 14.6 mW for the probe on the sample, respectively. Pixel dwell time was set to 10 μs. A visualization of the signal response through the simulated system of a CNR optimized case, excluding the noise, is shown in (a) for a single object size of 1 μm and in (b) for an object size of 20 μm. In (c) and (d) a simulation is shown for a dense sample with an object size of 1 μm and 20 μm, respectively. The object size represents the half period distance of the spatial modulation, (i.e. it accounts for both the particle and the gap between particles in a dense sample)  Figure 4a and b show images from measurements and simulation of a 20-μm bead, respectively, optimized for a single object size of 1 μm, with a good spatial resolution and high noise as shown by the profile insets. Doing the same in Fig. 4c and d with an optimization for 20 μm objects reduces the noise due to the slower response of the filter (with a narrower filter bandwidth), but also reduces the signal's peak height in the measured bead image. The simulation automatically scales the signal to cover the whole dynamic range. Nevertheless, both in measurement and simulation, optimization for 20 μm results in a better CNR, which allows us to detect this single object better in an image with a given PDT, albeit blurred. The resulting CNR values, together with the optimized lock-in parameters are listed in Table 1. Additionally, the slower filter response leads to a noticeable horizontal displacement of the objects, as seen in Fig. 4c and d compared with Fig. 4a and b. Figure 4e-h show images of 2 measurement-simulation-pairs of two beads separated by 2 μm. One pair was optimized for a 2 μm object in a dense sample (Fig. 4e  and f), and the other for a 20 μm object (Fig. 4 g and h).
In the latter case, the slower filter response leads to a lower noise level, but also to a less deep signal in the gap between the beads, which counteracts any improvement of the noise level. Thus effectively, the CNR decreases (see Table 1).

Multiplex SRS imaging simulation
The imaging simulation was generalized to provide not only a visualization of a narrowband imaging system, but also of multiplexed SRS imaging approaches. In this section, we simulated and investigated settings for a multiplex SRS system, with modulation encoded channels and a single detector. This analysis of frequency encoded multiplex SRS also considered the channels' spacing, i.e. the separation between adjacent channels in the frequency domain. This is done in order to avoid crosstalk on a limited detection bandwidth while still optimizing for CNR or acquisition speed.
We use the same imaging simulation described in the previous section to investigate an imaging approach for the detection of multiple bands at the same time. It is a frequency-encoded multiplex SRS approach, similar to  [8]. Here we name it 'single carrier modulation'. In this approach, a combination of a narrowband and a broadband laser beam is used, where the narrowband is used as a probe. The broadband beam is divided into different spectral bands. Each band, that we also call channel, is being modulated at a different frequency. As a result, this multiplex SRS approach has many carriers, while only one probe is used. After the sample, the modulation is transferred to the probe by the SRS process. The probe is then measured by a single detector. Other multiplexing options are shown in the user interface ( Figure S4). In multiplex SRS, the low pass filter bandwidth plays a role not only in determining the signal and noise propagation, but also for the crosstalk between the channels. In our experimental setup we used a trans-impedance amplifier to amplify the signal around the modulation frequency. The trans-impedance amplifier has a bandwidth of 43.5 kHz around 3.63 MHz, with an amplification of above 99% of the original amplification of 86 dB. This means that the measured calibration coefficients (Eqs. 5-7) provide a good approximation that can be used for the multiplex simulation within that bandwidth. In this section we simulated multiple beads at 6 different modulation channels, modeling samples of different materials. For simplicity, all Raman cross-sections were set to that of the 1602 cm − 1 band of polystyrene. The frequency separation between adjacent channels was 8.7 kHz to fit the 6 channels in the aforementioned bandwidth.
In Fig. 5, we show the results of simulations with 3 TC settings, for a hypothetical situation of six different beads of 20 μm diameter, where each bead is in a different channel. Figure 5a is a schematic image showing the beads. Figure 5b is a simulated image with settings optimized for a 20 μm object (bead #3 in blue in Fig. 5a) in a sparse sample, which results in no crosstalk; note the horizontal displacement of the bead due to the slow filter response and the homogenous signal region on the left hand side of the image, which is a simulation artefact. Figure 5c is an image using parameters optimized for 1 μm, which shows a sharper image but also crosstalk in channel 3 from other channels. Figure 5d shows the results with filter order 1 and time constant 125 μs, which theoretically should result in the same CNR as in 5b, since these settings have the same NEPBW as in 5b. However, due to the less steep filter shape of filter order 1 than filter order 8, there is more crosstalk from neighboring channels.
These results can be understood through the filter transmission (Eq. (3)). If a channel's bandwidth is larger than the frequency distance between neighboring channels, the signal from a neighboring channel will appear as an oscillation modulated at the frequency difference of the channels (i.e. beating frequency at Δf). By multiplying the transmission of the channel at Δf, with the signal strength at the neighboring channel (Eq. 5), we get the value for the amplitude of this crosstalk oscillation.

Discussion
In this study, key parameters of the lock-in amplifier were investigated with respect to imaging parameters of SRS. We showed that the imaging parameters, like the effective SRS focal spot and laser powers, are important in the analysis for finding the optimal LIA parameters; filter order and time constant. For this analysis, the specific signal and noise propagation through the LIA's low pass filter are explained and described through the calibration coefficients. Based on the imaging parameters and Eq. 1, together with the definition for the NEPBW, we constructed a simulation program to find the optimal lock-in amplifier parameters. The optimization can be done for CNR or imaging speed, and for a sparse sample or a tightly packed sample. We also show an imaging simulation which is able to further evaluate our result. Finally, we showed how the insights we gained on the optimal settings can help to simulate and evaluate more advanced set-ups, such as multiplex SRS.
Combining the LIA parameters, the PDT and the SRS beam size (Eq. 8) with the object size to determine the CNR in spares and dense samples shows the benefit of the optimal setting simulation. The CNR determines with what contrast an object can be imaged with respect to the noise, and with what contrast two objects of a given size can be discerned from one another. This defines the system resolution in terms of contrast. Therefore, we choose CNR as the image quality definition to account for both contrast and noise, which is different from the common use of SNR for image quality assessment [18].
Other quality metrics which could be considered are SNR, equivalent number of looks, edge preservation index or mean structural similarity index [13,14]. However, SNR and equivalent number of looks do not account for the background level, which could affect the contrast, hence affecting the detectability of an object in a tightly packed sample. Edge preservation index and mean structural similarity index require a comparison to a reference image, which could be the original object, or the input object (after convolution with the focal spot). Figures 3 and 4 illustrate that for optimized parameters for CNR the edges are not always preserved. The mean structural similarity index would judge the quality of the image based on the structural similarities to the original input. Again, Fig. 3 shows that structural similarities could be quite different from the detectability metric, here represented by the CNR.
The numerical simulation for the signal response in time due to the filter shape is necessary for optimization since signal, contrast and noise are all affected when the LIA parameters are changed and they are constrained by the scanning speed. This is even more so when the CNR is calculated for a tightly packed sample to evaluate the resolution, when also the signal level between objects has to be determined.
These principles are illustrated in Figs. 3 and 4. In Fig.  3, the LIA parameters for CNR optimization are plotted in the title of the graphs in which the signal propagation is shown. The same can be done for acquisition speed optimization. The optimization is done for the smallest object size one would like to observe in a sample. The term "object" here can refer to a particle in a sparse sample, or the half period of the highest spatial frequency in a detailed sample. We demonstrated these principles in Fig. 4 by using the result parameters of the optimal settings simulation for (real) imaging and the imaging simulation for comparison. The results of experiment and simulation match quite well, supporting the argument that for an optimized quality of the images, the object size and the nature of the sample have to be considered when imaging with SRS.
We also use the imaging simulation to provide insight into a frequency encoded multiplex setup. Different realizations of setups can be investigated, and we demonstrated this for one of them. Figure 5 illustrates how in multiplex SRS the choice of filter bandwidth affects not only the signal time response and the noise levels, but also the crosstalk between the different channels. Although the NEPBW is used here as an expression for the bandwidth, it is not describing the real low pass filter shape which has a shape determined by the filter order. Generally, a steeper slope of the filter's frequency cutoff is preferred, as a shallow slope contributes to crosstalk from other channels (see Fig. 5d).
In Fig. 5b there is a homogeneous signal region at the left hand side of the image, which is a simulation artefact. Unlike a real imaging scenario, where the slow filter's response averages the signal and noise from the previous row of the image, the averaging of the signal and noise in the simulation starts again in each row. Since there is no signal and noise to average in the beginning of the row, the first few pixels in each row may appear as similar to those above and below them, i.e. the signal is less random and more homogenous. The same effect can be observed in Fig. 4d and h, but less prominently due to increased contrast in these images (compared to Fig. 5b).
Some assumptions, made for this simulation, need to be considered. First, the calibration of the noise assumes a flat spectrum, which is valid for a narrow bandwidth of the low-pass filter compared to any variations in the input frequency spectrum to the LIA, i.e. a transimpedance amplifier with an amplification band wider than the low-pass filter band. This means that for large bandwidths (very short time constants) the noise propagation cannot be calculated in the way presented in this study. Second, the input object was taken as a top-hat shaped object, which is an unrealistic object shape, especially for micron size objects. For a more realistic shape such as a sphere, the effective signal response will be slower, because it takes longer for the focal volume to completely overlap with the sample. Third, the laser beams are approximated to have a Gaussian profile at the focus, which assumes Gaussian beams before entering the objective (TEM 00), and an under-filling of the objective's back aperture. Of course, under-filling of the objective might reduce the effective numerical aperture, so we recommend matching the 1/e 2 radius of a beam to the objective's back aperture radius at this location. Modelling with Bessel beams should give even more exact solutions to the beam profile especially in the case of overfilled objective's back aperture [26]. Fourth, we used a PS bead as our test sample. Commercial PS beads are consistent in size and their physicochemical properties, hence the signal response is expected to be consistent. For correctly simulating another type of object with a significantly different Raman cross-section, one should carefully calibrate for this material, or compare between the PS Raman cross-section and that of the compound, since the SRS spectral response follows the spontaneous Raman spectra [2,3,27]. Finally, we found that the LIA operating at high frequencies and short time constants is different from the theory prediction (Supplementary Fig. S1). For every LIA these limits should be found to define where the simulation results are valid. Nevertheless, we believe that having a tool which provides these optimal settings can improve an imaging session in a straightforward manner for a given sample and a properly defined setup. For example, such a tool may contribute towards a more accurate analysis of environmental microplastics [1].
Finally, the imaging simulation was also used to determine crosstalk in a complex setup where a few channels were demodulated simultaneously. By doing this, we showed that many scenarios can be evaluated by using this kind of simulation and analysis. We believe that with the same analysis, other types of setups can be evaluated, prior to their realization. The design process of such a setup can then be made more efficient.
The proposed simulation is tailored specifically for SRS with a LIA detection. However, with some modifications, such as different signal and noise calibration and PSF determination, similar simulations could be realized for other techniques that are based on laser scanning a focal spot across a sample and using LIA for detection. For example, although scanning near-field optical microscopy [28] has a different signal and noise generation mechanism, the principles of LIA detection are the same and the tradeoff between the scanning speed, image quality and resolution still holds. The same applies to photothermal imaging [29,30], a technique with a setup scheme similar to SRS (pump-probe beams coinciding on a sample, with LIA detection). Another example is frequency domain photoacoustic tomography, a technique which is theoretically well understood [31], and which could benefit from this type of analysis to fine tune its resolution and scanning speed at no additional costs [32]. However, since the noise and signal might be generated differently than in SRS, this should be treated carefully (for example, the noise could be dependent on the signal level, so the CNR will be calculated differently).

Conclusion
In this study we presented two SRS simulations. One can directly improve SRS imaging by finding optimal LIA parameters of an existing setup for different imaging scenarios. The other simulation provides visualization of SRS imaging results and can also simulate various multiplex SRS approaches ( Figure S4).
The simulations describe the standard tools used for SRS imaging, i.e. two laser beams, modulator, scanning laser microscope and a lock-in amplifier. Thus, these simulations can be used to facilitate SRS studies, by directly improving SRS imaging, which will be especially important in the use of dynamic or irreplaceable samples. The same approach can also help to design new SRS setups. Other techniques which are based on lockin detection can benefit from a similar analysis as well, but may require some modifications.

Experimental set-up description
The narrow-band stimulated Raman loss setup used here was described previously [1,19,22,33]. A frequency doubled Nd:YAG laser (Plecter Duo, Lumera) with 80 MHz repetition rate, 8 ps pulse length at a 532 nm output pumped an optical parametric oscillator (OPO, Levante Emerald, APE). A second output laser beam at 1064 nm was amplitude modulated with an acoustooptical modulator (AOM, 3080194, Crystal Technology) at 3.636 MHz. The laser and OPO beams were overlapped temporally with a delay stage and overlapped spatially with a dichroic mirror. The overlapping beams were sent into a laser scanning microscope, 7MP (Zeiss), with a 32× water immersion objective (C-achroplan W, numerical aperture [NA] = 0.85) to scan the sample. All measurements were done with 14.6 mW average power of the probe, and 24.7 mW power of the carrier on the sample.
Collection was done by a water immersion condenser (NA = 1.2) below the sample, followed by an optical filter to block the carrier. The probe was detected by a largearea photodetector (DET36A, Thorlabs) integrated with a home-built trans-impedance amplifier, with a resonance amplification at 3.64 MHz and a bandwidth (FWHM) of 0.387 MHz. The large area detector was a requirement stemming from the non-descanned detection of the microscope, and the modulation frequency was a compromise between the detector's (area dependent) capacitance and the trans-impedance amplification.
The lock-in amplifier (LIA, HF2LI, Zurich Instruments) was used for signal demodulation, and reading its output is possible in two different ways. One is by directly sending one sample per pixel to the computer with the LIA software (LabOne) which we call digital sampling. The other method is called analog sampling, which sends the samples through an analog output to an analog-to-digital converter (ADC) which integrates all samples per pixel. Only the in-phase component with the carrier is measured. Throughout this study the LIA output was measured using the latter method. Digital acquisition can also be simulated, and it only differs in the calibration coefficients of the noise (see Figure S2d).
The simulations and apps were written in Matlab 2018b, including the communication toolbox. The integration in Eq. 4 was done with Mathematica.