When dealing with laboratory data, smoothing is normally one of the key steps in data pre-treatment. There are several reasons for this. First, feature detection (e.g. peak detection in chromatography) is normally based on derivatives, and computing derivatives amplifies the noise in the signal unless it is combined with a smoothing step. Second, it is important to remove high-frequency noise from the data (e.g. NIR spectra) so that subsequent modelling methods (e.g. PLS) are more robust to instrument variability. Among the smoothing methods, Savitzky-Golay (SG) is one of the preferred smoothers, mainly due to its simplicity and its performance.
However, there are always two parameters to optimise in SG smoothing: (i) the polynomial degree and (ii) the window size. Both parameters control the degree of smoothing. Smoothing that is too “soft” does not completely eliminate the (high-frequency) noise features, whereas smoothing that is too “hard” distorts the signal. Normally the user fixes the polynomial degree (a second-degree polynomial is normally optimal) and optimises the window size. An objective way to establish the optimal window size was developed [16a]. The method is based on matching the autocorrelation of the smoothing residuals to that of the instrument noise. In a first step, the autocorrelation of the instrument noise is measured (when the instrument is free of relevant signals). In a second step, the autocorrelation of the residuals of the SG smoother is monitored as different window sizes are applied to the data to be smoothed. The window size that yields the autocorrelation value closest to the autocorrelation of the noise is taken as optimal. The method is simple and straightforward, and it has been applied to HPLC, NMR, MS and NIR data (among other instrumentation), across a broad range of application areas.
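The two-step procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published implementation from [16a]: the text does not say which autocorrelation lag is used, so the lag-1 coefficient is assumed here; the function names (`sg_coefficients`, `sg_smooth`, `lag1_autocorr`, `select_window`) are hypothetical; and the reflective edge padding is simply a convenient choice. The SG smoother is written with NumPy only, as a least-squares polynomial fit evaluated at the window centre.

```python
import numpy as np


def sg_coefficients(window, degree=2):
    """Least-squares weights that give the fitted polynomial value at the window centre."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    design = np.vander(offsets, degree + 1, increasing=True)
    # Row 0 of the pseudo-inverse maps the window samples to the constant term,
    # i.e. to the value of the fitted polynomial at offset 0.
    return np.linalg.pinv(design)[0]


def sg_smooth(y, window, degree=2):
    """Savitzky-Golay smoothing; edges are handled by reflective padding (an assumption)."""
    if window % 2 == 0 or window <= degree + 1:
        raise ValueError("window must be odd and larger than degree + 1")
    half = window // 2
    padded = np.pad(np.asarray(y, dtype=float), half, mode="reflect")
    return np.convolve(padded, sg_coefficients(window, degree)[::-1], mode="valid")


def lag1_autocorr(x):
    """Lag-1 autocorrelation coefficient (the choice of lag 1 is an assumption)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))


def select_window(signal, noise_blank, windows, degree=2):
    """Return the window whose residual autocorrelation is closest to that of the noise."""
    signal = np.asarray(signal, dtype=float)
    target = lag1_autocorr(noise_blank)  # step 1: instrument noise, no relevant signal
    scores = {}
    for w in windows:                    # step 2: residuals for each candidate window
        residuals = signal - sg_smooth(signal, w, degree)
        scores[w] = lag1_autocorr(residuals)
    best = min(scores, key=lambda w: abs(scores[w] - target))
    return best, scores
```

A call such as `select_window(y, blank, range(5, 42, 2))` would return the selected window together with the residual autocorrelation for every candidate, where `y` is the measured signal and `blank` is a recording of the instrument without relevant signals. Plotting the scores against the window size is a simple way to check that the selected value is not an edge case of the candidate range.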
This project was developed at the University of Amsterdam. Several people were involved; see the co-authorship of the presentations for more information.