This is implemented in Liston & Elder, 2006 according to Walton, 1996 (https://www.tandfonline.com/doi/abs/10.1080/02664769624332?casa_token=fEVPFRYrr7sAAAAA:ozZFAcUWX4mKaUI8tvOn6R-3giOHefH0p8vaRDFCN1ORGy0d9evP7Hn9aLbMWsUQsIKrKEKxP-M).
So a new resampling algorithm should be implemented (see https://models.slf.ch/docserver/dev/meteoio/doc/html/dev_1Dinterpol.html). In order to implement ARIMA, we should rely on the multiple linear regression class from meteoStats/libfit1D (see https://models.slf.ch/docserver/dev/meteoio/doc/html/classmio_1_1FitMult.html). Then the following steps would be taken:
- locate the gap to fill in the input data (i.e. starting from the point to resample, find the start and end of the gap);
- compute the regression coefficients over a user-defined window before the gap (for example, 3 days): for each point in the window, fill the observations and predictors matrices, where Yn is the observation and Yn-1, Yn-2, Yn-3, etc. are the predictors. The number of predictors should be user-defined (for example, 3 days). Each new set is added through FitMult.addData(). Once all the observations have been pushed (for example, all values for the last 3 days before the data gap), get the regression coefficients with FitMult.getParams().
- using the computed regression coefficients, compute the requested value inside the data gap. These coefficients should be cached so that subsequent calls do not have to recompute them for the same gap.
- since we deal with variable/irregular sampling rates and since the requested point might not even match the input data sampling rate, we should do the following: over the user-defined window, compute the most probable sampling rate (we could read all sampling intervals and extract the median). If a point is requested at a timestamp that does not fall on the original sampling grid, we would compute the points before and after it using ARIMA and then linearly interpolate to the exact requested point. For more efficiency, we could also compute and cache ALL points over the data gap (at the original sampling rate) and then only have to linearly interpolate to the exact requested point when needed.
- the case of missing data in the calibration window still has to be handled...
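
The steps above could be sketched roughly as follows. This is only a minimal, standalone illustration and not MeteoIO code: it does not use FitMult (it solves the least-squares normal equations directly as a stand-in for the multiple linear regression), it only covers the autoregressive part (no differencing and no moving-average terms), and all function names (`medianSamplingInterval`, `fitAR`, `forecastAR`, `linearInterpolate`) are hypothetical. It also assumes that the calibration window has already been resampled to a regular grid and contains no missing values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Most probable sampling interval: median of the differences between
// consecutive timestamps (timestamps assumed sorted, all in the same unit).
double medianSamplingInterval(const std::vector<double>& t)
{
    assert(t.size() >= 2);
    std::vector<double> dt;
    for (std::size_t i = 1; i < t.size(); ++i) dt.push_back(t[i] - t[i - 1]);
    std::sort(dt.begin(), dt.end());
    const std::size_t n = dt.size();
    return (n % 2 == 1) ? dt[n / 2] : 0.5 * (dt[n / 2 - 1] + dt[n / 2]);
}

// Least-squares fit of y[n] = c + a1*y[n-1] + ... + ap*y[n-p] over the window y.
// Returns {c, a1, ..., ap}. Hypothetical stand-in for the regression step.
std::vector<double> fitAR(const std::vector<double>& y, std::size_t p)
{
    const std::size_t m = p + 1;   // number of unknowns (intercept + p lags)
    assert(y.size() > 2 * p);      // at least m equations are required
    // Augmented normal equations [X^T X | X^T y]
    std::vector<std::vector<double>> A(m, std::vector<double>(m + 1, 0.0));
    for (std::size_t n = p; n < y.size(); ++n) {
        std::vector<double> row(m, 1.0);   // row[0] = 1 for the intercept
        for (std::size_t k = 1; k <= p; ++k) row[k] = y[n - k];
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t j = 0; j < m; ++j) A[i][j] += row[i] * row[j];
            A[i][m] += row[i] * y[n];
        }
    }
    // Gauss-Jordan elimination with partial pivoting
    for (std::size_t col = 0; col < m; ++col) {
        std::size_t piv = col;
        for (std::size_t r = col + 1; r < m; ++r)
            if (std::fabs(A[r][col]) > std::fabs(A[piv][col])) piv = r;
        std::swap(A[col], A[piv]);
        assert(std::fabs(A[col][col]) > 1e-12);   // singular system, e.g. constant data
        for (std::size_t r = 0; r < m; ++r) {
            if (r == col) continue;
            const double f = A[r][col] / A[col][col];
            for (std::size_t c = col; c <= m; ++c) A[r][c] -= f * A[col][c];
        }
    }
    std::vector<double> coef(m);
    for (std::size_t i = 0; i < m; ++i) coef[i] = A[i][m] / A[i][i];
    return coef;
}

// Recursively predict nSteps values after the end of history (i.e. into the gap),
// feeding each predicted value back in as a predictor for the next one.
std::vector<double> forecastAR(const std::vector<double>& history,
                               const std::vector<double>& coef, std::size_t nSteps)
{
    assert(!coef.empty());
    const std::size_t p = coef.size() - 1;
    assert(history.size() >= p);
    std::vector<double> buf(history.end() - static_cast<std::ptrdiff_t>(p), history.end());
    std::vector<double> out;
    for (std::size_t s = 0; s < nSteps; ++s) {
        double v = coef[0];
        for (std::size_t k = 1; k <= p; ++k) v += coef[k] * buf[buf.size() - k];
        buf.push_back(v);
        out.push_back(v);
    }
    return out;
}

// Linear interpolation between (t0, v0) and (t1, v1), evaluated at t.
double linearInterpolate(double t0, double v0, double t1, double v1, double t)
{
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0);
}
```

In such a sketch, the vector returned by `forecastAR` is what would be cached per gap: a request for a timestamp between two predicted points then only needs a call to `linearInterpolate`, with the grid spacing taken from `medianSamplingInterval`.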