Linear Wavelet Estimation for Regression with Additive-Multiplicative Noise for Ergodic Data
*Corresponding author: Dr. Sultana Didi, Department of Statistics and Operations Research, Qassim University, Buraydah, Saudi Arabia. Email: S.biha@qu.edu.sa
How to cite this article: Didi S, Esmail TA. Linear Wavelet Estimation for Regression with Additive-multiplicative Noise for Ergodic Data. J Qassim Univ Sci. 2026;1:85-95. doi: 10.25259/JQUS_12_2025
Abstract
Objectives
This paper builds on the linear wavelet estimator of Chesneau et al. for nonparametric regression models incorporating additive and multiplicative noise under i.i.d. conditions.
Material and Methods
We extend their approach to scenarios where the covariate process is strictly stationary and ergodic. A significant advancement is the establishment of a martingale framework that elucidates the asymptotic properties of the projection estimator without requiring assumptions beyond ergodicity.
Results
We demonstrate that this estimator achieves the minimax-optimal convergence rate concerning the mean integrated square error (MISE) across Besov spaces. Additionally, we propose a data-driven technique for selecting the truncation parameter, and we validate our theoretical results through numerical simulations conducted on ergodic data.
Conclusion
This extension broadens the theoretical and practical reach of wavelet-based regression to a wide class of dependent data scenarios and provides a robust estimation framework for ergodic data common in fields like econometrics and signal processing.
Keywords
Ergodicity
Multiplicative noise
Nonparametric regression
Rates of convergence
Wavelets
INTRODUCTION
This study focuses on a unidimensional nonparametric regression framework defined by

Y_i = f(X_i) U_i + V_i,  i = 1, …, n.  (1.1)

In this model, f is the unknown function of interest. The covariate X_i is strictly stationary and ergodic with values in [0, 1]. The sequences (U_i) and (V_i) represent identically distributed multiplicative and additive noise components, respectively, with U_i following a uniform distribution symmetric about zero. We maintain the assumptions of independence between (U_i) and (X_i), as well as between (V_i) and (X_i), for all indices i.
Our primary aim is to estimate the target function r from the observed data (X_i, Y_i), where the noise structure is a combination of multiplicative (U_i) and additive (V_i) components. Model (1.1) extends the conventional nonparametric regression framework through the introduction of a multiplicative uniform noise component, which directly scales the unknown function f. Such models are widely used in applications such as signal processing (GPS signal detection with multiplicative and additive noise) and econometrics (e.g., volatility estimation).
Despite similarities to heteroscedastic regression models (see, for example, Chichignoud, Comte, and Cai et al.),[1,2,3] the model considered here differs in its stochastic assumptions. In particular, Cai et al.[1] study a fixed-design model with deterministic V_i, whereas our setting treats all noise components as random.
We propose a wavelet-based estimator for r, leveraging the ability of wavelets to represent complex functions. A linear wavelet estimator is constructed and shown to achieve a fast convergence rate in mean integrated square error (MISE) over Besov function classes, provided an appropriate tuning parameter is chosen. Since this parameter depends on the unknown smoothness of r, we introduce a practical data-driven selection method using an adapted version of the 2-Fold Cross Validation (2FCV) approach of Nason.[4] A simulation study demonstrates the applicability of this method.
The relaxation of the i.i.d. assumption to a strictly stationary and ergodic framework for the covariate process significantly broadens the applicability of our results. Ergodicity is a fundamental property in the theory of stochastic processes, ensuring that sample averages over time converge to their expected values across the ensemble. This makes it a natural and essential assumption for the analysis of time series data (e.g., financial returns, meteorological records) and spatially dependent processes, where observations are inherently correlated and the i.i.d. assumption is violated. While much of the existing literature on dependent data relies on strong mixing conditions,[1,3] our approach based on ergodicity offers a substantial generalization. It is well established that many important processes, such as certain long-memory time series, Markov processes, and deterministic dynamical systems, are ergodic but not necessarily mixing; see Didi and Bouzebda.[5,6] By developing a martingale framework that requires only ergodicity, we circumvent these more restrictive and often harder-to-verify mixing conditions. This represents a key originality of our work, extending the reach of nonparametric wavelet estimation to a wider class of dependent data structures without compromising the achievability of the minimax-optimal convergence rate.
While kernel-based estimators have also been studied under dependent and ergodic processes (see, e.g., Didi and Louani[7] for applications in frontier analysis), our wavelet-based framework offers distinct advantages in the present context. Wavelet estimators naturally provide a multiresolution analysis, which is adept at capturing local features like discontinuities and sharp peaks that are challenging for kernel methods with a fixed bandwidth. Furthermore, the linear wavelet projection estimator employed here leads to a computationally efficient algorithm, and the theoretical derivation of minimax rates over Besov spaces is particularly well-suited to the wavelet characterization of function smoothness. This provides a clear and established path for optimality theory, which we have extended to the ergodic setting.
Examples of wavelet applications include estimating the integrated squared derivative of a univariate density for independent data (Prakasa, 1996)[8] and for negatively or positively associated sequences (Chaubey et al., 2006).[9] The work of Prakasa[10] extended these methods to partial derivatives of multivariate densities under independence, while Hosseinioun et al.[11] and Koshkin and Vasil'iev[12] addressed the mixing scenario. More recently, Prakasa[13] studied wavelet estimators for the partial derivatives of multivariate densities under additive noise.
This paper is organized as follows. Section 2 reviews the Materials and Methods, covering the basics of wavelets and Besov balls, along with the assumptions and estimators. Section 3, titled Results and Discussion, presents the main theoretical result, the simulation study, and the proofs of our findings. The final section concludes the paper.
MATERIALS & METHODS
Basics on wavelets and Besov balls
This section provides a concise overview of wavelets and Besov spaces, which are central to the construction of our estimator. We employ compactly supported wavelets from the Daubechies family. For a comprehensive treatment, we refer to standard texts such as Daubechies[14] and Mallat[15]. For any resolution level j ≥ 0, the index set is Λ_j = {0, 1, …, 2^j − 1}. For any k in this set, the scaling and wavelet basis functions are generated via translation and dilation of the father wavelet φ and the mother wavelet ψ, respectively:

φ_{j,k}(x) = 2^{j/2} φ(2^j x − k),  ψ_{j,k}(x) = 2^{j/2} ψ(2^j x − k).
Following the methodology of Cohen et al.,[16] there exists an integer τ such that, for any integer j0 ≥ τ, the collection of functions

{φ_{j0,k}, k ∈ Λ_{j0}} ∪ {ψ_{j,k}, j ≥ j0, k ∈ Λ_j}

forms an orthonormal basis of L²([0, 1]). Therefore, for any integer j0 ≥ τ and any function h ∈ L²([0, 1]), h admits the wavelet expansion:

h(x) = Σ_{k ∈ Λ_{j0}} c_{j0,k} φ_{j0,k}(x) + Σ_{j ≥ j0} Σ_{k ∈ Λ_j} d_{j,k} ψ_{j,k}(x),

where the coefficients are given by the inner products:

c_{j,k} = ∫₀¹ h(x) φ_{j,k}(x) dx,  d_{j,k} = ∫₀¹ h(x) ψ_{j,k}(x) dx.
A property that is instrumental in our proofs is the integral of the scaling function over its domain:

∫₀¹ φ_{j,k}(x) dx = 2^{−j/2},  k ∈ Λ_j.

Let P_j denote the orthogonal projection operator from L²([0, 1]) onto the space V_j with the orthonormal basis {φ_{j,k}, k ∈ Λ_j}. Then, for any h ∈ L²([0, 1]), we have

P_j h(x) = Σ_{k ∈ Λ_j} c_{j,k} φ_{j,k}(x).
Besov spaces are a flexible class of function spaces that can characterize a wide range of smoothness properties, including spatially inhomogeneous behavior. For further details, see Härdle et al.,[17] Meyer,[18] and Triebel.[19] Definitions of these spaces are given below. Suppose that ψ is m-regular (i.e., ψ ∈ C^m and, for every integer q ≥ 0 and every u ≤ m, |ψ^{(u)}(x)| ≤ c_q (1 + |x|)^{−q}). Let h ∈ L²([0, 1]), p, q ≥ 1, and 0 < s < m. Then the following assertions are equivalent:
The Besov norm of h can be defined by

‖h‖_{s,p,q} = ‖(c_{τ,k})_k‖_{ℓp} + ( Σ_{j ≥ τ} ( 2^{j(s + 1/2 − 1/p)} ‖(d_{j,k})_k‖_{ℓp} )^q )^{1/q},

and, for M > 0, the Besov ball is B^s_{p,q}(M) = {h ∈ L²([0, 1]) : ‖h‖_{s,p,q} ≤ M}.
Assumptions and Estimators
To facilitate the presentation of our main results, we introduce the following notation. For each integer i, let F_i be the σ-algebra generated by the collection ((X_j, Y_j), 1 ≤ j ≤ i), with F_0 the trivial σ-field. When i ranges from 1 to n, we define the conditional density of X_i given F_{i−1} as f^{F_{i−1}}.
Technical assumptions on the model (1.1) are formulated below.
A.1 We suppose that f is bounded from above.
A.2 We suppose that U_1 ∼ U(−θ, θ), with θ > 0 a fixed real number.
A.3 We suppose that V_1 has a moment of order 4.
A.4 We suppose that X_i and (U_i, V_i) are independent for any i, and (X_i) is stationary and ergodic.
A.5 For every x ∈ [0, 1], the sequence (1/n) Σ_{i=1}^n f^{F_{i−1}}(x) converges to the stationary density f_X(x),
both almost surely (a.s.) and in the L² sense.
A.6 The conditional mean of Y_i² given the σ-field σ(F_{i−1}, X_i) depends only on X_i, i.e., for any i ≥ 1, E[Y_i² | F_{i−1}, X_i] = E[Y_i² | X_i].
Comments on the assumptions
A.1 (Boundedness of f): This is a standard technical condition that ensures the target function r is also bounded, which is necessary to control the moments of the observations and guarantee the finiteness of the wavelet coefficients.
A.2 (Uniform Multiplicative Noise): Specifying the uniform distribution for U_1 is crucial, as it allows us to compute the constant factor θ²/3 that appears in the key identity of A.6. The symmetry about zero is central to canceling the cross-term in Y_i².
A.3 (Additive Noise Moments): Requiring a fourth-order moment for V_1 is necessary to control the variance of our coefficient estimator, which involves the squared observations Y_i².
A.4 (Independence): The independence between the noise components and X_i is crucial to ensure that the additive noise does not confound the relationship between the covariate and the function r. This allows the conditional expectation to cleanly separate into a component depending on X_i and the constant variance of V_1.
A.5 (Ergodicity of the Covariate Process): This is a significant generalization of the typical i.i.d. covariate assumption. The ergodic theorem ensures that time averages converge to ensemble expectations, which is the cornerstone for establishing the consistency of our sample-based estimator. This relaxation widens the applicability of our model to many time series and spatially dependent data scenarios where mixing conditions may not hold or are difficult to verify; see Didi and Bouzebda.[5,6]
A.6 (Conditional Mean Structure): This assumption is fundamental to our estimator construction. It formalizes the fact that the relevant information for predicting Y_i² is fully captured by the current covariate X_i, rather than the entire filtration history. This conditional independence structure is what allows us to derive the pivotal relationship E[Y_i² | X_i] = (θ²/3) r(X_i) + E[V_1²], which directly isolates r and forms the basis for our estimator in (3.1) and (3.2).
The extension from an i.i.d. to a strictly stationary and ergodic framework for the covariate process is theoretically justified by the ergodic theorem, which ensures that time averages converge to ensemble expectations. This forms the cornerstone for establishing the consistency of our sample-based estimator. Methodologically, this approach is supported by modern asymptotic theory for dependent data, particularly the convergence results for empirical processes based on ergodic data as developed in Didi and Bouzebda, and Didi et al.[5,6] This relaxation significantly widens the applicability of our model to include many time series and spatially dependent data scenarios.
The reader may also be referred to Didi and Bouzebda.[6,21,22]
Our linear wavelet estimator for the function r is defined by the projection:

r̂_{j0,n}(x) = Σ_{k ∈ Λ_{j0}} ĉ_{j0,k} φ_{j0,k}(x).  (3.1)
The empirical wavelet coefficients are calculated as:

ĉ_{j,k} = (3/(θ² n)) Σ_{i=1}^n Y_i² φ_{j,k}(X_i) − (3/θ²) E[V_1²] 2^{−j/2}.  (3.2)
The form of the estimator is an original contribution developed by Chesneau et al.[23] to address the challenges of multiplicative noise in nonparametric regression. While straightforward to compute, the efficacy of this estimator is largely governed by the tuning parameter j0. This linear wavelet approach is well-established in nonparametric regression [Härdle et al., 1998],[17] with contemporary extensions found in Chaubey et al.[24]
The form of the estimator is derived from the mathematical relationship E[Y_i² | X_i] = (θ²/3) r(X_i) + E[V_1²], which isolates r under the assumptions of the model (e.g., independence between U_i and V_i for the same i, uniformity of U_1, and known noise moments). The term 2^{−j/2} arises from the integral of the scaling function over [0, 1].
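A short derivation of this relationship, as a sketch assuming the model reads Y_i = f(X_i)U_i + V_i with U_i ∼ U(−θ, θ) independent of V_i and X_i, and writing r = f²:

```latex
\begin{aligned}
\mathbb{E}\bigl[Y_i^2 \mid X_i\bigr]
  &= \mathbb{E}\bigl[f^2(X_i)\,U_i^2 + 2 f(X_i)\,U_i V_i + V_i^2 \mid X_i\bigr] \\
  &= f^2(X_i)\,\mathbb{E}[U_i^2] + 2 f(X_i)\,\mathbb{E}[U_i]\,\mathbb{E}[V_i] + \mathbb{E}[V_i^2] \\
  &= \frac{\theta^2}{3}\, r(X_i) + \mathbb{E}[V_1^2],
\end{aligned}
```

since E[U_i] = 0 by the symmetry of the uniform distribution about zero and E[U_i²] = θ²/3. Solving this identity for r motivates the plug-in form of the empirical coefficients.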
RESULTS AND DISCUSSION
The central theoretical result of this work establishes the convergence rate of our estimator , measured by the MISE over Besov function spaces.
Theorem 3.1. Consider the problem defined by (1.1) under the assumptions A.1-A.6, and let r ∈ B^s_{p,q}(M) with M > 0, s > 0 and p, q ≥ 1. Let the linear wavelet estimator r̂_{j0,n} be defined as in (3.1), with the resolution level j0 chosen such that 2^{j0} ≍ n^{1/(2s′+1)}, where s′ accounts for the adaptation of the Besov space to the L²-norm used in the MISE criterion, defined by s′ = s for p ≥ 2 and s′ = s + 1/2 − 1/p for p < 2.
Then, the estimator achieves the minimax-optimal convergence rate

E‖r̂_{j0,n} − r‖₂² ≤ C n^{−2s′/(2s′+1)}.
The resolution level specified in Theorem 3.1 is selected to asymptotically minimize the MISE of r̂_{j0,n} for functions within Besov spaces. The resulting convergence rate of n^{−2s′/(2s′+1)} is the minimax-optimal rate familiar from standard nonparametric regression theory (Härdle et al., 1998;[18] Tsybakov, 2009[25]). The proof of Theorem 3.1 relies on a bias-variance decomposition of the MISE, supported by intermediate results on the estimator given in (3.2), which are detailed in Lemmas 6.4 and 6.5 (Section 6). The remainder of this section addresses the practical challenge of selecting j0 without prior knowledge of the smoothness s. We adapt the 2-Fold Cross Validation (2FCV) method from Nason,[4] originally designed for wavelet thresholding, to choose the truncation level j0 for our linear estimator. A numerical study demonstrates the effectiveness of this data-driven approach.
Simulation study
In order to illustrate the empirical performance of the proposed estimator under ergodic covariates, a numerical study was conducted. To reflect a realistic application context, we implemented an automatic data-driven method for selecting the truncation parameter j0, which does not require prior knowledge of the regularity of the function to be estimated. Simulations were carried out using R, with wavelet computations supported by the 'wavethresh' package (Nason, 2023).[26] This procedure involves randomly partitioning the dataset into two equally sized subsets, constructing an estimator on one subset, and validating its performance on the other. The resolution level j0 that minimizes the average validation error across both subsets is selected. This well-established routine provides a fully data-driven and computationally efficient method for choosing the tuning parameter without prior knowledge of the function's smoothness.
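In outline, the selection routine can be sketched as follows. This is a minimal Python illustration rather than the paper's R/wavethresh implementation: it substitutes the Haar scaling basis and an idealized uniform design for the Daubechies wavelets and ergodic covariates of the study, and all function names here are hypothetical.

```python
import numpy as np

def haar_phi(j, k, x):
    """Haar scaling function phi_{j,k}(x) = 2^{j/2} * 1{k 2^{-j} <= x < (k+1) 2^{-j}}."""
    lo, hi = k * 2.0 ** (-j), (k + 1) * 2.0 ** (-j)
    return np.where((x >= lo) & (x < hi), 2.0 ** (j / 2.0), 0.0)

def fit_coeffs(x, z, j):
    """Empirical scaling coefficients c_hat_{j,k} = (1/n) sum_i z_i phi_{j,k}(x_i)."""
    return np.array([np.mean(z * haar_phi(j, k, x)) for k in range(2 ** j)])

def evaluate(coeffs, j, x):
    """Linear projection estimate: sum_k c_hat_{j,k} phi_{j,k}(x)."""
    return sum(c * haar_phi(j, k, x) for k, c in enumerate(coeffs))

def select_j0_2fcv(x, z, j_grid, seed=0):
    """2-fold CV: split the sample in two halves at random, fit on one half,
    score squared prediction error on the other, average the two directions,
    and keep the resolution level minimizing the score."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    halves = (idx[: len(x) // 2], idx[len(x) // 2:])
    scores = []
    for j in j_grid:
        err = 0.0
        for a, b in ((0, 1), (1, 0)):
            coeffs = fit_coeffs(x[halves[a]], z[halves[a]], j)
            pred = evaluate(coeffs, j, x[halves[b]])
            err += np.mean((z[halves[b]] - pred) ** 2)
        scores.append(err / 2.0)
    return j_grid[int(np.argmin(scores))], scores
```

The same two-way split-and-validate structure carries over unchanged when the Haar projection is replaced by a Daubechies transform.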
In our numerical study, we generated data from model (1.1) with a sample size of n = 4096. The covariates were simulated from a stationary AR(1) process, X_t = ρ X_{t−1} + ε_t, normalized to the interval [0, 1]. This process is known to be ergodic for |ρ| < 1, thereby satisfying our core Assumption (A.5).
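A minimal sketch of such a covariate generator, assuming Gaussian innovations and an illustrative ρ = 0.5 (the specific parameter values of the paper's R code are not reproduced here):

```python
import numpy as np

def simulate_ar1_covariates(n, rho=0.5, burn_in=500, seed=0):
    """Simulate a stationary AR(1) path X_t = rho * X_{t-1} + eps_t
    (ergodic for |rho| < 1), discard a burn-in, then min-max normalize
    the retained path to [0, 1]. rho = 0.5 is an illustrative choice."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n + burn_in)
    x = np.empty(n + burn_in)
    x[0] = eps[0] / np.sqrt(1.0 - rho ** 2)  # draw from the stationary law
    for t in range(1, n + burn_in):
        x[t] = rho * x[t - 1] + eps[t]
    x = x[burn_in:]
    return (x - x.min()) / (x.max() - x.min())
```

The min-max normalization is an affine map, so it preserves the serial dependence of the path while mapping it into the unit interval required by the model.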
In accordance with Assumption A.2, the multiplicative noise was drawn from a uniform distribution U(−θ, θ) with a fixed θ. The additive noise was drawn from a centered normal distribution.
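As a quick sanity check on this noise structure (the values of θ, σ, and f(x) below are hypothetical, chosen only for illustration), the identity E[Y² | X = x] = (θ²/3) f(x)² + E[V²] can be verified by Monte Carlo at a fixed design point:

```python
import numpy as np

# Illustrative check (not from the paper): with U ~ Uniform(-theta, theta)
# independent of V ~ N(0, sigma^2), simulate Y = f(x) U + V at a fixed x
# and compare the empirical mean of Y^2 with (theta^2/3) f(x)^2 + sigma^2.
rng = np.random.default_rng(42)
theta, sigma, fx = 1.5, 0.3, 2.0   # hypothetical values
n = 2_000_000
u = rng.uniform(-theta, theta, n)
v = rng.normal(0.0, sigma, n)
y = fx * u + v                      # Y = f(x) U + V at the fixed point x
lhs = np.mean(y ** 2)               # Monte Carlo estimate of E[Y^2 | X = x]
rhs = (theta ** 2 / 3.0) * fx ** 2 + sigma ** 2
```

The cross term 2 f(x) U V averages out because U is symmetric about zero, which is exactly the role of the symmetry requirement in A.2.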
For the wavelet construction, we selected Daubechies’ compactly-supported wavelets with 8 vanishing moments, implemented via the wavethresh package in R. We utilized the Daubechies extremal phase wavelets (filter.number=8) and employed periodic boundary handling to manage the interval [0,1].
We consider the HeaviSine and Bumps test functions for f, and the corresponding (squared) functions r are the targets we wish to estimate.
To implement the estimator with a random design, we sorted the (X_i, Y_i) pairs by X_i and then applied Mallat's algorithm to the sequentially ordered values. The truncation parameter j0 was selected using the 2-Fold Cross Validation (2FCV) method. Figure 2 displays the observations, sample estimators, and the selection criterion curves. The method successfully identifies a near-optimal truncation level, and the resulting estimator accurately captures the shape of r(x). Similar results were obtained with the Bumps function, confirming the robustness of the procedure.
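The ordering step can be sketched as follows (a Python illustration; the fast transform itself would then be applied downstream to the ordered responses, e.g., by wavethresh in the paper's R code):

```python
import numpy as np

def order_for_mallat(x, y):
    """Sort the (x_i, y_i) pairs by the covariate so that the responses can
    be passed to a fast Mallat-type pyramid transform as an ordered
    (pseudo-equispaced) signal; the transform is applied afterwards."""
    order = np.argsort(x)
    return x[order], y[order]
```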

- The two test (squared) functions to be estimated. (a) HeaviSine function, (b) Bumps function.

- (a) Mean Integrated Squared Error (log scale) comparing proposed and standard (i.i.d case) methods across increasing sample sizes (n = 512, 2048, 4096) (b) Visual comparison of true HeaviSine function against estimates from both methods in a single simulation trial (c) Average truncation level j₀ with standard deviation bars, showing stable parameter choice across sample sizes.
The empirical performance of our proposed estimator is further summarized in Figures 3 and 4, which provide a comparative analysis against a standard estimator designed for i.i.d. covariates. For both the HeaviSine [Figure 3] and Bumps [Figure 4] functions, Sub-Figure(a) displays the Mean Integrated Squared Error (MISE) on a log scale across different sample sizes, demonstrating the competitive convergence of our method. Sub-Figure(b) offers a visual comparison from a single simulation trial, illustrating that our estimator accurately captures the underlying function shape. Furthermore, Sub-Figure(c) shows the average data-driven truncation level selected by the 2FCV method, confirming its stable and adaptive selection across simulations.

- Bumps Function (a) Mean Integrated Squared Error comparison between proposed and standard methods for Bumps function across sample sizes n = 512, 2048, 4096 (b) Visual comparison of true Bumps function against estimates from both methods, demonstrating superior peak preservation by the proposed approach (c) Average truncation level j₀ with standard deviation bars, showing adaptive resolution selection for the spatially complex Bumps function.

- (a) Noisy observations (X, Y²). (b) Sample of the model collection. (c) Graph of the MSE (blue) against j0 and the (re-scaled) 2FCV criterion. (d) Typical estimations from one simulation with n = 4096. Blue lines indicate the true functions; red lines correspond to the estimators r̂_{j0,n}. 2FCV: 2-fold cross-validation.
Proofs
To prove Theorem 3.1, we use the following lemmas. We establish an upper bound for the partial sums of unbounded martingale differences. This result is crucial for analyzing the asymptotic properties of estimators built from strictly stationary and ergodic samples. Throughout, C denotes a positive constant that may change from one instance to the next. The corresponding inequalities are presented in the following lemmas.
The following well-known inequality is crucial for controlling the growth of martingale sequences in L_p spaces. We use it to bound the variance of our empirical wavelet coefficients.
Lemma 6.1. (Burkholder-Rosenthal inequality) Following Notation 1 in Burkholder.[27]
Let (S_n)_{n≥1} be a stationary martingale adapted to the filtration (F_n)_{n≥1}, let (d_i) be the sequence of martingale differences adapted to (F_i), and set S_n = Σ_{i=1}^n d_i. Then, for any positive integer n and any p ≥ 2,

‖S_n‖_p^p ≤ C_p { E[ ( Σ_{i=1}^n E[d_i² | F_{i−1}] )^{p/2} ] + Σ_{i=1}^n E[|d_i|^p] },

where, as usual, the norm ‖X‖_p = (E[|X|^p])^{1/p}.
This exponential tail bound for martingale differences is essential for establishing almost-sure convergence rates. It will be used to handle the deviation of the coefficient estimators in the uniform convergence analysis.
Lemma 6.2. (Exponential Inequality for Martingale Differences) Let (d_i)_{i≥1} be a sequence of real-valued martingale differences with respect to the filtration (F_i)_{i≥1}. Define the partial sum:

S_n = Σ_{i=1}^n d_i.

Suppose there exist a constant c > 0 and a sequence (b_i) of nonnegative real numbers such that, for all integers p ≥ 2 and all i ≥ 1, the following moment condition holds almost surely:

E[|d_i|^p | F_{i−1}] ≤ (p!/2) c^{p−2} b_i².

Then, for any ε > 0, we have the tail bound:

P(|S_n| ≥ ε) ≤ 2 exp( − ε² / (2(B_n² + cε)) ),

where B_n² = Σ_{i=1}^n b_i².
Proof of Lemma 6.2. The proof follows as a particular case of Theorem 8.2.2.[28]
This lemma establishes that our empirical wavelet coefficient is an unbiased estimator for the true coefficient, which is a direct consequence of the model assumptions and the linearity of the estimator.
Lemma 6.3. (Chesneau et al.[22])
Let ĉ_{j,k} be defined by (3.2). Then, under A.1-A.4, we have E[ĉ_{j,k}] = c_{j,k}.
Define
This result strengthens Lemma 6.3 by showing that the conditional expectation of our estimator, given the past, also converges to the true coefficient. This demonstrates the robustness of the estimator under the ergodic dependent structure.
Lemma 6.4. Let the quantity be defined by (6.2). Then, under A.1-A.6, we have
Proof of Lemma 6.4. Consider the following decomposition
Owing to Lemma 6.3, we have
Considering the structure of the filtration, we have
Therefore,
Observe that, under assumption (A.6), we have
and
Gathering statements (6.5) and (6.6), we have
where, under assumption (A.5), we have
and
Using statements (6.7) and (6.8), we deduce
By combining statements (6.3), (6.4), and (6.9), we conclude the proof of Lemma 6.4.
This lemma provides a bound on the mean-squared error of the empirical wavelet coefficients. It is the key step in the bias-variance decomposition that leads to the final MISE convergence rate in Theorem 3.1. The proof relies on our martingale framework to handle the dependence.
Lemma 6.5. Let j ≥ τ be such that 2^j ≤ n, and let ĉ_{j,k} be defined by (3.2). Then, under A.1-A.4, we have

E[(ĉ_{j,k} − c_{j,k})²] ≤ C/n.
Proof of Lemma 6.5. Consider first the decomposition
From decomposition (6.10), it follows that
Owing to Lemma 6.4 we have
On the other hand, from statements (3.2) and (6.2), we observe that
Therefore,
Lastly, notice that the summands form a sequence of martingale differences with respect to the sequence of σ-fields (F_i). Using the inequality provided by Lemma 6.1, we immediately deduce
Furthermore, one can establish:
To analyze these terms, we use a standard decomposition and note that F_0 is the trivial σ-field. Hence, one obtains
Following an argument similar to that used for (6.6), and using the statement above, we obtain
Using the independence assumptions on the random variables and A.1-A.5, observe that
Moreover, observe that
and by assumption (A.3) we have
Therefore, using similar mathematical arguments and assumption (A.1), we have
Lastly, by the independence assumption, together with (A.1) and (A.2), we have
Thus, all the terms of (6.15) are bounded from above.
Next, we analyze the second component of the decomposition in equation (6.13). Specifically, we consider
Applying a standard identity, we find
Using equations (6.14) and (6.15), it follows that
Combining the results from equations (6.16) and (6.17), we derive
Consequently,
This ends the proof of Lemma 6.5.
Proof of Theorem 3.1. The proof follows the standard bias-variance decomposition paradigm for projection estimators. The novel challenge, addressed by our martingale framework, lies in establishing the variance bound under the general ergodic setting.
We proceed via a conventional bias-variance trade-off analysis (Härdle et al., 1998).[18] The pivotal step is the application of Lemma 6.5, which provides the necessary variance bound for the coefficient estimators; this lemma's proof relies critically on our martingale framework. The optimal convergence rate is then achieved by selecting j0 to equilibrate this variance with the projection bias. To this end, we begin with the projector-based decomposition:
From the orthonormality of the wavelet basis, we obtain:

E‖r̂_{j0,n} − r‖₂² = Σ_{k ∈ Λ_{j0}} E[(ĉ_{j0,k} − c_{j0,k})²] + Σ_{j ≥ j0} Σ_{k ∈ Λ_j} d_{j,k}².
It follows from Lemma 6.5 that

Σ_{k ∈ Λ_{j0}} E[(ĉ_{j0,k} − c_{j0,k})²] ≤ C 2^{j0}/n ≤ C n^{−2s′/(2s′+1)}.
Case 1: p ≥ 2. By Hölder's inequality, we have
Case 2: p < 2. Using the embedding of B^s_{p,q} into B^{s + 1/2 − 1/p}_{2,q}, we have
In both cases,

Σ_{j ≥ j0} Σ_{k ∈ Λ_j} d_{j,k}² ≤ C n^{−2s′/(2s′+1)}.
By statements (6.18), (6.19) and (6.20), we obtain
This proves Theorem 3.1.
CONCLUSION
This paper introduces a straightforward wavelet-based method for estimating an unknown function in the presence of both additive and multiplicative noise. We extend the framework of Chesneau et al.[23] by considering a strictly stationary and ergodic covariate process, rather than the conventional i.i.d. assumption. Under the condition of a uniform multiplicative noise distribution, we develop a linear wavelet estimator that achieves a minimax-optimal convergence rate. Practical implementations of the estimator are discussed and supported by numerical experiments demonstrating its effectiveness.
Several promising directions for future research emerge from this work. The model could be generalized to accommodate an unknown parameter θ or more flexible distributions for the multiplicative noise component. Another worthwhile extension would be the development of a nonlinear wavelet estimator incorporating thresholding techniques and adaptive dependence structures, building upon the work of Chesneau et al.,[5] originally developed for additive noise settings. These research avenues present substantial opportunities for further investigation beyond the scope of this study.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
Use of artificial intelligence (AI)-assisted technology for manuscript preparation
The authors confirm that there was no use of artificial intelligence (AI)-assisted technology for assisting in the writing or editing of the manuscript and no images were manipulated using AI.
References
- Adaptive variance function estimation in heteroscedastic nonparametric regression. Ann Statist. 2008;36. Available from: https://arxiv.org/pdf/0810.4780 [Last accessed on 2025 Aug 24]
- [CrossRef] [Google Scholar]
- Minimax and minimax adaptive estimation in multiplicative regression: Locally bayesian approach. Probab Theory Relat Fields. 2012;153:543-86.
- [CrossRef] [Google Scholar]
- Nonparametric estimation. Spartacus-IDH; 2015. Available from: https://spartacus-idh.com/liseuse/030/#page/1 [Last accessed on 2025 Aug 24].
- Wavelet shrinkage using cross-validation. J Royal Stat Society: Ser B (Methodological). 1996;58:463-79.
- [CrossRef] [Google Scholar]
- Wavelet density and regression estimators for continuous time functional stationary and ergodic processes. Mathematics. 2022;10:4356.
- [CrossRef] [Google Scholar]
- Wavelet estimation of partial derivatives in multivariate regression under discrete-time stationary ergodic processes. Mathematics. 2025;13:1587.
- [CrossRef] [Google Scholar]
- Consistency results for the kernel density estimate on continuous time stationary and dependent data. Statistics & Probability Lett. 2013;83:1262-70.
- [CrossRef] [PubMed] [Google Scholar]
- Nonparametric estimation of the derivatives of a density by the method of wavelets. Bull Inform Cybern. 1996;28:91-100.
- [Google Scholar]
- Wavelet based estimation of the derivatives of a density with associated variables. Int J Pure Appl Mathematics. 2006;27:97-106.
- [Google Scholar]
- Nonparametric estimation of partial derivatives of a multivariate probability density by the method of wavelets. In: Asymptotics in statistics and probability. De Gruyter; 2000. p. 321-30.
- [Google Scholar]
- Nonparametric estimation of a multivariate probability density for mixing sequences by the method of wavelets. Italian J Pure Appl Mathematics. 2011;28:31-40.
- [Google Scholar]
- Career transitions in competitive sport. In: Morris T, Summers J, eds. Sport psychology: Theory, applications and issues. Brisbane, Australia: Wiley; 2004. p. 584-610.
- [Google Scholar]
- Wavelet estimation for derivative of a density in the presence of additive noise. Braz J Probab Stat. 2018;32:834-850.
- [CrossRef] [Google Scholar]
- Ten lectures on wavelets, 61. SIAM; 1992.
- A wavelet tour of signal processing: The sparse way. Academic Press; 1999.
- Wavelets on the interval and fast wavelet transforms. Appl Computational Harmonic Analysis. 1993;1:54-81.
- [CrossRef] [Google Scholar]
- Wavelets, Approximation, and Statistical Applications. New York: Springer; 1998.
- Wavelets and operators. In: Analysis at Urbana. Cambridge University Press; 1989. p. 256-365.
- [Google Scholar]
- Wavelet density and regression estimators for functional stationary and ergodic data: Discrete time. Mathematics. 2022;10:3433.
- [CrossRef] [Google Scholar]
- Linear wavelet-based estimators of partial derivatives of multivariate density function for stationary and ergodic continuous time processes. Entropy (Basel). 2025;27:389.
- [CrossRef] [PubMed] [PubMed Central] [Google Scholar]
- Wavelet-based estimators of partial derivatives of a multivariate density function for discrete stationary and ergodic processes. AIMS Mathematics. 2025;10:12519-53.
- [Google Scholar]
- Linear wavelet estimation in regression with additive and multiplicative noise. In: Nonparametric statistics. Springer proceedings in mathematics & statistics. Springer International Publishing; 2018. p. 135-44.
- [Google Scholar]
- Adaptive wavelet estimation of a density from mixtures under multiplicative censoring. Statistics. 2015;49:638-59.
- [CrossRef] [Google Scholar]
- Introduction to nonparametric estimation. Springer; 2009. Available from: https://link.springer.com/book/10.1007/b13794 [Last accessed on 2025 Aug 24]
- Wavethresh: Wavelets Statistics and Transforms. 2023. https://CRAN.R-project.org/package=wavethresh. R package version 4.7.2.
- Distribution function inequalities for martingales. Ann Probab. 1973;1:19-42.
- [CrossRef] [Google Scholar]
- Decoupling: from dependence to independence. New York: Springer-Verlag; 1999.

