Original Article
5(1); 85-95
doi: 10.25259/JQUS_12_2025

Linear Wavelet Estimation for Regression with Additive-Multiplicative Noise for Ergodic Data

Department of Statistics and Operations Research, College of Science, Qassim University, Buraydah, Saudi Arabia.

*Corresponding author: Dr. Sultana Didi, Department of Statistics and Operations Research, Qassim University, Buraydah, Saudi Arabia. Email: S.biha@qu.edu.sa

Licence
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, transform, and build upon the work non-commercially, as long as the author is credited and the new creations are licensed under identical terms.

How to cite this article: Didi S, Esmail TA. Linear Wavelet Estimation for Regression with Additive-multiplicative Noise for Ergodic Data. J Qassim Univ Sci. 2026;1:85-95. doi: 10.25259/JQUS_12_2025

Abstract

Objectives

This paper builds on the linear wavelet estimator for nonparametric regression models incorporating additive and multiplicative noise, originally developed under i.i.d. conditions.

Material and Methods

We extend this approach to scenarios where the covariate process is strictly stationary and ergodic. A significant advancement is the establishment of a martingale framework that elucidates the asymptotic properties of the projection estimator without requiring assumptions beyond ergodicity.

Results

We demonstrate that this estimator achieves the minimax-optimal convergence rate concerning the mean integrated square error (MISE) across Besov spaces. Additionally, we propose a data-driven technique for selecting the truncation parameter, and we validate our theoretical results through numerical simulations conducted on ergodic data.

Conclusion

This extension broadens the theoretical and practical reach of wavelet-based regression to a wide class of dependent data scenarios and provides a robust estimation framework for ergodic data common in fields like econometrics and signal processing.

Keywords

Ergodicity
Multiplicative noise
Nonparametric regression
Rates of convergence
Wavelets

INTRODUCTION

This study focuses on a unidimensional nonparametric regression framework defined by

(1.1)
$$Y_i = U_i f(X_i) + V_i, \quad i = 1, \dots, n$$

In this model, $f : [0,1] \to \mathbb{R}$ is the unknown function of interest. The covariate process $(X_i)_{1 \le i \le n}$ is strictly stationary and ergodic with values in $[0,1]$. The sequences $(U_i)_{1 \le i \le n}$ and $(V_i)_{1 \le i \le n}$ represent identically distributed multiplicative and additive noise components, respectively, with $U_i$ following a uniform distribution symmetric about zero. We maintain the assumptions of independence between $X_i$ and $U_i$, as well as between $U_i$ and $V_i$, for all indices $i$.

Our primary aim is to estimate the function $r = f^2$ from the observed data $(X_i, Y_i)_{1 \le i \le n}$, where the noise structure combines multiplicative ($U_i$) and additive ($V_i$) components. Model (1.1) extends the conventional nonparametric regression framework through the introduction of a multiplicative uniform noise component, which directly scales the unknown function $f$. Such models are widely used in applications such as signal processing (GPS signal detection with multiplicative and additive noise) and econometrics (e.g., volatility estimation).

Despite similarities to heteroscedastic regression models (see, for example, Chichignoud, Comte, and Cai et al.[1,2,3]), the model considered here differs in its stochastic assumptions. In particular, Cai et al.[1] study a fixed-design model with deterministic $V_i$, whereas our setting treats all noise components as random.

We propose a wavelet-based estimator for r, leveraging the ability of wavelets to represent complex functions. A linear wavelet estimator is constructed and shown to achieve a fast convergence rate in mean integrated square error (MISE) over Besov function classes, provided an appropriate tuning parameter is chosen. Since this parameter depends on the unknown smoothness of r, we introduce a practical data-driven selection method using an adapted version of the 2-Fold Cross Validation (2FCV) approach from Nason[4]. A simulation study demonstrates the applicability of this method.

The relaxation of the i.i.d. assumption to a strictly stationary and ergodic framework for the covariate process significantly broadens the applicability of our results. Ergodicity is a fundamental property in the theory of stochastic processes, ensuring that sample averages over time converge to their expected values across the ensemble. This makes it a natural and essential assumption for the analysis of time series data (e.g., financial returns, meteorological records) and spatially dependent processes, where observations are inherently correlated and the i.i.d. assumption is violated. While much of the existing literature on dependent data relies on strong mixing conditions,[1,3] our approach based on ergodicity offers a substantial generalization. It is well established that many important processes, such as certain long-memory time series, Markov processes, and deterministic dynamical systems, are ergodic but not necessarily mixing (Didi and Bouzebda[5,6]). By developing a martingale framework that requires only ergodicity, we circumvent these more restrictive and often harder-to-verify mixing conditions. This represents a key originality of our work, extending the reach of nonparametric wavelet estimation to a wider class of dependent data structures without compromising the achievability of the minimax-optimal convergence rate.

While kernel-based estimators have also been studied under dependent and ergodic processes (see, e.g., Didi and Louani[7] for applications in frontier analysis), our wavelet-based framework offers distinct advantages in the present context. Wavelet estimators naturally provide a multiresolution analysis, which is adept at capturing local features like discontinuities and sharp peaks that are challenging for kernel methods with a fixed bandwidth. Furthermore, the linear wavelet projection estimator employed here leads to a computationally efficient algorithm, and the theoretical derivation of minimax rates over Besov spaces is particularly well-suited to the wavelet characterization of function smoothness. This provides a clear and established path for optimality theory, which we have extended to the ergodic setting.

Examples of wavelet applications include estimating the integrated squared derivative of a univariate density for independent data (Prakasa[8]) and for negatively or positively associated sequences (Chaubey et al.[9]). The work of Prakasa[10] extended these methods to partial derivatives of multivariate densities under independence, while Hosseinioun et al.[11] and Koshkin and Vasil'iev[12] addressed the mixing scenario. More recently, Prakasa[13] studied wavelet estimators for the partial derivatives of multivariate densities under additive noise.

This paper is organized as follows. Section 2 reviews the Materials and Methods, covering the basics of wavelets and Besov balls, along with the assumptions and estimators. Section 3, titled Results and Discussion, presents the main theoretical result, the simulation study, and the proofs of our findings. The final section concludes the paper.

MATERIALS & METHODS

Basics on Wavelets and Besov Balls

This section provides a concise overview of wavelets and Besov spaces, which are central to the construction of our estimator. We employ compactly supported wavelets from the Daubechies family. For a comprehensive treatment, we refer to standard texts such as Daubechies[14] and Mallat.[15] For any resolution level $j \ge 0$, the index set is $\Lambda_j = \{0, \dots, 2^j - 1\}$. For any $k \in \Lambda_j$, the scaling and wavelet basis functions are generated via translation and dilation of the father and mother wavelets, respectively:

$$\phi_{j,k}(x) = 2^{j/2}\,\phi(2^j x - k), \qquad \psi_{j,k}(x) = 2^{j/2}\,\psi(2^j x - k)$$
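
For intuition, the dilation-translation construction above can be checked numerically with the simplest member of the family, the Haar wavelet ($\phi = \mathbf{1}_{[0,1)}$, $\psi = \mathbf{1}_{[0,1/2)} - \mathbf{1}_{[1/2,1)}$). This is an illustrative sketch, not part of the estimation procedure; it verifies the unit $L^2$ norm of $\phi_{j,k}$ and the identity $\int_0^1 \phi_{j,k}(x)\,dx = 2^{-j/2}$ used later in the proofs:

```python
import math

# Haar father and mother wavelets on [0, 1)
def phi(x):
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def psi(x):
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

# dilated/translated basis functions, as in the display above
def phi_jk(j, k, x):
    return 2 ** (j / 2) * phi(2 ** j * x - k)

def psi_jk(j, k, x):
    return 2 ** (j / 2) * psi(2 ** j * x - k)

def integrate(f, n=200000):
    # simple midpoint rule on [0, 1]
    h = 1.0 / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h

j, k = 3, 2
# unit L2 norm: integral of phi_{j,k}^2 over [0,1] is 1
norm_phi = integrate(lambda x: phi_jk(j, k, x) ** 2)
# integral of the scaling function equals 2^{-j/2}
int_phi = integrate(lambda x: phi_jk(j, k, x))
print(round(norm_phi, 3), round(int_phi, 4), round(2 ** (-j / 2), 4))
```

The same check with `psi_jk` confirms that the mother wavelet integrates to zero at every level, which is why only the scaling function contributes the $2^{-j/2}$ correction term in the estimator.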

Following the methodology of Cohen et al.,[16] there exists an integer $\tau$ such that, for any integer $j_0 \ge \tau$, the collection of functions

$$\mathcal{S} = \left\{ \phi_{j_0,k},\ k \in \Lambda_{j_0};\ \psi_{j,k},\ j \ge j_0,\ k \in \Lambda_j \right\}$$

forms an orthonormal basis of $L^2([0,1])$. Therefore, for any integer $j_0 \ge \tau$, any function $f \in L^2([0,1])$ admits the wavelet expansion:

$$f(x) = \sum_{k \in \Lambda_{j_0}} \alpha_{j_0,k}\, \phi_{j_0,k}(x) + \sum_{j = j_0}^{\infty} \sum_{k \in \Lambda_j} \beta_{j,k}\, \psi_{j,k}(x), \quad x \in [0,1]$$

where the coefficients are given by the inner products:

$$\alpha_{j_0,k} = \int_0^1 f(x)\, \phi_{j_0,k}(x)\, dx, \qquad \beta_{j,k} = \int_0^1 f(x)\, \psi_{j,k}(x)\, dx$$

A property that is instrumental in our proofs is the integral of the scaling function over its domain:

$$\int_0^1 \phi_{j,k}(x)\, dx = 2^{-j/2}.$$

Let $P_j$ denote the orthogonal projection operator from $L^2([0,1])$ onto the space spanned by the orthonormal basis $\left\{ \phi_{j,k} = 2^{j/2}\phi(2^j \cdot - k),\ k \in \Lambda_j \right\}$. Then, for any $f \in L^2([0,1])$, we have

$$P_j f(x) = \sum_{k \in \Lambda_j} \alpha_{j,k}\, \phi_{j,k}(x), \quad x \in [0,1]$$

Besov spaces $B^s_{p,q}([0,1])$ are a flexible class of function spaces that can characterize a wide range of smoothness properties, including spatially inhomogeneous behavior. For further details, see Härdle et al.,[17] Meyer,[18] and Triebel.[19] Definitions of these spaces are given below. Suppose that $\phi$ is $m$-regular (i.e., $\phi \in C^m$ and $|D^\alpha \phi(x)| \le c\,(1 + |x|^2)^{-l}$ for each $l$, with $\alpha = 0, 1, \dots, m$). Let $f \in L^2([0,1])$, $p, q \in [1, \infty]$ and $0 < s < m$. Then the following assertions are equivalent:

  1. $f \in B^s_{p,q}([0,1])$;

  2. $\left( 2^{js} \left\| P_{j+1} f - P_j f \right\|_p \right)_{j \ge \tau} \in l_q$;

  3. $\left( 2^{j\left(s - 1/p + 1/2\right)} \left\| \beta_{j,\cdot} \right\|_p \right)_{j \ge \tau} \in l_q$.

The Besov norm of $f$ can be defined by

$$\| f \|_{B^s_{p,q}} := \| \alpha_{\tau,\cdot} \|_p + \left\| \left( 2^{j\left(s - 1/p + 1/2\right)} \| \beta_{j,\cdot} \|_p \right)_{j \ge \tau} \right\|_{l_q}, \quad \text{where} \quad \| \beta_{j,\cdot} \|_p^p = \sum_{k \in \Lambda_j} |\beta_{j,k}|^p$$

Assumptions and Estimators

To facilitate the presentation of our main results, we introduce the following notation. For each integer $i$, let $\mathcal{F}_i$ be the $\sigma$-algebra generated by the collection $\{X_j : 0 \le j \le i\}$. For $i$ ranging from 1 to $n$, we denote the conditional density of $X_i$ given $\mathcal{F}_{i-1}$ by

$$f_{X_i \mid \mathcal{F}_{i-1}} = f_{\mathcal{F}_{i-1}}$$

Technical assumptions on the model (1.1) are formulated below.

A.1 We suppose that $f : [0,1] \to \mathbb{R}$ is bounded from above.

A.2 We suppose that $U_1 \sim \mathcal{U}([-\theta, \theta])$ with $\theta > 0$ a fixed real number.

A.3 We suppose that $V_1$ has a finite moment of order 4.

A.4 We suppose that $X_i$ and $V_i$ are independent for any $i \in \{1, \dots, n\}$, and that $(V_i)$ is stationary and ergodic.

A.5 For every $x \in \mathbb{R}$, the sequence of conditional densities satisfies

$$\frac{1}{n} \sum_{i=1}^{n} f_{\mathcal{F}_{i-1}}(x) \to f(x) \quad \text{as } n \to \infty,$$

both almost surely (a.s.) and in the $L^2$ sense.

A.6 The conditional mean of $Y_i^2$ given the $\sigma$-field $\mathcal{F}_i$ depends only on $X_i$, i.e., for any $i \ge 1$,

$$E\left( Y_i^2 \mid \mathcal{F}_i \right) = E\left( Y_i^2 \mid X_i \right) = \frac{\theta^2}{3}\, r(X_i) + E\left( V_1^2 \right)$$

Comments on the assumptions

A.1 (Boundedness of f): This is a standard technical condition that ensures the function r=f2  is also bounded, which is necessary to control the moments of the observations  Yi  and guarantee the finiteness of the wavelet coefficients.

A.2 (Uniform Multiplicative Noise): Specifying the uniform distribution for  Ui  is crucial as it allows us to compute the constant factor  θ2 /3  that appears in the key identity of A.6. The symmetry about zero is central to canceling the cross-term in E Yi2 |Xi .

A.3 (Additive Noise Moments): Requiring a fourth-order moment for  V1  is necessary to control the variance of our coefficient estimator  α^j,k , which involves the squared observations  Yi2 .

A.4 (Independence): The independence between  Xi  and  Vi  is crucial to ensure that the additive noise does not confound the relationship between the covariate and the function r. This allows the conditional expectation E Yi2 |Xi  to cleanly separate into a component depending on r Xi  and the constant variance of  Vi .

A.5 (Ergodicity of the Covariate Process): This is a significant generalization of the typical i.i.d. covariate assumption. The ergodic theorem ensures that time averages converge to ensemble expectations, which is the cornerstone for establishing the consistency of our sample-based estimator $\hat{\alpha}_{j,k}$. This relaxation widens the applicability of our model to many time series and spatially dependent data scenarios where mixing conditions may not hold or are difficult to verify (Didi and Bouzebda[5,6]).

A.6 (Conditional Mean Structure): This assumption is fundamental to our estimator construction. It formalizes the fact that the relevant information for predicting $Y_i^2$ is fully captured by the current covariate $X_i$, rather than the entire filtration history. This conditional independence structure is what allows us to derive the pivotal relationship $E(Y_i^2 \mid X_i) = \frac{\theta^2}{3} r(X_i) + E(V_1^2)$, which directly isolates $r(X_i)$ and forms the basis for our estimator in (3.1) and (3.2).
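
The identity in A.6 follows from a direct expansion of $Y_i^2$. A short derivation, under A.2-A.4 and with $E(U_1) = E(V_1) = 0$ (the former implied by the symmetry of $U_1$, the latter used later in the proofs), reads:

```latex
\begin{aligned}
E\left( Y_i^2 \mid X_i \right)
  &= E\left( \left( U_i f(X_i) + V_i \right)^2 \mid X_i \right) \\
  &= E\left( U_i^2 \right) f^2(X_i)
     + 2\, E\left( U_i \right) E\left( V_i \right) f(X_i)
     + E\left( V_i^2 \right) \\
  &= \frac{\theta^2}{3}\, r(X_i) + E\left( V_1^2 \right),
\end{aligned}
```

since $U_1 \sim \mathcal{U}([-\theta, \theta])$ gives $E(U_1^2) = \theta^2/3$, and the cross term vanishes because $E(U_1) = 0$.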

The extension from an i.i.d. to a strictly stationary and ergodic framework for the covariate process is theoretically justified by the ergodic theorem, which ensures that time averages converge to ensemble expectations. This forms the cornerstone for establishing the consistency of our sample-based estimator α^j,k . Methodologically, this approach is supported by modern asymptotic theory for dependent data, particularly the convergence results for empirical processes based on ergodic data as developed in Didi and Bouzebda, and Didi et al.[5,6] This relaxation significantly widens the applicability of our model to include many time series and spatially dependent data scenarios.

The reader may also refer to Didi and Bouzebda.[6,21,22]

Our linear wavelet estimator for the function r is defined by the projection:

(3.1)
$$\hat{r}_{j_0,n}(x) = \sum_{k \in \Lambda_{j_0}} \hat{\alpha}_{j_0,k}\, \phi_{j_0,k}(x), \quad x \in [0,1],$$

The empirical wavelet coefficients  α^j,k  are calculated as:

(3.2)
$$\hat{\alpha}_{j,k} = \frac{3}{\theta^2} \left( \frac{1}{n} \sum_{i=1}^{n} Y_i^2\, \phi_{j,k}(X_i) - E\left( V_1^2 \right) 2^{-j/2} \right),$$

The form of the estimator $\hat{\alpha}_{j,k}$ was originally developed by Chesneau et al.[23] to address the challenges of multiplicative noise in nonparametric regression. While straightforward to compute, the efficacy of this estimator is largely governed by the tuning parameter $j_0$. This linear wavelet approach is well established in nonparametric regression (Härdle et al.[17]), with contemporary extensions found in Chaubey et al.[24]

The form of the estimator is derived from the mathematical relationship $E(Y_i^2 \mid X_i) = \frac{\theta^2}{3} r(X_i) + E(V_1^2)$, which isolates $r(X_i)$ under the assumptions of the model (e.g., independence between $X_i$, $U_i$, $V_i$ for the same $i$, uniformity of $U_i$, and known noise moments). The term $2^{-j/2}$ arises from the integral of the scaling function $\phi_{j,k}(x)$ over $[0,1]$.
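
The mechanics of (3.2) can be illustrated with a small Monte Carlo sketch. This is not the authors' R implementation: it uses a Haar scaling function, a hypothetical toy signal `f`, an i.i.d. uniform design for simplicity, and known noise moments, and it checks that $\hat{\alpha}_{j,k}$ lands near the true coefficient $\alpha_{j,k} = \int_0^1 f^2(x)\,\phi_{j,k}(x)\,dx$:

```python
import math
import random

random.seed(0)

theta = 1.0          # multiplicative noise: U ~ Uniform(-theta, theta)
sigma_v = 0.1        # additive noise: V ~ Normal(0, sigma_v^2)
EV2 = sigma_v ** 2   # known E[V^2]
j, k = 2, 1          # Haar scaling function phi_{j,k} on [k/2^j, (k+1)/2^j)

def f(x):            # hypothetical toy signal; we estimate r = f^2
    return 1.0 + 0.5 * math.sin(2 * math.pi * x)

def phi_jk(x):       # Haar phi_{j,k}(x) = 2^{j/2} * 1{2^j x - k in [0,1)}
    y = 2 ** j * x - k
    return 2 ** (j / 2) if 0.0 <= y < 1.0 else 0.0

# true coefficient alpha_{j,k} by a midpoint rule
m = 100000
alpha_true = sum(f((i + 0.5) / m) ** 2 * phi_jk((i + 0.5) / m)
                 for i in range(m)) / m

# empirical coefficient (3.2) from one large simulated sample
n = 200000
s = 0.0
for _ in range(n):
    x = random.random()                  # i.i.d. uniform design (sketch only)
    u = random.uniform(-theta, theta)
    v = random.gauss(0.0, sigma_v)
    y = u * f(x) + v                     # model (1.1)
    s += y * y * phi_jk(x)
alpha_hat = (3 / theta ** 2) * (s / n - EV2 * 2 ** (-j / 2))
print(round(alpha_true, 3), round(alpha_hat, 3))
```

The two printed values agree up to Monte Carlo error, reflecting the unbiasedness stated in Lemma 6.3; the ergodic-covariate case replaces the i.i.d. draws with a stationary dependent sequence.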

RESULTS AND DISCUSSION

The central theoretical result of this work establishes the convergence rate of our estimator r^ j0 ,n , measured by the MISE over Besov function spaces.

Theorem 3.1. Consider the problem defined by (1.1) under assumptions A.1-A.6, and let $r \in B^s_{p,q}([0,1])$ with $p, q \in [1, \infty]$ and $s > 1/p$. Let the linear wavelet estimator $\hat{r}_{j_0,n}$ be defined as in (3.1), with the resolution level $j_0$ chosen such that $2^{j_0} \approx n^{1/(2s'+1)}$, where $s'$ accounts for the adaptation of the Besov space $B^s_{p,q}$ to the $L^2$-norm used in the MISE criterion, defined by

$$s' = s - \left( 1/p - 1/2 \right)_{+}$$

Then, the estimator achieves the minimax-optimal convergence rate

$$E\left( \int_0^1 \left( \hat{r}_{j_0,n}(x) - r(x) \right)^2 dx \right) \lesssim n^{-2s'/(2s'+1)}.$$

The resolution level $j_0$ specified in Theorem 3.1 is selected to asymptotically minimize the MISE of $\hat{r}_{j_0,n}$ for functions within Besov spaces. The resulting convergence rate of $n^{-2s'/(2s'+1)}$ is the minimax-optimal rate familiar from standard nonparametric regression theory (Härdle et al.,[18] Tsybakov[25]). The proof of Theorem 3.1 relies on a bias-variance decomposition of the MISE, supported by intermediate results on the estimator $\hat{\alpha}_{j,k}$ (given in (3.2)), which are detailed in Lemmas 6.4 and 6.5 (Section 6). The remainder of this section addresses the practical challenge of selecting $j_0$ without prior knowledge of the smoothness $s$. We adapt the 2-Fold Cross Validation (2FCV) method from Nason,[4] originally designed for wavelet thresholding, to choose the truncation level $j_0$ for our linear estimator. A numerical study demonstrates the effectiveness of this data-driven approach.

Simulation study

In order to illustrate the empirical performance of the proposed estimator under ergodic covariates, a numerical study was conducted. To reflect a realistic application context, we implemented an automatic data-driven method for selecting the truncation parameter $j_0$, which does not require prior knowledge of the regularity of the function to be estimated. Simulations were carried out using R, with wavelet computations supported by the 'wavethresh' package (Nason[26]). This procedure involves randomly partitioning the dataset into two equally sized subsets, constructing an estimator on one subset, and validating its performance on the other. The resolution level $j_0$ that minimizes the average validation error across both subsets is selected. This well-established routine provides a fully data-driven and computationally efficient method for choosing the tuning parameter without prior knowledge of the function's smoothness.

In our numerical study, we generated data from model (1.1) with a sample size of $n = 4096$. The covariates $X_i$ were simulated from a stationary AR(1) process, $X_t = \varphi X_{t-1} + \varepsilon_t$, with $\varphi = 0.5$ and $\varepsilon_t \sim N(0, 0.1)$, normalized to the interval $[0,1]$. This process is known to be ergodic for $|\varphi| < 1$, thereby satisfying our core Assumption A.5.
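
A minimal sketch of this covariate generation follows. The paper does not specify the normalization, so a min-max rescaling to $[0,1]$ is assumed here, and $N(0, 0.1)$ is read as an innovation standard deviation of 0.1 for concreteness; the function name and burn-in length are hypothetical choices:

```python
import random

def ar1_covariates(n, phi=0.5, sd=0.1, burn=500, seed=42):
    """Simulate a stationary AR(1) path X_t = phi*X_{t-1} + eps_t and
    rescale it to [0, 1] via min-max normalization (one plausible choice)."""
    rng = random.Random(seed)
    x = 0.0
    path = []
    for t in range(n + burn):
        x = phi * x + rng.gauss(0.0, sd)
        if t >= burn:          # discard burn-in so the path is near-stationary
            path.append(x)
    lo, hi = min(path), max(path)
    return [(v - lo) / (hi - lo) for v in path]

X = ar1_covariates(4096)
print(len(X), round(min(X), 3), round(max(X), 3))
```

The resulting sequence is dependent (lag-1 autocorrelation near $\varphi$) yet ergodic, which is exactly the regime covered by Assumption A.5 but not by the i.i.d. theory.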

In accordance with Assumption A.2, the multiplicative noise $U_i$ was drawn from a uniform distribution $\mathcal{U}([-\theta, \theta])$. For this numerical study, we used $\theta = 1$. The additive noise $V_i$ was drawn from a normal distribution $N(0, 0.01)$.

For the wavelet construction, we selected Daubechies’ compactly-supported wavelets with 8 vanishing moments, implemented via the wavethresh package in R. We utilized the Daubechies extremal phase wavelets (filter.number=8) and employed periodic boundary handling to manage the interval [0,1].

We consider the HeaviSine and Bumps test functions for $f$, and we wish to estimate $r = f^2$.

To implement the estimator with a random design, we sorted the $(X_i, Y_i^2)$ pairs by $X_i$ and then applied Mallat's algorithm to the sequentially ordered $Y^2$ values. The truncation parameter $j_0$ was selected using the 2-Fold Cross Validation (2FCV) method. Figure 2 displays the observations, sample estimators, and the selection criterion curves. The method successfully identifies a near-optimal truncation level, and the resulting estimator accurately captures the shape of $r(x)$. Similar results were obtained with the Bumps function, confirming the robustness of the procedure.
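
The 2FCV selection step can be sketched as follows. This is not Nason's wavethresh routine: it replaces the wavelet machinery with an equivalent piecewise-constant (Haar projection) fit on $2^{j_0}$ bins, uses an even/odd split in place of a random one, and all function names are hypothetical; it only illustrates how the validation error trades off resolution against noise:

```python
import math
import random

random.seed(1)

def haar_fit(data, j0):
    """Piecewise-constant (Haar-projection) fit of E[Y^2|X=x] on 2^j0 bins."""
    nb = 2 ** j0
    sums, cnts = [0.0] * nb, [0] * nb
    for x, y2 in data:
        b = min(int(x * nb), nb - 1)
        sums[b] += y2
        cnts[b] += 1
    overall = sum(sums) / max(len(data), 1)   # fallback for empty bins
    return [sums[b] / cnts[b] if cnts[b] else overall for b in range(nb)]

def predict(fit, x):
    nb = len(fit)
    return fit[min(int(x * nb), nb - 1)]

def twofold_cv_j0(data, levels):
    """Pick the level j0 minimizing the summed two-fold validation error."""
    half_a, half_b = data[0::2], data[1::2]   # even/odd split (sketch)
    best_j0, best_err = None, float("inf")
    for j0 in levels:
        err = 0.0
        for train, valid in ((half_a, half_b), (half_b, half_a)):
            fit = haar_fit(train, j0)
            err += sum((y2 - predict(fit, x)) ** 2 for x, y2 in valid)
        if err < best_err:
            best_j0, best_err = j0, err
    return best_j0

# toy data from model (1.1); the 2FCV target is g(x) = E[Y^2 | X = x]
f = lambda x: 1.0 + 0.5 * math.sin(2 * math.pi * x)
data = []
for _ in range(4096):
    x = random.random()
    y = random.uniform(-1, 1) * f(x) + random.gauss(0.0, 0.1)
    data.append((x, y * y))
chosen = twofold_cv_j0(data, range(1, 9))
print(chosen)
```

Too small a level underfits (large bias), too large a level overfits the validation half (large variance); the minimizer sits in between, mirroring the criterion curves shown in Figure 2.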

Figure 1: The two test (squared) functions to be estimated. (a) HeaviSine function, (b) Bumps function.
Figure 2: (a) Mean Integrated Squared Error (log scale) comparing proposed and standard (i.i.d. case) methods across increasing sample sizes (n = 512, 2048, 4096). (b) Visual comparison of the true HeaviSine function against estimates from both methods in a single simulation trial. (c) Average truncation level j₀ with standard deviation bars, showing stable parameter choice across sample sizes.

The empirical performance of our proposed estimator is further summarized in Figures 3 and 4, which provide a comparative analysis against a standard estimator designed for i.i.d. covariates. For both the HeaviSine [Figure 3] and Bumps [Figure 4] functions, Sub-Figure(a) displays the Mean Integrated Squared Error (MISE) on a log scale across different sample sizes, demonstrating the competitive convergence of our method. Sub-Figure(b) offers a visual comparison from a single simulation trial, illustrating that our estimator accurately captures the underlying function shape. Furthermore, Sub-Figure(c) shows the average data-driven truncation level  j0 ​ selected by the 2FCV method, confirming its stable and adaptive selection across simulations.

Figure 3: Bumps function. (a) Mean Integrated Squared Error comparison between proposed and standard methods for the Bumps function across sample sizes n = 512, 2048, 4096. (b) Visual comparison of the true Bumps function against estimates from both methods, demonstrating superior peak preservation by the proposed approach. (c) Average truncation level j₀ with standard deviation bars, showing adaptive resolution selection for the spatially complex Bumps function.
Figure 4: (a) Noisy observations (X, Y²). (b) Sample of the model collection. (c) Graph of the MSE (blue) against j₀ and the (re-scaled) 2FCV criterion. (d) Typical estimations from one simulation with n = 4096. Blue lines indicate the true functions; red lines correspond to the estimators r̂_{j₀,n}. FCV: Fold cross validation.

Proofs

To prove Theorem 3.1, we use the following lemmas. We establish a new upper bound for the partial sums of unbounded martingale differences. This result is crucial for analyzing the asymptotic properties of estimators derived from strictly stationary and ergodic samples. Throughout, $C$ denotes a positive constant that may change from one instance to the next. The corresponding inequality is presented in the following lemmas.

The following well-known inequality is crucial for controlling the growth of martingale sequences in Lp spaces. We use it to bound the variance of our empirical wavelet coefficients.

Lemma 6.1. (Burkholder-Rosenthal inequality) Following Notation 1 in Burkholder.[27]

Let $(X_i)_{i \ge 1}$ be a stationary martingale adapted to the filtration $(\mathcal{F}_i)_{i \ge 1}$, let $(d_i)_{i \ge 1}$ be the sequence of martingale differences adapted to $(\mathcal{F}_i)_{i \ge 1}$, and let $S_n = \sum_{i=1}^{n} d_i$. Then, for any positive integer $n$,

(6.1)
$$\left\| \max_{1 \le j \le n} |S_j| \right\|_p \lesssim n^{1/p} \left\| d_1 \right\|_p + \left\| \sum_{k=1}^{n} E\left( d_k^2 \mid \mathcal{F}_{k-1} \right) \right\|_{p/2}^{1/2}, \quad \text{for any } p \ge 2,$$

where, as usual, $\| \cdot \|_p = \left( E | \cdot |^p \right)^{1/p}$.

This exponential tail bound for martingale differences is essential for establishing almost-sure convergence rates. It will be used to handle the deviation of the coefficient estimators in the uniform convergence analysis.

Lemma 6.2. (Exponential Inequality for Martingale Differences) Let $(Z_n)_{n \ge 1}$ be a sequence of real-valued martingale differences with respect to the filtration $\mathcal{F}_n = \sigma(Z_1, \dots, Z_n)$, $n \ge 1$. Define the partial sum:

$$S_n = \sum_{i=1}^{n} Z_i$$

Suppose there exist a constant $C > 0$ and a sequence $(d_n)_{n \ge 1}$ of nonnegative real numbers such that, for all $p \ge 2$ and $n \ge 1$, the following moment condition holds:

$$E\left( |Z_n|^p \mid \mathcal{F}_{n-1} \right) \le C^{p-1}\, p!\, d_n^2, \quad \text{almost surely.}$$

Then, for any $\varepsilon > 0$, we have the tail bound:

$$P\left( |S_n| > \varepsilon \right) \le 2 \exp\left( - \frac{\varepsilon^2}{2\left( D_n + C \varepsilon \right)} \right)$$

where $D_n = \sum_{i=1}^{n} d_i^2$.

Proof of Lemma 6.2. The proof follows as a particular case of Theorem 8.2.2 in [28].

This lemma establishes that our empirical wavelet coefficient is an unbiased estimator for the true coefficient, which is a direct consequence of the model assumptions and the linearity of the estimator.

Lemma 6.3. (Chesneau et al.[22])

Let $j \ge \tau$, $k \in \Lambda_j$, and let $\hat{\alpha}_{j,k}$ be defined by (3.2). Then, under A.1-A.4, we have

$$E\left( \hat{\alpha}_{j,k} \right) = \alpha_{j,k}.$$

Define

(6.2)
$$\tilde{\alpha}_{j,k,n} = \frac{3}{\theta^2} \left( \frac{1}{n} \sum_{l=1}^{n} E\left( Y_l^2\, \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) - E\left( V_1^2 \right) 2^{-j/2} \right),$$

This result strengthens Lemma 6.3 by showing that the conditional expectation of our estimator, given the past, also converges to the true coefficient. This demonstrates the robustness of the estimator under the ergodic dependent structure.

Lemma 6.4. Let $j \ge \tau$, $k \in \Lambda_j$, and let $\tilde{\alpha}_{j,k,n}$ be defined by (6.2). Then, under A.1-A.6, we have

$$\tilde{\alpha}_{j,k,n} \to \alpha_{j,k}, \quad \text{as } n \to \infty$$

Proof of Lemma 6.4. Consider the following decomposition

(6.3)
$$\tilde{\alpha}_{j,k,n} - \alpha_{j,k} = \left( \tilde{\alpha}_{j,k,n} - E\left( \hat{\alpha}_{j,k} \right) \right) + \left( E\left( \hat{\alpha}_{j,k} \right) - \alpha_{j,k} \right) = E_{j,k,1} + E_{j,k,2},$$

Owing to Lemma 6.3 we have

(6.4)
$$E_{j,k,2} = o(1).$$

Considering the structure of the $\sigma$-filtration $\mathcal{F}_l$, we have

$$\mathcal{F}_{l-1} \subset \mathcal{F}_l$$

therefore,

$$E_{j,k,1} = \tilde{\alpha}_{j,k,n} - E\left( \hat{\alpha}_{j,k} \right) = \frac{3}{\theta^2} \frac{1}{n} \sum_{l=1}^{n} \left[ E\left( Y_l^2\, \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) - E\left( Y_l^2\, \phi_{j,k}(X_l) \right) \right] = \frac{3}{\theta^2} \frac{1}{n} \sum_{l=1}^{n} \left[ E\left( E\left( Y_l^2 \mid \mathcal{F}_l \right) \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) - E\left( E\left( Y_l^2 \mid \mathcal{F}_l \right) \phi_{j,k}(X_l) \right) \right]$$

Observe that, under assumption A.6, we have

(6.5)
$$E\left( E\left( Y_l^2 \mid \mathcal{F}_l \right) \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) = E\left( E\left( Y_l^2 \mid X_l \right) \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) = E\left( \left( \frac{\theta^2}{3} r(X_l) + E\left( V_1^2 \right) \right) \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) = \frac{\theta^2}{3} E\left( r(X_l)\, \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) + E\left( V_1^2 \right) E\left( \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right),$$

and

(6.6)
$$E\left( E\left( Y_l^2 \mid \mathcal{F}_l \right) \phi_{j,k}(X_l) \right) = E\left( E\left( Y_l^2 \mid X_l \right) \phi_{j,k}(X_l) \right) = \frac{\theta^2}{3} E\left( r(X_l)\, \phi_{j,k}(X_l) \right) + E\left( V_1^2 \right) E\left( \phi_{j,k}(X_l) \right),$$

Gathering statements (6.5) and (6.6), we have

$$E_{j,k,1} = \frac{1}{n} \sum_{l=1}^{n} \left[ E\left( r(X_l)\, \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) - E\left( r(X_l)\, \phi_{j,k}(X_l) \right) \right] + E\left( V_1^2 \right) \frac{3}{\theta^2} \frac{1}{n} \sum_{l=1}^{n} \left[ E\left( \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) - E\left( \phi_{j,k}(X_l) \right) \right] = E_{j,k,1,1} + E_{j,k,1,2}$$

where, under assumption A.5, we have

(6.7)
$$E_{j,k,1,1} = \frac{1}{n} \sum_{l=1}^{n} \left[ E\left( r(X_l)\, \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) - E\left( r(X_l)\, \phi_{j,k}(X_l) \right) \right] = \int_0^1 r(x)\, \phi_{j,k}(x) \left( \frac{1}{n} \sum_{l=1}^{n} f_{\mathcal{F}_{l-1}}(x) - f(x) \right) dx = o(1)\, \alpha_{j,k} = o(1).$$

and

(6.8)
$$E_{j,k,1,2} = E\left( V_1^2 \right) \frac{3}{\theta^2} \frac{1}{n} \sum_{l=1}^{n} \left[ E\left( \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) - E\left( \phi_{j,k}(X_l) \right) \right] = E\left( V_1^2 \right) \frac{3}{\theta^2} \int_0^1 \phi_{j,k}(x) \left( \frac{1}{n} \sum_{l=1}^{n} f_{\mathcal{F}_{l-1}}(x) - f(x) \right) dx = o(1)\, \frac{3 E\left( V_1^2 \right) 2^{-j/2}}{\theta^2} = o(1).$$

Using statements (6.7) and (6.8), we deduce

(6.9)
$$E_{j,k,1} = o(1).$$

By combining statements (6.3), (6.4) and (6.9), we conclude the proof of Lemma 6.4.

This lemma provides a bound on the mean-squared error of the empirical wavelet coefficients. It is the key step in the bias-variance decomposition that leads to the final MISE convergence rate in Theorem 3.1. The proof relies on our martingale framework to handle the dependence.

Lemma 6.5. Let $j \ge \tau$ such that $2^j \le n$, $k \in \Lambda_j$, and let $\hat{\alpha}_{j,k}$ be defined by (3.2). Then, under A.1-A.4, we have

$$E\left( \hat{\alpha}_{j,k} - \alpha_{j,k} \right)^2 \lesssim \frac{1}{n}.$$

Proof of Lemma 6.5. Consider first the decomposition

(6.10)
$$\hat{\alpha}_{j,k} - \alpha_{j,k} = \left( \hat{\alpha}_{j,k} - \tilde{\alpha}_{j,k,n} \right) + \left( \tilde{\alpha}_{j,k,n} - \alpha_{j,k} \right)$$

From decomposition (6.10), it follows that

$$E\left( \hat{\alpha}_{j,k} - \alpha_{j,k} \right)^2 = E\left( \hat{\alpha}_{j,k} - \tilde{\alpha}_{j,k,n} \right)^2 + 2\, E\left[ \left( \hat{\alpha}_{j,k} - \tilde{\alpha}_{j,k,n} \right) \left( \tilde{\alpha}_{j,k,n} - \alpha_{j,k} \right) \right] + E\left( \tilde{\alpha}_{j,k,n} - \alpha_{j,k} \right)^2 = A_{j,k,1} + A_{j,k,2} + A_{j,k,3},$$

Owing to Lemma 6.4 we have

(6.11)
$$A_{j,k,3} = o(1).$$

On the other hand, from statements (3.2) and (6.2), we observe that

$$\hat{\alpha}_{j,k} - \tilde{\alpha}_{j,k,n} = \frac{3}{\theta^2} \frac{1}{n} \sum_{l=1}^{n} \left[ Y_l^2\, \phi_{j,k}(X_l) - E\left( Y_l^2\, \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) \right] = \frac{3}{\theta^2} \frac{1}{n} \sum_{l=1}^{n} Z_{j,k,l},$$

Therefore,

(6.12)
$$A_{j,k,2} = 2\, E\left[ \left( \hat{\alpha}_{j,k} - \tilde{\alpha}_{j,k,n} \right) \left( \tilde{\alpha}_{j,k,n} - \alpha_{j,k} \right) \right] = o(1)\, \frac{3}{\theta^2} \frac{1}{n} \sum_{l=1}^{n} E\left[ Y_l^2\, \phi_{j,k}(X_l) - E\left( Y_l^2\, \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) \right] = 0.$$

Lastly, notice that $(Z_{j,k,l})_{1 \le l \le n}$ is a sequence of martingale differences with respect to the sequence of $\sigma$-fields $(\mathcal{F}_l)_{1 \le l \le n}$. Using the inequality provided by Lemma 6.1, we immediately deduce

$$A_{j,k,1} = \frac{9}{\theta^4} \frac{1}{n^2}\, E\left( \sum_{l=1}^{n} Z_{j,k,l} \right)^2$$

Furthermore, one can establish:

(6.13)
$$\left( E\left( \sum_{l=1}^{n} Z_{j,k,l} \right)^2 \right)^{1/2} \lesssim \sqrt{n}\, \left\| Z_{j,k,1} \right\|_2 + \left\| \sum_{l=1}^{n} E\left( Z_{j,k,l}^2 \mid \mathcal{F}_{l-1} \right) \right\|_1^{1/2} = \Phi_1 + \Phi_2.$$

To analyze these terms, we use a standard decomposition and note that F0 is the trivial σ-field. Hence, one obtains

(6.14)
$$\frac{1}{n}\, \Phi_1^2 = E\left[ \left( Y_1^2\, \phi_{j,k}(X_1) - E\left( Y_1^2\, \phi_{j,k}(X_1) \mid \mathcal{F}_0 \right) \right)^2 \right] \le 2\, E\left[ \left( Y_1^2\, \phi_{j,k}(X_1) \right)^2 \right] + 2\, E\left[ E\left( Y_1^2\, \phi_{j,k}(X_1) \mid \mathcal{F}_0 \right)^2 \right] \le 4\, E\left[ \left( Y_1^2\, \phi_{j,k}(X_1) \right)^2 \right],$$

Following a similar argument as in (6.6), and using the statement $r = f^2$, we obtain

(6.15)
$$E\left[ \left( Y_1^2\, \phi_{j,k}(X_1) \right)^2 \right] = E\left[ \left( U_1 f(X_1) + V_1 \right)^4 \phi_{j,k}^2(X_1) \right] = E\left[ \left( U_1^4 f^4(X_1) + 4 U_1^3 V_1 f^3(X_1) + 6 U_1^2 V_1^2 f^2(X_1) + 4 U_1 V_1^3 f(X_1) + V_1^4 \right) \phi_{j,k}^2(X_1) \right],$$

Using the independence assumptions on the random variables A.1-A.5, together with $E(U_1) = 0$, observe that

$$E\left[ U_1 V_1^3 f(X_1)\, \phi_{j,k}^2(X_1) \right] = E\left( U_1 \right) E\left( V_1^3 \right) E\left[ f(X_1)\, \phi_{j,k}^2(X_1) \right] = 0.$$

Moreover, since $E(V_1) = 0$, observe that

$$E\left[ U_1^3 V_1 f^3(X_1)\, \phi_{j,k}^2(X_1) \right] = E\left( U_1^3 \right) E\left( V_1 \right) E\left[ f^3(X_1)\, \phi_{j,k}^2(X_1) \right] = 0,$$

and by assumption (A.3) we have

$$E\left[ V_1^4\, \phi_{j,k}^2(X_1) \right] = E\left( V_1^4 \right) E\left[ \phi_{j,k}^2(X_1) \right] = E\left( V_1^4 \right) \int_0^1 \phi_{j,k}^2(x)\, dx = E\left( V_1^4 \right) \lesssim 1.$$

Therefore, using similar mathematical arguments with $E(U_1^2) = \theta^2/3$ and assumption A.1, we have

$$E\left[ U_1^2 V_1^2 f^2(X_1)\, \phi_{j,k}^2(X_1) \right] = \frac{\theta^2}{3} E\left( V_1^2 \right) E\left[ f^2(X_1)\, \phi_{j,k}^2(X_1) \right] \le C \| f \|_\infty^2\, \frac{\theta^2}{3} E\left( V_1^2 \right).$$

Lastly, by the independence assumption between $U_1$ and $X_1$, together with A.1 and A.2, we have

$$E\left[ U_1^4 f^4(X_1)\, \phi_{j,k}^2(X_1) \right] = E\left( U_1^4 \right) E\left[ f^4(X_1)\, \phi_{j,k}^2(X_1) \right] \lesssim E\left( U_1^4 \right) = \frac{\theta^4}{5}.$$

Thus, all the terms of (6.15) are bounded from above, and therefore

(6.16)
$$\Phi_1 = O\left( n^{1/2} \right).$$

Next, we analyze the second component of the decomposition in equation (6.13). Specifically, we consider

$$\Phi_2 = \left( E\left( \sum_{l=1}^{n} E\left( Z_{j,k,l}^2 \mid \mathcal{F}_{l-1} \right) \right) \right)^{1/2} = \left( \sum_{l=1}^{n} E\left( E\left( Z_{j,k,l}^2 \mid \mathcal{F}_{l-1} \right) \right) \right)^{1/2} = \left( \sum_{l=1}^{n} E\left( Z_{j,k,l}^2 \right) \right)^{1/2}.$$

Applying a standard identity, we find

$$E\left( Z_{j,k,l}^2 \right) = E\left[ \left( Y_l^2\, \phi_{j,k}(X_l) - E\left( Y_l^2\, \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right) \right)^2 \right] \le 2\, E\left[ \left( Y_l^2\, \phi_{j,k}(X_l) \right)^2 \right] + 2\, E\left[ E\left( Y_l^2\, \phi_{j,k}(X_l) \mid \mathcal{F}_{l-1} \right)^2 \right] \le 4\, E\left[ \left( Y_l^2\, \phi_{j,k}(X_l) \right)^2 \right].$$

Using equations (6.14) and (6.15), it follows that

(6.17)
$$\Phi_2 = O\left( n^{1/2} \right).$$

Combining the results from equations (6.16) and (6.17), we derive

$$\left( E\left( \sum_{l=1}^{n} Z_{j,k,l} \right)^2 \right)^{1/2} = O\left( n^{1/2} \right).$$

Consequently,

$$A_{j,k,1} = E\left( \hat{\alpha}_{j,k} - \tilde{\alpha}_{j,k,n} \right)^2 = \frac{9}{\theta^4} \frac{1}{n^2}\, E\left( \sum_{l=1}^{n} Z_{j,k,l} \right)^2 = O\left( n^{-1} \right).$$

This ends the proof of Lemma 6.5.

Proof of Theorem 3.1. The proof follows the standard bias-variance decomposition paradigm for projection estimators. The novel challenge, addressed by our martingale framework, lies in establishing the variance bound under the general ergodic setting.

We proceed via a conventional bias-variance trade-off analysis (Härdle et al.[18]). The pivotal step is the application of Lemma 6.5, which provides the necessary variance bound for the coefficient estimators; this lemma's proof relies critically on our martingale framework. The optimal convergence rate is then achieved by selecting $j_0$ to equilibrate this variance with the projection bias. To this end, we begin with the projector-based decomposition:

(6.18)
$$E\left( \int_0^1 \left( \hat{r}_{j_0,n}(x) - r(x) \right)^2 dx \right) = E \left\| \hat{r}_{j_0,n} - P_{j_0} r \right\|_2^2 + \left\| P_{j_0} r - r \right\|_2^2;$$

From the orthonormality of the wavelet basis, we obtain:

$$E \left\| \hat{r}_{j_0,n} - P_{j_0} r \right\|_2^2 = E \left\| \sum_{k \in \Lambda_{j_0}} \left( \hat{\alpha}_{j_0,k} - \alpha_{j_0,k} \right) \phi_{j_0,k} \right\|_2^2 = \sum_{k \in \Lambda_{j_0}} E\left( \hat{\alpha}_{j_0,k} - \alpha_{j_0,k} \right)^2$$

It follows from Lemma 6.5, together with $|\Lambda_{j_0}| \lesssim 2^{j_0}$ and $2^{j_0} \approx n^{1/(2s'+1)}$, that

(6.19)
$$E \left\| \hat{r}_{j_0,n} - P_{j_0} r \right\|_2^2 \lesssim \frac{2^{j_0}}{n} \lesssim n^{-2s'/(2s'+1)};$$

Case 1: $p \ge 2$, $s' = s$. By the Hölder inequality and $r \in B^s_{p,q}([0,1])$, we have

$$\left\| P_{j_0} r - r \right\|_2^2 \lesssim \left\| P_{j_0} r - r \right\|_p^2 \lesssim 2^{-2 j_0 s} \lesssim n^{-2s/(2s+1)};$$

Case 2: $1 \le p < 2$, $s > 1/p$. Using the embedding $B^s_{p,q}([0,1]) \subset B^{s'}_{2,\infty}([0,1])$, we have

(6.20)
$$\left\| P_{j_0} r - r \right\|_2^2 \lesssim \sum_{j = j_0}^{\infty} 2^{-2 j s'} \lesssim 2^{-2 j_0 s'} \lesssim n^{-2s'/(2s'+1)};$$

In both cases,

$$\left\| P_{j_0} r - r \right\|_2^2 \lesssim n^{-2s'/(2s'+1)};$$

By statements (6.18), (6.19) and (6.20), we obtain

$$E\left( \int_0^1 \left( \hat{r}_{j_0,n}(x) - r(x) \right)^2 dx \right) \lesssim n^{-2s'/(2s'+1)}.$$

This proves Theorem 3.1.

CONCLUSION

This paper introduces a straightforward wavelet-based method for estimating an unknown function in the presence of both additive and multiplicative noise. We extend the framework of Chesneau et al.[6] by considering a strictly stationary and ergodic covariate process, rather than the conventional i.i.d. assumption. Under the condition of a uniform multiplicative noise distribution, we develop a linear wavelet estimator that achieves a minimax-optimal convergence rate. Practical implementations of the estimator are discussed and supported by numerical experiments demonstrating its effectiveness.

Several promising directions for future research emerge from this work. The model could be generalized to accommodate an unknown parameter $\theta$ or more flexible distributions for the multiplicative noise component. Another worthwhile extension would be the development of a nonlinear wavelet estimator incorporating thresholding techniques and adaptive dependence structures, building upon the work of Chesneau et al.,[5] originally developed for additive noise settings. These research avenues present substantial opportunities for further investigation beyond the scope of this study.

Financial support and sponsorship

Nil.

Conflicts of interest

There are no conflicts of interest.

Use of artificial intelligence (AI)-assisted technology for manuscript preparation

The authors confirm that there was no use of artificial intelligence (AI)-assisted technology for assisting in the writing or editing of the manuscript and no images were manipulated using AI.

References

  1. Adaptive variance function estimation in heteroscedastic nonparametric regression. Ann Statist. 2008;36. Available from: https://arxiv.org/pdf/0810.4780 [Last accessed on 2025 Aug 24].
  2. Minimax and minimax adaptive estimation in multiplicative regression: Locally Bayesian approach. Probab Theory Relat Fields. 2012;153:543-86.
  3. Nonparametric estimation. Spartacus-IDH; 2015. Available from: https://spartacus-idh.com/liseuse/030/#page/1 [Last accessed on 2025 Aug 24].
  4. Wavelet shrinkage using cross-validation. J R Stat Soc Ser B (Methodological). 1996;58:463-79.
  5. Wavelet density and regression estimators for continuous time functional stationary and ergodic processes. Mathematics. 2022;10:4356.
  6. Wavelet estimation of partial derivatives in multivariate regression under discrete-time stationary ergodic processes. Mathematics. 2025;13:1587.
  7. Consistency results for the kernel density estimate on continuous time stationary and dependent data. Stat Probab Lett. 2013;83:1262-70.
  8. Nonparametric estimation of the derivatives of a density by the method of wavelets. Bull Inform Cybern. 1996;28:91-100.
  9. Wavelet based estimation of the derivatives of a density with associated variables. Int J Pure Appl Math. 2006;27:97-106.
  10. Nonparametric estimation of partial derivatives of a multivariate probability density by the method of wavelets. In: Asymptotics in statistics and probability. De Gruyter; p. 321-30.
  11. Nonparametric estimation of a multivariate probability density for mixing sequences by the method of wavelets. Ital J Pure Appl Math. 2011;28:31-40.
  12. Career transitions in competitive sport. In: Sport psychology: Theory, applications and issues. Brisbane, Australia: Wiley; p. 584-610.
  13. Wavelet estimation for derivative of a density in the presence of additive noise. Braz J Probab Stat. 2018;32:834-50.
  14. Ten lectures on wavelets. Vol. 61. SIAM.
  15. A wavelet tour of signal processing: The sparse way. Academic Press.
  16. Wavelets on the interval and fast wavelet transforms. Appl Comput Harmon Anal. 1993;1:54-81.
  17. Wavelets, approximation, and statistical applications. New York: Springer.
  18. Wavelets and operators. In: Analysis at Urbana. Cambridge University Press; p. 256-365.
  19. Theory of function spaces II. Bull Am Math Soc. 1994;31:119-25.
  20. Wavelet density and regression estimators for functional stationary and ergodic data: Discrete time. Mathematics. 2022;10:3433.
  21. Linear wavelet-based estimators of partial derivatives of multivariate density function for stationary and ergodic continuous time processes. Entropy (Basel). 2025;27:389.
  22. Wavelet-based estimators of partial derivatives of a multivariate density function for discrete stationary and ergodic processes. AIMS Math. 2025;10:12519-53.
  23. Linear wavelet estimation in regression with additive and multiplicative noise. In: Nonparametric statistics. Springer Proceedings in Mathematics & Statistics. Springer International Publishing; p. 135-44.
  24. Adaptive wavelet estimation of a density from mixtures under multiplicative censoring. Statistics. 2015;49:638-59.
  25. Introduction to nonparametric estimation. Springer. Available from: https://link.springer.com/book/10.1007/b13794 [Last accessed on 2025 Aug 24].
  26. wavethresh: Wavelets statistics and transforms. R package version 4.7.2. Available from: https://CRAN.R-project.org/package=wavethresh.
  27. Distribution function inequalities for martingales. Ann Probab. 1973;1:19-42.
  28. Decoupling: From dependence to independence. New York: Springer-Verlag.