Uniformly Lipschitz

Wavelet Zoom

Stéphane Mallat, in A Wavelet Tour of Signal Processing (Third Edition), 2009

6.5 EXERCISES

6.1

2 Lipschitz regularity:

(a)

Prove that if f is uniformly Lipschitz α on [a, b], then it is pointwise Lipschitz α at all t₀ ∈ [a, b].

(b)

Show that f(t) = t sin(t⁻¹) is Lipschitz 1 at all t₀ ∈ [−1, 1] and verify that it is uniformly Lipschitz α over [−1, 1] only for α ≤ 1/2. Hint: consider the points tₙ = ((n + 1/2) π)⁻¹.
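A quick numerical check of the hint (the exponents and indices tried below are illustrative choices): at tₙ = ((n + 1/2)π)⁻¹ one has sin(tₙ⁻¹) = ±1, so the difference quotient |f(tₙ) − f(tₙ₊₁)| / |tₙ − tₙ₊₁|^α grows like n^(2α−1), which blows up exactly when α > 1/2.

```python
import numpy as np

def f(t):
    return t * np.sin(1.0 / t)

def quotient(n, alpha):
    # consecutive points where sin(1/t) equals +1 and -1
    tn  = 1.0 / ((n + 0.5) * np.pi)
    tn1 = 1.0 / ((n + 1.5) * np.pi)
    return abs(f(tn) - f(tn1)) / abs(tn - tn1) ** alpha

# alpha = 0.8 > 1/2: the quotient grows without bound as n increases
print(quotient(10, 0.8), quotient(1000, 0.8))
# alpha = 0.5: the quotient stays bounded (close to 2/sqrt(pi))
print(quotient(10, 0.5), quotient(1000, 0.5))
```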

6.2

2 Regularity of derivatives:

(a)

Prove that f is uniformly Lipschitz α > 1 over [a, b] if and only if f′ is uniformly Lipschitz α − 1 over [a, b].

(b)

Show that f may be pointwise Lipschitz α > 1 at t₀ while f′ is not pointwise Lipschitz α − 1 at t₀. Consider f(t) = t² cos(t⁻¹) at t = 0.

6.3

2 Find a function f(t) that is uniformly Lipschitz 1 but does not satisfy the sufficient Fourier condition (6.1).

6.4

1 Let f(t) = cos(ω₀ t) and ψ(t) be a wavelet that is symmetric about 0.

(a)

Verify that

Wf(u, s) = \sqrt{s}\, \hat{\psi}(s \omega_0) \cos(\omega_0 u).

(b)

Find the equations of the curves of wavelet modulus maxima in the time-scale plane (u, s). Relate the decay of |Wf(u, s)| along these curves to the number n of vanishing moments of ψ.
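The identity in (a), Wf(u, s) = √s ψ̂(sω₀) cos(ω₀ u), can be checked numerically. The sketch below uses the (unnormalized) Mexican hat ψ(t) = (1 − t²)e^(−t²/2), whose Fourier transform is ψ̂(ω) = √(2π) ω² e^(−ω²/2), as an example of a wavelet symmetric about 0; the parameter values are arbitrary.

```python
import numpy as np

def psi(t):
    # Mexican hat: symmetric about 0, unnormalized
    return (1 - t**2) * np.exp(-t**2 / 2)

def psi_hat(w):
    # its Fourier transform sqrt(2 pi) w^2 exp(-w^2 / 2)
    return np.sqrt(2 * np.pi) * w**2 * np.exp(-w**2 / 2)

def W(u, s, w0):
    # Wf(u, s) = int cos(w0 t) s^{-1/2} psi((t - u)/s) dt, by Riemann sum
    t = np.arange(-40.0, 40.0, 1e-3)
    return np.sum(np.cos(w0 * t) * psi((t - u) / s)) * 1e-3 / np.sqrt(s)

u, s, w0 = 0.3, 0.5, 5.0
lhs = W(u, s, w0)
rhs = np.sqrt(s) * psi_hat(s * w0) * np.cos(w0 * u)
print(lhs, rhs)  # agree up to quadrature error
```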

6.5

1 Let f(t) = |t|^α. Show that Wf(u, s) = s^(α+1/2) Wf(u/s, 1). Prove that it is not sufficient to measure the decay of |Wf(u, s)| when s goes to zero at u = 0 in order to compute the Lipschitz regularity of f at t = 0.
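The scaling identity follows from the change of variable t = sv in the wavelet integral; below is a numerical sketch with a Mexican-hat wavelet and arbitrarily chosen parameter values (not from the text).

```python
import numpy as np

def psi(t):
    # symmetric Mexican-hat wavelet (unnormalized)
    return (1 - t**2) * np.exp(-t**2 / 2)

def W(u, s, alpha):
    # Wf(u, s) = int |t|^alpha s^{-1/2} psi((t - u)/s) dt, by Riemann sum
    t = np.arange(-50.0, 50.0, 1e-3)
    return np.sum(np.abs(t) ** alpha * psi((t - u) / s)) * 1e-3 / np.sqrt(s)

alpha, u, s = 0.5, 0.4, 2.0
lhs = W(u, s, alpha)
rhs = s ** (alpha + 0.5) * W(u / s, 1.0, alpha)
print(lhs, rhs)  # equal up to quadrature error
```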

6.6

3 Let f(t) = |t|^α sin(|t|^(−β)) with α > 0 and β > 0. What is the pointwise Lipschitz regularity of f and f′ at t = 0? Find the equation of the ridge curve in the (u, s) plane along which the high-amplitude wavelet coefficients |Wf(u, s)| converge to t = 0 when s goes to zero. Compute the maximum values of α and α′ such that Wf(u, s) satisfies (6.21).

6.7

2 For a complex wavelet, we call lines of constant phase the curves in the (u, s) plane along which the complex phase of Wf(u, s) remains constant when s varies.

(a)

If f(t) = |t|^α, prove that the lines of constant phase converge toward the singularity at t = 0 when s goes to zero. Verify this numerically.

(b)

Let ψ be a real wavelet and Wf(u, s) be the real wavelet transform of f. Show that the modulus maxima of Wf(u, s) correspond to lines of constant phase of an analytic wavelet transform, which is calculated with a particular analytic wavelet ψa that you will specify.

6.8

3 Prove that if f = 1_[0, +∞), then the number of modulus maxima of Wf(u, s) at each scale s is larger than or equal to the number of vanishing moments of ψ.

6.9

2 The spectrum of singularity of the Riemann function

f(t) = \sum_{n=1}^{+\infty} \frac{1}{n^2} \sin(n^2 t)

is defined on its support by D(α) = 4α − 2 if α ∈ [1/2, 3/4] and D(3/2) = 0 [304, 313]. Verify this result numerically by computing this spectrum from the partition function of a wavelet transform modulus maxima.

6.10

3 Let ψ = −θ′, where θ is a positive window of compact support. If f is a Cantor devil's staircase, prove that there exist lines of modulus maxima that converge toward each singularity.

6.11

3 Implement an algorithm that detects oscillating singularities by following the ridges of an analytic wavelet transform when the scale s decreases. Test your algorithm on f(t) = sin(t⁻¹).

6.12

2 Implement an algorithm that reconstructs a signal from the local maxima of its dyadic wavelet transform with a dual synthesis (6.48) using a conjugate-gradient algorithm.

6.13

3 Let X[n] = f[n] + W[n] be a signal of size N, where W is a Gaussian white noise of variance σ². Implement in WaveLab an estimator of f that thresholds at T = λσ the maxima of a dyadic wavelet transform of X. The estimation of f is reconstructed from the thresholded maxima representation with the dual synthesis (6.48), implemented with a conjugate-gradient algorithm. Compare numerically the risk of this estimator with the risk of a thresholding estimator over the translation-invariant dyadic wavelet transform of X.

6.14

2 Let θ(t) be a Gaussian of variance 1.

(a)

Prove that the Laplacian of a two-dimensional Gaussian

\psi(x_1, x_2) = \frac{\partial^2 \theta(x_1)}{\partial x_1^2}\, \theta(x_2) + \theta(x_1)\, \frac{\partial^2 \theta(x_2)}{\partial x_2^2}

satisfies the dyadic wavelet condition (5.101) (there is only one wavelet).

(b)

Explain why the zero-crossings of this dyadic wavelet transform provide the locations of multiscale edges in images. Compare the position of these zero-crossings with the wavelet modulus maxima obtained with ψ¹(x₁, x₂) = −θ′(x₁) θ(x₂) and ψ²(x₁, x₂) = −θ(x₁) θ′(x₂).

6.15

2 The covariance of a fractional Brownian motion B_H(t) is given by (6.86). Show that the wavelet transform at a scale s is stationary by verifying that

E\{ W B_H(u_1, s)\, W B_H(u_2, s) \} = -\frac{\sigma^2}{2}\, s^{2H+1} \int_{-\infty}^{+\infty} |t|^{2H}\, \Psi\!\left( \frac{u_1 - u_2}{s} - t \right) dt,

with Ψ(t) = ψ ⋆ ψ̄(t) and ψ̄(t) = ψ(−t).

6.16

2 Let X(t) be a stationary Gaussian process with a covariance R_X(τ) = E{X(t) X(t − τ)} that is twice differentiable. One can prove that the average number of zero-crossings over an interval of size 1 is π⁻¹ (−R_X″(0)/R_X(0))^(1/2) [53]. Let B_H(t) be a fractional Brownian motion and ψ a wavelet that is C². Prove that the average numbers, respectively, of zero-crossings and of modulus maxima of WB_H(u, s) for u ∈ [0, 1] are proportional to s⁻¹. Verify this result numerically.
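A numerical sketch for H = 1/2 (ordinary Brownian motion, simulated by a cumulative sum of Gaussian increments), analyzed with a Mexican-hat wavelet; the scales and sample size below are arbitrary choices. The zero-crossing count over u ∈ [0, 1] roughly doubles each time s is halved.

```python
import numpy as np

rng = np.random.default_rng(1)

# Brownian motion on [0, 1] sampled at N points
N = 2 ** 14
dt = 1.0 / N
B = np.cumsum(rng.standard_normal(N)) * np.sqrt(dt)

def zero_crossings(s):
    t = np.arange(-5 * s, 5 * s, dt)
    psi = (1 - (t / s) ** 2) * np.exp(-0.5 * (t / s) ** 2)   # Mexican hat
    W = np.convolve(B, psi * dt / np.sqrt(s), mode="same")   # W B(u, s)
    return int(np.sum(W[:-1] * W[1:] < 0))

counts = [zero_crossings(s) for s in (0.02, 0.01, 0.005)]
print(counts)  # grows roughly like 1/s
```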

6.17

2 Implement an algorithm that estimates the Lipschitz regularity α and the smoothing scale σ of sharp variation points in one-dimensional signals by applying the result of Theorem 6.7 on the dyadic wavelet transform maxima.


URL:

https://www.sciencedirect.com/science/article/pii/B9780123743701000100

Sparsity in Redundant Dictionaries

Stéphane Mallat, in A Wavelet Tour of Signal Processing (Third Edition), 2009

Approximation of Piecewise Cα Images

Definition 9.1 defines a piecewise C^α image f as a function that is uniformly Lipschitz α everywhere outside a set of edge curves, which are also uniformly Lipschitz α. This image may also be blurred by an unknown convolution kernel. If f is uniformly Lipschitz α without edges, then Theorem 9.16 proves that a linear wavelet approximation has an optimal error decay ε_l(M, f) = ‖f − f_M‖² = O(M^(−α)). Edges produce a larger linear approximation error ε_l(M, f) = O(M^(−1/2)), which is improved by a nonlinear wavelet approximation to ε_n(M, f) = O(M^(−1)), but without recovering the O(M^(−α)) decay. For α = 2, Section 9.3 shows that a piecewise linear approximation over an optimized adaptive triangulation with M triangles reaches the error decay O(M^(−2)). Thresholding curvelet frame coefficients also yields a nonlinear approximation error ε_n(M, f) = O(M^(−2) (log M)³) that is nearly optimal. However, curvelet approximations are not as efficient as wavelets for less regular functions such as bounded variation images. If f is piecewise C^α with α > 2, curvelets cannot improve the M^(−2) decay either.

The beauty of wavelet and curvelet approximation comes from their simplicity. A simple thresholding directly selects the signal approximation support. However, for images with geometric structures of various regularity, these approximations do not remain optimal when the regularity exponent α changes. It does not seem possible to achieve this result without using a redundant dictionary, which requires a more sophisticated approximation scheme.

Elegant adaptive approximation schemes in redundant dictionaries have been developed for images having some geometric regularity. Several algorithms are based on the lifting technique described in Section 7.8, with lifting coefficients that depend on the estimated image regularity [155, 234, 296, 373, 477]. The image can also be segmented adaptively in dyadic squares of various sizes, and approximated on each square by a finite element such as a wedgelet, which is a step edge along a straight line with an orientation that is adjusted [216]. Refinements with polynomial edges have also been studied [436], but these algorithms do not provide M-term approximation errors that decay like O(M^(−α)) for all piecewise regular C^α images.


URL:

https://www.sciencedirect.com/science/article/pii/B9780123743701000161

Approximations in Bases

Stéphane Mallat, in A Wavelet Tour of Signal Processing (Third Edition), 2009

Lipschitz Regularity

A different measure of uniform regularity is provided by Lipschitz exponents, which compute the error of a local polynomial approximation. A function f is uniformly Lipschitz α over [0, 1] if there exists K > 0 such that for any v ∈ [0, 1] one can find a polynomial p_v of degree ⌊α⌋ such that

(9.20) \forall t \in [0, 1], \quad |f(t) - p_v(t)| \le K |t - v|^\alpha .

The infimum of the K that satisfy (9.20) is the homogeneous Hölder α norm ‖f‖_{C̃^α}. The Hölder α norm of f also imposes that f is bounded:

(9.21) \| f \|_{C^\alpha} = \| f \|_{\tilde{C}^\alpha} + \| f \|_\infty .

The space C^α[0, 1] of functions f such that ‖f‖_{C^α} < +∞ is called a Hölder space. Theorem 9.6 characterizes the decay of wavelet coefficients.
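As an illustration (not from the text), one can estimate the quotient in (9.20) on a grid for f(t) = |t − 1/2|^(1/2), for which α = 1/2 and p_v(t) = f(v) since ⌊α⌋ = 0:

```python
import numpy as np

# For f(t) = |t - 1/2|^(1/2) on [0, 1] the quotient in (9.20) stays bounded
# for alpha = 1/2, while for an exponent above 1/2 it blows up near t = 1/2.
t = np.linspace(0.0, 1.0, 801)
f = np.abs(t - 0.5) ** 0.5

def sup_quotient(alpha):
    T, V = np.meshgrid(t, t)
    F, G = np.meshgrid(f, f)
    d = np.abs(T - V)
    mask = d > 0
    return (np.abs(F - G)[mask] / d[mask] ** alpha).max()

print(sup_quotient(0.5))  # bounded: equals 1 for this f
print(sup_quotient(0.8))  # grows as the grid is refined
```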


URL:

https://www.sciencedirect.com/science/article/pii/B9780123743701000136

Special Volume: Mathematical Modeling and Numerical Methods in Finance

Huyên Pham, ... Wolfgang J. Runggaldier, in Handbook of Numerical Analysis, 2009

Lemma B.4

Let H2, H3, and H4 hold. Then, for all k = 0, …, n, the function u_k^α is Lipschitz, uniformly with respect to α, and

[u_k^\alpha]_{\mathrm{Lip}} \le L_k,

where

L_k := \left( \bar{L}_g \bar{f}\, (n-k) + \bar{M} + 3 \bar{L}_g \bar{h} \right) \frac{(2\bar{L}_g)^{n-k}}{2\bar{L}_g - 1},

and f ¯ := max([f]sup, [f]Lip), h ¯ := max([h]sup, [h]Lip), L ¯ g := max(L g , 1), and M ¯ := max([H]Lip, 1).


URL:

https://www.sciencedirect.com/science/article/pii/S1570865908000094

Compression

Stéphane Mallat, in A Wavelet Tour of Signal Processing (Third Edition), 2009

Intra- and Cross-Scale Correlation

The significance maps in Figure 10.10 show that significant coefficients tend to be aggregated along contours or in textured regions. Indeed, wavelet coefficients have a large amplitude where the signal has sharp transitions. At each scale and for each direction, a wavelet image coder can take advantage of the correlation between the amplitudes of neighboring wavelet coefficients, induced by the geometric image regularity. This was not done by the wavelet coder from Section 10.4.1, which encodes each coefficient independently from its neighbors. Taking advantage of this intrascale amplitude correlation is an important source of improvement for JPEG-2000.

Figure 10.10 also shows that wavelet coefficient amplitudes are often correlated across scales. If a wavelet coefficient is large and thus significant, the coarser-scale coefficient located at the same position is also often significant. Indeed, the wavelet coefficient amplitude often increases when the scale increases. If an image f is uniformly Lipschitz α in the neighborhood of (x₀, y₀), then (6.58) proves that for wavelets ψ¹_{j,p,q} located in this neighborhood, there exists A ≥ 0 such that

|\langle f, \psi^1_{j,p,q} \rangle| \le A\, 2^{j(\alpha+1)} .

The worst singularities are often discontinuities, so α ≥ 0. This means that in the neighborhood of singularities without oscillations, the amplitude of wavelet coefficients decreases when the scale 2^j decreases. This property is not always valid, in particular for oscillatory patterns. High-frequency oscillations create coefficients at large scales 2^j that are typically smaller than at the fine scale that matches the period of oscillation.
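A one-dimensional sketch of this growth across scales (the text concerns images, where (6.58) gives the exponent α + 1; in one dimension the analogous bound is |⟨f, ψ_{j,n}⟩| ≤ A 2^(j(α+1/2))): for f(t) = |t − 0.3|, which is uniformly Lipschitz 1, the largest orthonormal Haar detail coefficient grows by a factor 2^(3/2) from each scale to the next coarser one.

```python
import numpy as np

# Haar pyramid of f(t) = |t - 0.3| sampled on 1024 points; at each level the
# largest detail coefficient comes from the linear regions and grows by
# exactly 2^{3/2} per level, matching the exponent alpha + 1/2 with alpha = 1.
N = 1024
x = np.abs(np.arange(N) / N - 0.3)

a = x.copy()
max_detail = []
for level in range(6):
    d = (a[0::2] - a[1::2]) / np.sqrt(2)   # detail coefficients at this scale
    a = (a[0::2] + a[1::2]) / np.sqrt(2)   # approximation at the next scale
    max_detail.append(np.abs(d).max())

ratios = [max_detail[l + 1] / max_detail[l] for l in range(5)]
print(ratios)  # each close to 2^{3/2} ~ 2.83
```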

To take advantage of such correlations across scales, wavelet zero-trees have been introduced by Lewis and Knowles [348]. Shapiro [432] used this zero-tree structure to code the embedded significance maps of wavelet coefficients by relating these coefficients across scales with quad-trees. This was further improved by Said and Pearlman [422] with a set partitioning technique. Yet, for general natural images, the coding improvement obtained by algorithms using cross-scale correlation of wavelet coefficient amplitude seems to be marginal compared to approaches that concentrate on intrascale correlation due to geometric structures. This approach was, therefore, not retained by the JPEG-2000 expert group.


URL:

https://www.sciencedirect.com/science/article/pii/B9780123743701000148

Beyond the Basic Conformal Prediction Framework

Vladimir Vovk, in Conformal Prediction for Reliable Machine Learning, 2014

2.6.2 Positive Results

Proposition 2.6 does not prevent the existence of set predictors that are object conditionally valid in a partial and asymptotic sense and simultaneously asymptotically efficient. We will now discuss them following Lei and Wasserman [201]. The rest of this section is closely related to Section 1.4. Let the object space X be [0, 1]^d, for simplicity; we consider the problem of regression, Y = ℝ. Until the end of this section we fix the data-generating distribution Q on Z; as before, the data are generated from Q^(l+1) (l, however, will not be fixed; in particular, we will be interested in asymptotics as l → ∞). Let us also fix a significance level ε > 0. We will use the same notation Λ for the Lebesgue measure on ℝ and on ℝ^d.

The conditional oracle band C^or is now defined as the set (the assumptions to be made momentarily will ensure that it is essentially unique) C ⊆ Z

with the conditional Q-probability of coverage {(x, y) ∈ Z : y ∈ C_x} given x at least 1 − ε for Q_X-almost all x,

and minimizing Λ(C_x) for Q_X-almost all x,

where C_x stands for the x-cut of C: C_x := {y ∈ Y : (x, y) ∈ C}. We will be interested in set predictors whose prediction for a new object x is asymptotically close to C_x^or.

Lei and Wasserman construct a conditional conformal predictor Γ (independent of ε) that is asymptotically efficient in the following object conditional sense:

(2.23) \sup_{x \in \mathbf{X}} \Lambda\left( \Gamma^\varepsilon(z_1, \ldots, z_l, x) \,\triangle\, C^{\mathrm{or}}_x \right) \to 0

in probability, and even almost surely. They also establish the optimal rate of convergence in (2.23). Notice that (2.23) implies that Γ is object conditionally valid in an asymptotic sense.

For the additional properties of efficiency and validity (on top of what is guaranteed for all conditional conformal predictors) established by Lei and Wasserman for their predictor, the following regularity conditions are sufficient:

1.

The marginal distribution Q X of Q has a differentiable density that is bounded above and bounded away from 0.

2.

The conditional Q -probability distribution Q x of the label y given any object x has a differentiable density q x .

3.

Both q_x and its derivative q_x′ are continuous and bounded uniformly in x.

4.

As a function of x , q x ( y ) is Lipschitz uniformly in y .

5.

For each x ∈ X there exists t_x such that Q_x({y : q_x(y) ≥ t_x}) = 1 − ε.

6.

For some δ > 0, the gradient of q_x is bounded above and bounded away from 0 uniformly in x ∈ X and y ∈ ℝ satisfying |q_x(y) − t_x| < δ.

7.

Finally, inf_{x ∈ X} t_x > 0.

For concreteness, let us set

C^{\mathrm{or}} = \{ (x, y) : q_x(y) \ge t_x \}.

The following is a special case of Lei and Wasserman's result.

Theorem 2.1

[201], Theorem 9

Suppose Assumptions 1–7 hold. There exists a conditional conformal predictor Γ (independent of ε) such that for any λ > 0 there exists B such that, as l → ∞,

(2.24) \mathbb{P}\left\{ \sup_{x \in \mathbf{X}} \Lambda\left( \Gamma^\varepsilon(z_1, \ldots, z_l, x) \,\triangle\, C^{\mathrm{or}}_x \right) \ge B \left( \frac{\log l}{l} \right)^{\frac{1}{d+3}} \right\} = O\left( l^{-\lambda} \right).

Lei and Wasserman also show that the convergence rate in (2.24) is optimal (see [201], Theorem 12).

The proof of Theorem 2.1 given in [201] is constructive: the authors define explicitly a conditional conformal predictor satisfying (2.24). It is determined by the following taxonomy and conditional conformity measure. Let z₁, …, z_n be a sequence of examples z_i = (x_i, y_i); the corresponding categories and conformity scores are defined as follows. Partition X = [0, 1]^d into (1/h_n)^d axis-parallel cubes with sides of length

h_n \asymp \left( \frac{\log n}{n} \right)^{\frac{1}{d+3}}.

Define the category κ i of z i as the cell of the partition containing x i . Let A be a cell of the partition. Define the "conditional" kernel density estimate

\hat{p}(y \mid A) := \frac{1}{n_A h_n} \sum_{i : x_i \in A} K\!\left( \frac{y - y_i}{h_n} \right),

where K is a fixed kernel satisfying the same conditions as in Section 1.4 (cf. (1.12)) and n_A := #{i : x_i ∈ A} (if n_A = 0, define p̂(y | A) arbitrarily). Finally, set the conformity scores to α_i := p̂(y_i | A_i), where A_i is the cell of the partition that contains x_i. This defines Lei and Wasserman's conditional conformal predictor Γ.
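A minimal one-dimensional (d = 1) sketch of this construction. The partition width, Gaussian kernel, bandwidth, synthetic data model, and the use of the ε-quantile of in-cell scores as a simplified threshold are all ad hoc illustrative choices, not taken from [201].

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic regression data on X = [0, 1] (illustrative model)
n = 2000
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)

eps = 0.1                                # significance level epsilon
h = 0.1                                  # cell side = kernel bandwidth
cells = np.floor(x / h).astype(int)      # category kappa_i of each example

def p_hat(y0, ys):
    # Gaussian kernel density estimate of y0 from the labels ys of one cell
    return np.exp(-0.5 * ((y0 - ys) / h) ** 2).sum() / (len(ys) * h * np.sqrt(2 * np.pi))

def prediction_set(x_new, y_grid):
    ys = y[cells == int(x_new / h)]                  # same cell as x_new
    # conformity scores from the cell's training labels alone (simplification)
    scores = np.array([p_hat(yi, ys) for yi in ys])
    t = np.quantile(scores, eps)                     # simplified threshold
    return y_grid[np.array([p_hat(yg, ys) for yg in y_grid]) >= t]

y_grid = np.linspace(-2, 2, 401)
band = prediction_set(0.25, y_grid)
print(band.min(), band.max())  # a band around sin(pi/2) = 1
```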

To make the predictor Γ more computationally efficient (and to facilitate proofs), Lei and Wasserman use again the idea of approximating the density estimate based on the training set augmented by the new object with a postulated label by the density estimate based on the training set alone (cf. (1.13) and (1.14)). See [201], Section 4.2, for details.

It is an interesting direction of further research to construct conformal predictors that are object conditionally valid and efficient in an asymptotic sense in the case of classification. Another direction is to explore the training conditional validity of Lei and Wasserman's object conditional predictors.

Remark 2.3

In Section 1.8 we discussed asymptotically efficient conformal predictors in the case of classification and stated constructing asymptotically efficient conformal predictors in the case of regression as an open problem. Such asymptotically efficient conformal predictors would not be object conditionally valid, even in an approximate or asymptotic sense, unlike the asymptotically efficient conformal predictors of this section. From the point of view of traditional statistics, object conditional validity is a natural ideal goal, even if in its pure form it is not achievable under the randomness assumption (see Proposition 2.6). However, there are cases where this ideal goal has to be abandoned. One example is using (1.21)–(1.22) as the criterion of efficiency (or even using proper criteria of efficiency, in the terminology of [363]). Another example is perhaps even more important. As explained in the next section, in the case of classification we sometimes also want label conditional validity; a typical problem where label conditional validity is desirable is spam detection. We might sometimes want label conditional validity, approximate or asymptotic, in the case of regression as well. However, the ideal goals of object conditional validity and label conditional validity are incompatible, even if we know the data-generating distribution: conditioning on the whole test example does not leave us any probabilities that we could use in predicting the label. At least one of the two goals has to be abandoned, at least partially.


URL:

https://www.sciencedirect.com/science/article/pii/B978012398537800002X

Wavelet Bases

Stéphane Mallat, in A Wavelet Tour of Signal Processing (Third Edition), 2009

7.6.1 Interpolation and Sampling Theorems

Section 3.1.3 explains that a sampling scheme approximates a signal by its orthogonal projection onto a space U s and samples this projection at intervals s. The space U s is constructed so that any function in U s can be recovered by interpolating a uniform sampling at intervals s. We relate the construction of interpolation functions to orthogonal scaling functions and compute the orthogonal projector on U s.

An interpolation function is any φ such that {φ(t − n)}_{n∈ℤ} is a Riesz basis of the space U₁ it generates, and that satisfies

(7.190) \varphi(n) = \begin{cases} 1 & \text{if } n = 0 \\ 0 & \text{if } n \ne 0. \end{cases}

Any fU 1 is recovered by interpolating its samples f(n):

(7.191) f(t) = \sum_{n=-\infty}^{+\infty} f(n)\, \varphi(t - n).

Indeed, we know that f is a linear combination of the basis vectors {φ(t − n)}_{n∈ℤ}, and the interpolation property (7.190) yields (7.191). The Whittaker sampling Theorem 3.2 is based on the interpolation function

\varphi(t) = \frac{\sin \pi t}{\pi t}.

In this case, the space U₁ is the set of functions having a Fourier transform support included in [−π, π].
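A numerical illustration of (7.191) with the Whittaker interpolation function; the test function f(t) = sinc²(t/2), band-limited to [−π, π], is an arbitrary choice.

```python
import numpy as np

# f(t) = sinc(t/2)^2 has a Fourier transform supported in [-pi, pi], so it
# belongs to U_1 and is recovered from its integer samples by (7.191).
# Note np.sinc(x) = sin(pi x)/(pi x).
def f(t):
    return np.sinc(t / 2.0) ** 2

n = np.arange(-1000, 1001)
samples = f(n)

def interpolate(t):
    # truncated interpolation sum f(t) = sum_n f(n) sinc(t - n)
    return np.sum(samples * np.sinc(t - n))

for t0 in (0.3, 1.7, -4.25):
    print(abs(interpolate(t0) - f(t0)))  # only truncation error remains
```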

Scaling an interpolation function yields a new interpolation for a different sampling interval. Let us define φs (t) = φ(t/s) and

\mathbf{U}_s = \left\{ f \in \mathbf{L}^2(\mathbb{R}) \ \text{with} \ f(st) \in \mathbf{U}_1 \right\}.

One can verify that any fU s can be written as

(7.192) f(t) = \sum_{n=-\infty}^{+\infty} f(ns)\, \varphi_s(t - ns).

Scaling Autocorrelation

We denote by φ₀ an orthogonal scaling function, defined by the fact that {φ₀(t − n)}_{n∈ℤ} is an orthonormal basis of a space V₀ of a multiresolution approximation. Theorem 7.2 proves that this scaling function is characterized by a conjugate mirror filter h₀. Theorem 7.20 defines an interpolation function from the autocorrelation of φ₀ [423].

Theorem 7.20.

Let φ̄₀(t) = φ₀(−t) and h̄₀[n] = h₀[−n]. If |φ̂₀(ω)| = O((1 + |ω|)⁻¹), then

(7.193) \varphi(t) = \int_{-\infty}^{+\infty} \varphi_0(u)\, \varphi_0(u - t)\, du = \varphi_0 \star \bar{\varphi}_0(t)

is an interpolation function. Moreover,

(7.194) \varphi\left( \frac{t}{2} \right) = \sum_{n=-\infty}^{+\infty} h[n]\, \varphi(t - n)

with

(7.195) h[n] = \sum_{m=-\infty}^{+\infty} h_0[m]\, h_0[m - n] = h_0 \star \bar{h}_0[n].

Proof.

Observe first that

\varphi(n) = \langle \varphi_0(t), \varphi_0(t - n) \rangle = \delta[n],

which proves the interpolation property (7.190). To prove that {φ(t − n)}_{n∈ℤ} is a Riesz basis of the space U₁ it generates, we verify condition (7.9). The autocorrelation φ(t) = φ₀ ⋆ φ̄₀(t) has a Fourier transform φ̂(ω) = |φ̂₀(ω)|². Thus, condition (7.9) means that there exist B ≥ A > 0 such that

(7.196) \forall \omega \in [-\pi, \pi], \quad A \le \sum_{k=-\infty}^{+\infty} |\hat{\varphi}_0(\omega + 2k\pi)|^4 \le B.

We proved in (7.14) that the orthogonality of a family {φo (tn)} n ∈ℤ is equivalent to

(7.197) \forall \omega \in [-\pi, \pi], \quad \sum_{k=-\infty}^{+\infty} |\hat{\varphi}_0(\omega + 2k\pi)|^2 = 1.

Therefore, the right inequality of (7.196) is valid for B = 1. Let us prove the left inequality. Since |φ̂₀(ω)| = O((1 + |ω|)⁻¹), one can verify that there exists K > 0 such that for all ω ∈ [−π, π], Σ_{|k| > K} |φ̂₀(ω + 2kπ)|² < 1/2, so (7.197) implies that Σ_{k=−K}^{K} |φ̂₀(ω + 2kπ)|² ≥ 1/2. It follows that

\sum_{k=-K}^{K} |\hat{\varphi}_0(\omega + 2k\pi)|^4 \ge \frac{1}{4(2K+1)},

which proves (7.196) for A⁻¹ = 4(2K + 1).

Since φ₀ is a scaling function, (7.23) proves that there exists a conjugate mirror filter h₀ such that

\frac{1}{\sqrt{2}}\, \varphi_0\left( \frac{t}{2} \right) = \sum_{n=-\infty}^{+\infty} h_0[n]\, \varphi_0(t - n).

Computing φ(t) = φ₀ ⋆ φ̄₀(t) yields (7.194) with h[n] = h₀ ⋆ h̄₀[n].

Theorem 7.20 proves that the autocorrelation of an orthogonal scaling function φo is an interpolation function φ that also satisfies a scaling equation. One can design φ to approximate regular signals efficiently by their orthogonal projection in U s. Definition 6.1 measures the regularity of f with a Lipschitz exponent, which depends on the difference between f and its Taylor polynomial expansion. Theorem 7.21 gives a condition for recovering polynomials by interpolating their samples with φ. It derives an upper bound for the error when approximating f by its orthogonal projection in U s.

Theorem 7.21:

(Fix, Strang). Any polynomial q(t) of degree smaller than or equal to p − 1 is decomposed into

(7.198) q(t) = \sum_{n=-\infty}^{+\infty} q(n)\, \varphi(t - n)

if and only if h ^ ( ω ) has a zero of order p at ω = π.

Suppose that this property is satisfied. If f has a compact support and is uniformly Lipschitz α ≤ p, then there exists C > 0 such that

(7.199) \forall s > 0, \quad \| f - P_{\mathbf{U}_s} f \| \le C s^\alpha .

Proof.

The main steps of the proof are given without technical detail. Let us set s = 2^j. One can verify that the spaces {V_j = U_{2^j}}_{j∈ℤ} define a multiresolution approximation of L²(ℝ). The Riesz basis of V₀ required by Definition 7.1 is obtained with θ = φ. This basis is orthogonalized by Theorem 7.1 to obtain an orthogonal basis of scaling functions. Theorem 7.3 derives a wavelet orthonormal basis {ψ_{j,n}}_{(j,n)∈ℤ²} of L²(ℝ).

Using Theorem 7.4, one can verify that ψ has p vanishing moments if and only if ĥ(ω) has p zeros at π. Although φ is not the orthogonal scaling function, the Fix-Strang condition (7.70) remains valid. It is also equivalent that, for k < p,

q_k(t) = \sum_{n=-\infty}^{+\infty} n^k\, \varphi(t - n)

is a polynomial of degree k. The interpolation property (7.191) implies that q_k(n) = n^k for all n ∈ ℤ, so q_k(t) = t^k. Since {t^k}_{0≤k<p} is a basis for polynomials of degree p − 1, any polynomial q(t) of degree p − 1 can be decomposed over {φ(t − n)}_{n∈ℤ} if and only if ĥ(ω) has p zeros at π.

We indicate how to prove (7.199) for s = 2^j. The truncated family of wavelets {ψ_{l,n}}_{l≤j, n∈ℤ} is an orthogonal basis of the orthogonal complement of U_{2^j} = V_j in L²(ℝ). Thus,

\| f - P_{\mathbf{U}_{2^j}} f \|^2 = \sum_{l=-\infty}^{j} \sum_{n=-\infty}^{+\infty} |\langle f, \psi_{l,n} \rangle|^2 .

If f is uniformly Lipschitz α, since ψ has p vanishing moments, Theorem 6.3 proves that there exists A > 0 such that

| Wf(2^l n, 2^l) | = |\langle f, \psi_{l,n} \rangle| \le A\, 2^{(\alpha + 1/2) l} .

To simplify the argument we suppose that ψ has a compact support, although this is not required. Since f also has a compact support, one can verify that the number of nonzero coefficients ⟨f, ψ_{l,n}⟩ at each scale 2^l is bounded by K 2^(−l) for some K > 0. Thus,

\| f - P_{\mathbf{U}_{2^j}} f \|^2 \le \sum_{l=-\infty}^{j} K\, 2^{-l} A^2\, 2^{(2\alpha+1)l} = \frac{K A^2}{1 - 2^{-2\alpha}}\, 2^{2\alpha j},

which proves (7.199) for s = 2 j .

As long as α ≤ p, the larger the Lipschitz exponent α, the faster the error ‖f − P_{U_s} f‖ decays to zero when the sampling interval s decreases. If a signal f is C^k with a compact support, then it is uniformly Lipschitz k, so Theorem 7.21 proves that ‖f − P_{U_s} f‖ ≤ C s^k.

EXAMPLE 7.11

A cubic spline-interpolation function is obtained from the linear spline scaling function φ₀. The Fourier transform expression (7.5) yields

(7.200) \hat{\varphi}(\omega) = |\hat{\varphi}_0(\omega)|^2 = \frac{48 \sin^4(\omega/2)}{\omega^4 \left( 1 + 2 \cos^2(\omega/2) \right)}.

Figure 7.19(a) gives the graph of φ, which has an infinite support but exponential decay. With Theorem 7.21, one can verify that this interpolation function recovers polynomials of degree 3 from a uniform sampling. The performance of spline interpolation functions for generalized sampling theorems is studied in [162, 468].

FIGURE 7.19. (a) Cubic spline–interpolation function. (b) Deslauriers-Dubuc interpolation function of degree 3.

EXAMPLE 7.12

Deslauriers-Dubuc [206] interpolation functions of degree 2p − 1 are compactly supported interpolation functions of minimal size that decompose polynomials of degree 2p − 1. One can verify that such an interpolation function is the autocorrelation of a scaling function φ₀. To reproduce polynomials of degree 2p − 1, Theorem 7.21 proves that ĥ(ω) must have a zero of order 2p at π. Since h[n] = h₀ ⋆ h̄₀[n], it follows that ĥ(ω) = |ĥ₀(ω)|², and thus ĥ₀(ω) has a zero of order p at π. The Daubechies Theorem 7.7 designs minimum-size conjugate mirror filters h₀ that satisfy this condition. Daubechies filters h₀ have 2p nonzero coefficients and the resulting scaling function φ₀ has a support of size 2p − 1. The autocorrelation φ is the Deslauriers-Dubuc interpolation function, whose support is [−2p + 1, 2p − 1].

For p = 1, φ₀ = 1_[0,1] and φ is the piecewise linear tent function with support [−1, 1]. For p = 2, the Deslauriers-Dubuc interpolation function φ is the autocorrelation of the Daubechies 2 scaling function, shown in Figure 7.10. The graph of this interpolation function is in Figure 7.19(b). Polynomials of degree 2p − 1 = 3 are interpolated by this function.

The scaling equation (7.194) implies that any autocorrelation filter verifies h[2n] = 0 for n ≠ 0. For any p > 0, the nonzero values of the resulting filter are calculated from the coefficients of the polynomial (7.168) that is factored to synthesize Daubechies filters. The support of h is [−2p + 1, 2p − 1], and

(7.201) h[2n+1] = \frac{(-1)^{p-n} \prod_{k=0}^{2p-1} (k - p + 1/2)}{(n + 1/2)\, (p - n - 1)!\, (p + n)!} \quad \text{for} \ -p \le n < p .
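As a sanity check of (7.201), the case p = 2 must reproduce the classical four-point Deslauriers-Dubuc coefficients h[±1] = 9/16 and h[±3] = −1/16:

```python
import numpy as np
from math import factorial

# Evaluate (7.201): the odd-indexed filter coefficients of the
# Deslauriers-Dubuc interpolation filter of degree 2p - 1.
def dd_coeff(n, p):
    prod = np.prod([k - p + 0.5 for k in range(2 * p)])
    return ((-1) ** (p - n) * prod
            / ((n + 0.5) * factorial(p - n - 1) * factorial(p + n)))

p = 2
h_odd = {2 * n + 1: dd_coeff(n, p) for n in range(-p, p)}
print(h_odd)  # {-3: -0.0625, -1: 0.5625, 1: 0.5625, 3: -0.0625}
```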

Dual Basis

If f ∉ U_s, then it is approximated by its orthogonal projection P_{U_s} f on U_s before the samples at intervals s are recorded. This orthogonal projection is computed with a biorthogonal basis {φ̃_s(t − ns)}_{n∈ℤ} [82]. Theorem 3.4 proves that φ̃_s(t) = s⁻¹ φ̃(s⁻¹ t), where the Fourier transform of φ̃ is

(7.202) \hat{\tilde{\varphi}}(\omega) = \frac{\hat{\varphi}^*(\omega)}{\sum_{k=-\infty}^{+\infty} |\hat{\varphi}(\omega + 2k\pi)|^2} .

Figure 7.20 gives the graph of the dual cubic spline φ̃ associated to the cubic spline-interpolation function. The orthogonal projection of f over U_s is computed by decomposing f in the biorthogonal bases:

FIGURE 7.20. The dual cubic spline φ ˜ (t) associated to the cubic spline-interpolation function φ(t) shown in Figure 7.19(a).

(7.203) P_{\mathbf{U}_s} f(t) = \sum_{n=-\infty}^{+\infty} \langle f(u), \tilde{\varphi}_s(u - ns) \rangle\, \varphi_s(t - ns).

Let φ̃̄_s(t) = φ̃_s(−t). The interpolation property (7.190) implies that

(7.204) \langle f(u), \tilde{\varphi}_s(u - ns) \rangle = f \star \bar{\tilde{\varphi}}_s(ns).

Therefore, this discretization of f through a projection onto U_s is obtained by filtering with φ̃̄_s(t) followed by a uniform sampling at intervals s. The best linear approximation of f is recovered with the interpolation formula (7.203).


URL:

https://www.sciencedirect.com/science/article/pii/B9780123743701000112