
Saturday 18 May 2024

Interpolation function training

The ideal interpolation kernel is the sinc function, which contains all frequencies up to the Nyquist frequency. However, it would require an infinitely long convolution in discrete computation, so we apply a window function. In signal processing, the interpolation kernel is therefore a windowed-sinc function [https://www.analog.com/media/en/technical-documentation/dsp-book/dsp_book_Ch16.pdf], and the window function can be varied to (1) narrow the window width and thereby reduce computation, and (2) at the same time increase accuracy and reduce artifacts. These two conditions contradict each other, so we have to compromise and optimise between them.
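
For orientation, here is a minimal sketch of assembling a windowed-sinc interpolation kernel on a discrete grid (the names windowed_sinc_kernel, window, half_width and oversample are my own illustrative choices, not from the original text):

import torch

def windowed_sinc_kernel(window, half_width: int, oversample: int = 1) -> torch.Tensor:
    # Sample positions in units of the input sample period.
    t = torch.arange(-half_width * oversample, half_width * oversample + 1,
                     dtype=torch.float32) / oversample
    # Ideal low-pass kernel: normalised sinc, sin(pi t) / (pi t), with sinc(0) = 1.
    ideal = torch.special.sinc(t)
    # Window scaled so that its support [-1, 1] covers [-half_width, half_width].
    return ideal * window(t / half_width)

Here window can be any callable defined on \([-1, 1]\), for instance the window_function module given in the Pseudocode section below.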

    The kernel generally must satisfy \(f(0) = 1\) and \(f(n) = 0\) for every nonzero integer \(n\), and additionally I force it to be a continuous function, to avoid catastrophic artifacts. A generic window function may be written as a Fourier series (cosine-only if symmetric) [https://en.wikipedia.org/wiki/Window_function]:

\[ \displaystyle
w(t)=
\begin{cases}
 \sum_{k=0}^{N}a_k \cos k \pi t  + \sum_{k=1}^{N} b_k \sin k\pi t\quad\text{if  \(t\in[-1,1]\)}
\\
0\quad\text{else}
\end{cases} \]

with absolute constraints

\[\textbf{Constraint 1: }\quad w(0) = \sum_{k=0}^{N} a_k = 1,\]

\[\textbf{Constraint 2: }\quad w(1) =\sum_{k=0}^{N}(-1)^k a_k = 0, \]

assuming the window support to be \([-1, 1]\) (to use it in practice, scale it in the \(t\)-direction).
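
For example, with \(N = 1\), \(a_0 = a_1 = \tfrac12\) and \(b_1 = 0\) (the Hann window), both constraints hold:

\[ w(0) = a_0 + a_1 = 1, \qquad w(1) = a_0 - a_1 = 0. \]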


Projection of the window parameters \(\{a_k, b_k\}\) onto the constraints

    In the \(\{a_k, b_k\}\) vector space, we have two constraints that must absolutely be satisfied. The situation is analogous to projecting a point onto two planes simultaneously, which places it on the line that is the intersection of the two planes; the projection must move perpendicular to both planes. We could design a training process that moves only along the constraints, but because of numerical errors we still need a way to project back onto the constraints. Training the kernel on a large corpus of audio data then corresponds to a loss-optimisation process in which this \((2N+1)\)-dimensional point moves towards a minimum.

    Let us make the projection idea more concrete. The \((2N+1)\)-dimensional point must always satisfy the two constraints above, so the valid set satisfying the constraints has \((2N-1)\) dimensions. Through a given point there is a \(2\)-dimensional plane parallel to the two normal vectors of the two constraint \(2N\)-planes. The intersection of this \(2\)-dimensional plane with the valid \((2N-1)\)-dimensional set is the projected point we want.

    The two normal vectors are simply the coefficient vectors of the plane equations, so the \(2\)-dimensional plane through a point \(\{a'_k, b'_k\}\) is written as

\[ \{a_k, b_k\}  = \{a'_k, b'_k\} + \{1, 0\} x + \{(-1)^k, 0\} y \]

To obtain the intersection of this plane with the two constraints \(w(0)=1\) and \(w(1)=0\), we substitute the \(\{a_k, b_k\}\) expression of the plane into the equation of each constraint.

\[ w(0) = \sum_{k=0}^{N} (a'_k + x + (-1)^k y) = 1 \]

\[ w(1) = \sum_{k=0}^{N} (-1)^k(a'_k + x + (-1)^k y) = 0 \]

From these equations we solve for \(x\) and \(y\), which tell how far the intersection has moved within the \(2\)-dimensional plane from the original point. Simplified,

\[ w(0) =(N+1) x + \frac{1 + (-1)^N}{2}y + \sum_{k=0}^{N} a'_k  = 1 \]

\[ w(1) = \frac{1 + (-1)^N}{2} x + (N+1)y + \sum_{k=0}^{N} (-1)^k a'_k = 0 \]

Written in matrix form,

\[
\begin{pmatrix}
(N+1) & \frac{1 + (-1)^N}{2}
\\ \frac{1 + (-1)^N}{2} & (N+1)
\end{pmatrix}
\begin{pmatrix} x\\y \end{pmatrix}
=
\begin{pmatrix}
1-\sum_{k=0}^{N} a'_k 
\\
-\sum_{k=0}^{N} (-1)^k a'_k 
\end{pmatrix}
\]

Inverting the \(2\times 2\) matrix (note that \(\frac{1+(-1)^N}{2}\in\{0,1\}\), so its square equals itself and the determinant simplifies accordingly),

\[
\Rightarrow
\begin{pmatrix} x\\y \end{pmatrix}
=
\frac{1}{(N+1)^2 - \frac{1 + (-1)^N}{2}}
\begin{pmatrix}
(N+1) & -\frac{1 + (-1)^N}{2}
\\ -\frac{1 + (-1)^N}{2} & (N+1)
\end{pmatrix}
\begin{pmatrix}
1-\sum_{k=0}^{N} a'_k 
\\
-\sum_{k=0}^{N} (-1)^k a'_k 
\end{pmatrix}
\]

\[\Rightarrow
\begin{pmatrix} x+y\\x-y \end{pmatrix}=
\dfrac{
  (N+1) -\frac{1+(-1)^N}{2} \mathrm{diag}\{1, -1\}
}{(N+1)^2 - \frac{1+(-1)^N}{2}}
\begin{pmatrix}
1-\sum_{k=0}^{N} a'_k -\sum_{k=0}^{N} (-1)^k a'_k 
\\
1-\sum_{k=0}^{N} a'_k +\sum_{k=0}^{N} (-1)^k a'_k 
\end{pmatrix}
\]

\[\Rightarrow
\begin{pmatrix} x+y\\x-y \end{pmatrix}=
\dfrac{
  (N+1) -\frac{1+(-1)^N}{2} \mathrm{diag}\{1, -1\}
}{(N+1)^2 - \frac{1+(-1)^N}{2}}
\begin{pmatrix}
1-2\sum_{k=0}^{N} \frac{1 + (-1)^k}{2} a'_k 
\\
1-2\sum_{k=0}^{N} \frac{1 + (-1)^{k+1}}{2} a'_k 
\end{pmatrix}
\]

Thus, the projected point \(\{a_k, b_k\}\) from the original point \(\{a'_k, b'_k\}\) is

\[
\begin{cases}
a_k = a'_k + x + (-1)^k y,\quad k\in[0,N],
\\
b_k = b'_k,\quad k\in[1,N].
\end{cases}
\]

\[
\displaystyle
\Rightarrow
\begin{cases}
a_k = a'_k +
\dfrac{
  (N+1) -\frac{1+(-1)^N}{2} (-1)^k
}{(N+1)^2 - \frac{1+(-1)^N}{2}}
\left(
1 -2\sum_{m=0}^{N}\frac{1+ (-1)^{k+m}}{2} a'_m 
\right)
\\
b_k = b'_k,\quad k\in[1,N].
\end{cases}
\]
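
As a quick check, take \(N = 1\), so that \(\frac{1+(-1)^N}{2} = 0\) and the prefactor reduces to \(\frac{N+1}{(N+1)^2} = \frac12\). The formula then gives

\[ a_0 = a'_0 + \tfrac12\left(1 - 2a'_0\right) = \tfrac12, \qquad a_1 = a'_1 + \tfrac12\left(1 - 2a'_1\right) = \tfrac12, \]

i.e. every starting point is projected onto the Hann window \(a_0 = a_1 = \tfrac12\), which is indeed the only choice of \(\{a_0, a_1\}\) satisfying both constraints when \(N = 1\) (while \(b_1\) remains free).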

    Thus, we may define the projection operator \(P\) such that

\[
\displaystyle
\begin{cases}
P(a_k) := a_k +
\dfrac{
  (N+1) -\frac{1+(-1)^N}{2} (-1)^k
}{(N+1)^2 - \frac{1+(-1)^N}{2}}
\left(
1 -2\sum_{m=0}^{N}\frac{1+ (-1)^{k+m}}{2} a_m 
\right)
\\
P(b_k) := b_k,\quad k\in[1,N].
\end{cases}
\]


Gradient vector projection onto the constraints

Now, as performing the projection at every training iteration is inefficient, we want each update step itself to move only along the constraints.

    From an original valid point \(\{a^0_k, b^0_k\}\) s.t. \(P(a^0_k) = a^0_k\), consider a new training step \(\{a^0_k + \Delta a_k, b^0_k + \Delta b_k\}\), where \(\Delta a_k, \Delta b_k\) are determined only by the gradient of the loss, without regard to the constraints. Then we must apply \(P(a^0_k + \Delta a_k)\) to finalise the step.

\[
\small
\begin{aligned}
P(a^0_k+\Delta a_k) &= 
a^0_k +\Delta a_k+
\dfrac{
  (N+1) -\frac{1+(-1)^N}{2} (-1)^k
}{(N+1)^2 - \frac{1+(-1)^N}{2}}
\left(
1 -2\sum_{m=0}^{N}\frac{1+ (-1)^{k+m}}{2} (a^0_m + \Delta a_m)
\right)
\\ &= P(a^0_k) + \Delta a_k
-2
\dfrac{
  (N+1) -\frac{1+(-1)^N}{2} (-1)^k
}{(N+1)^2 - \frac{1+(-1)^N}{2}}
\sum_{m=0}^{N}\frac{1+ (-1)^{k+m}}{2} \Delta a_m
\end{aligned}
\]

    Thus, we define a new projection operator \(\Delta P\) for the update vector of the coefficient point:

\[
\Delta P(\Delta a_k) :=  \Delta a_k
-2
\dfrac{
  (N+1) -\frac{1+(-1)^N}{2} (-1)^k
}{(N+1)^2 - \frac{1+(-1)^N}{2}}
\sum_{m=0}^{N}\frac{1+ (-1)^{k+m}}{2} \Delta a_m
\]
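
Before turning the formulas into training code, here is a minimal numerical sanity check of \(P\) and \(\Delta P\), written directly from the definitions above (the helper names proj_coeff, parity_sums, proj_P, proj_dP and the choice \(N = 4\) are mine, purely for illustration): a projected point should satisfy both constraints, and a projected update should leave them unchanged.

import torch

def proj_coeff(N: int) -> torch.Tensor:
    # C_k = ((N+1) - B*(-1)^k) / ((N+1)^2 - B), with B = (1 + (-1)^N)/2.
    sign = torch.ones(N + 1)
    sign[1::2] = -1.0                      # (-1)^k
    B = (1 + (-1) ** N) / 2                # 1 if N is even, 0 if N is odd
    return ((N + 1) - B * sign) / ((N + 1) ** 2 - B)

def parity_sums(a: torch.Tensor) -> torch.Tensor:
    # s_k = sum_m (1 + (-1)^(k+m))/2 * a_m, i.e. the sum over m of the same parity as k.
    s = torch.empty_like(a)
    s[0::2] = a[0::2].sum()
    s[1::2] = a[1::2].sum()
    return s

def proj_P(a: torch.Tensor) -> torch.Tensor:
    return a + proj_coeff(a.numel() - 1) * (1 - 2 * parity_sums(a))

def proj_dP(da: torch.Tensor) -> torch.Tensor:
    return da - 2 * proj_coeff(da.numel() - 1) * parity_sums(da)

N = 4
sign = torch.ones(N + 1)
sign[1::2] = -1.0
a = proj_P(torch.randn(N + 1))              # project a random point
da = proj_dP(torch.randn(N + 1))            # project a random update direction
print(float(a.sum()), float((a * sign).sum()))    # constraints 1 and 2: ~1 and ~0
print(float(da.sum()), float((da * sign).sum()))  # both should be ~0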


Pseudocode

\(P\), \(\Delta P\), and the window function \(w(t)\) can be written in PyTorch code:

import torch
import torch.nn as nn
import torch.nn.functional as F
import math

class window_param_projection(nn.Module):

    def __init__(self, N: int):
        super(window_param_projection, self).__init__()
        self.N = N
        # Indicator vectors of the odd and even indices k = 0, ..., N.
        self.odd_kernel = torch.zeros(N + 1, dtype=torch.float32)
        self.even_kernel = torch.zeros(N + 1, dtype=torch.float32)
        self.odd_kernel[1::2] = 1
        self.even_kernel[0::2] = 1
        # C_k = ((N+1) - (1+(-1)^N)/2 * (-1)^k) / ((N+1)^2 - (1+(-1)^N)/2)
        if N % 2 != 0:  # odd N: (1+(-1)^N)/2 = 0, so C_k = 1/(N+1) for every k
            self.proj_coeffs = torch.ones(N + 1) / (N + 1)
        else:           # even N: (1+(-1)^N)/2 = 1, and (-1)^k = even_kernel - odd_kernel
            self.proj_coeffs = ((N + 1) - self.even_kernel + self.odd_kernel) / ((N + 1) ** 2 - 1)

    def forward(self, a: torch.Tensor, is_differential_vector: bool):
        # Sums of a_m over the odd indices and over the even indices.
        sum_odd_a = a.matmul(self.odd_kernel)
        sum_even_a = a.matmul(self.even_kernel)
        # sum_mosaic_a[..., k] = sum_m (1 + (-1)^(k+m))/2 * a_m  (same-parity sum for each k).
        sum_mosaic_a = torch.zeros(a.shape)
        sum_mosaic_a[..., 0::2] = sum_even_a[..., None]
        sum_mosaic_a[..., 1::2] = sum_odd_a[..., None]
        vector = -2 * sum_mosaic_a
        if not is_differential_vector:  # P carries the extra constant 1; Delta P does not
            vector = vector + 1
        delta_a = vector * self.proj_coeffs  # elementwise: delta_a_k = C_k * vector_k
        return a + delta_a


class window_function(nn.Module):

    def __init__(self, N: int, a: torch.Tensor, b: torch.Tensor):
        super(window_function, self).__init__()
        # The term indices k = 0, ..., N.
        self.counting_array = torch.linspace(0, N, N + 1)
        self.a = a  # must be nn.Parameter of shape (N+1,)
        self.b = b  # must be nn.Parameter of shape (N+1,)
        with torch.no_grad():
            self.b[..., 0] = 0  # the k = 0 sine term is identically zero

    def forward(self, t: torch.Tensor):
        # args[..., k] = k * pi * t
        args = t.unsqueeze(-1) * self.counting_array * math.pi
        cos = torch.cos(args)
        sin = torch.sin(args)
        w = cos.matmul(self.a) + sin.matmul(self.b)
        # The window vanishes outside [-1, 1].
        return w * (t.abs() <= 1).to(w.dtype)
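
A minimal usage sketch, under the assumption noted above that a and b are passed in as nn.Parameter tensors; the values here (\(N = 3\), a Hann-like initialisation) are only for illustration:

import torch
import torch.nn as nn

N = 3
# Start from a Hann-like window: a_0 = a_1 = 1/2, the rest zero.
a = nn.Parameter(torch.tensor([0.5, 0.5, 0.0, 0.0]))
b = nn.Parameter(torch.zeros(N + 1))

proj = window_param_projection(N)
win = window_function(N, a, b)

t = torch.linspace(-1, 1, 5)
print(win(t))   # w(0) should be 1 and w(+-1) should be 0

# After an optimiser step has changed a, pull the parameters back onto the constraints:
with torch.no_grad():
    a.copy_(proj(a, is_differential_vector=False))

# Alternatively, project the gradient itself before the optimiser step:
# a.grad = proj(a.grad, is_differential_vector=True)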




