Assignment 2
Question 1: The Instrumental Variable Approach
Step 1) Initialization
First, we define the number of time steps ($N$) and the system coefficients $a_1$ and $a_2$. To introduce noise, we set the standard deviation of the white noise so that its variance is 4.0 or 9.0 (i.e., $\sigma_e = 2$ or $\sigma_e = 3$). Additionally, we initialize the arrays for the clean system output ($y$), the noisy system output ($y_m$), and the noise vector ($e$).
% Simulation parameters
N = 400; % Number of time steps
a1 = 1.5; % System coefficient
a2 = -0.7; % System coefficient
sigma_e = 3; % Standard deviation of white noise (2 for variance 4.0, 3 for variance 9.0)
rng(109); %set random seed
% Initialize arrays
y = zeros(N, 1);
ym = zeros(N, 1);
u = 2*(randi([0, 1], N, 1))-1; % Random binary input signal
e = sigma_e * randn(N, 1); % White noise
v = zeros(N, 1); % Colored-noise placeholder (not used in this question)
Step 2) System Simulation
For the simulation, we first apply the clean filter:
$$ y(k) = 1.5 y(k-1) - 0.7 y(k-2) + u(k-1) $$
Then, we generate the white measurement noise with standard deviation $\sigma_e$ ($\sigma_e = 2$ for variance 4.0, $\sigma_e = 3$ for variance 9.0):
$$ e(k) = \sigma_e \cdot \text{randn} $$
Finally, the actual system output is simulated:
$$ y_m(k) = y(k) + e(k) $$
We start the simulation from $k=3$ due to dependencies.
Matlab Code
% System simulation (start from k=3 due to dependencies)
for k = 3:N
y(k) = 1.5*y(k-1) - 0.7*y(k-2) + u(k-1);
ym(k) = y(k) + e(k); % Measurement signal
end
Step 3) Initial Parameters for IV Iterations
To initialize the IV iterations, we first compute the Least Squares (LS) estimate:
$$ \theta_{LS} = (\varphi^T \varphi)^{-1} \varphi^T Y $$
where $\varphi$ is the design matrix whose rows are the regressors $[y_m(k-1),\ y_m(k-2),\ u(k-1)]$ and $Y = [y_m(3),\ y_m(4),\ \ldots,\ y_m(N)]^T$ is the output vector.
Matlab Code
% Design matrix for noise case
Phi_noise = [ym(2:end-1), ym(1:end-2), u(2:end-1)];
Y_noise = ym(3:end);
% Design matrix for clean case
Phi_clean = [y(2:end-1), y(1:end-2), u(2:end-1)];
Y_clean = y(3:end);
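With the regressors and outputs in place, the LS estimates are obtained with the backslash operator; this is a short sketch of the step that mirrors the theta_LS_noise = Phi_noise \ Y_noise line in the full listing further below:
% LS estimates (the backslash operator solves the least-squares problem)
theta_LS_noise = Phi_noise \ Y_noise;  % biased when ym is noisy
theta_LS_clean = Phi_clean \ Y_clean;  % recovers [1.5; -0.7; 1.0] on the noise-free data
fprintf('LS (noisy): [%f, %f, %f]\n', theta_LS_noise);
fprintf('LS (clean): [%f, %f, %f]\n', theta_LS_clean);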
From the Matlab output, the LS estimates for the clean, noisy (variance $4.0$), and noisy (variance $9.0$) cases are:
Case | Estimated Parameters from LS |
---|---|
Clean Case (True Parameters) | [1.500000, -0.700000, 1.000000] |
Noise Case (4.0 variance) | [2.077900, -1.357300, 0.373200] |
Noise Case (9.0 variance) | [2.949300, -2.518900, -0.316400] |
Step 4) Iterative IV Estimation
After obtaining the initial estimates, the Instrumental Variable (IV) method is used to refine them iteratively. At each iteration, the instrument is generated from the noise-free model driven by the current parameter estimate $\hat{\theta}$ (equal to $\theta_{LS}$ at the first iteration and to the previous $\theta_{\text{IV}}$ afterwards):
$$ X_i(k) = \hat{\theta}(1)\, X_i(k-1) + \hat{\theta}(2)\, X_i(k-2) + \hat{\theta}(3)\, u(k-1) $$
At each iteration, the instrument matrix $Z$ is formed using lagged values of $X_i(k)$ and the input $u(k)$:
$$ Z = [X_i(k-1), X_i(k-2), u(k-1)] $$
The IV estimate is then updated as:
$$ \theta_{\text{IV}} = (Z^T \Phi)^{-1} (Z^T Y) $$
where $\Phi$ is the regressor matrix. The predicted output is computed with the noise-free regressors,
$$ \hat{y} = \Phi_{\text{clean}}\, \theta_{\text{IV}}, $$
and the cost functions that measure the prediction error against the noisy and the clean outputs are
$$ J = \frac{1}{2N} \sum_k \bigl(Y_{\text{noise}}(k) - \hat{y}(k)\bigr)^2, \qquad J_{\text{clean}} = \frac{1}{2N} \sum_k \bigl(Y_{\text{clean}}(k) - \hat{y}(k)\bigr)^2. $$
This iterative process improves parameter estimation by reducing the influence of noise.
Matlab Code
% Simulation parameters
N = 400;
a1 = 1.5;
a2 = -0.7;
sigma_e = 3; % Standard deviation of white noise (2 for variance 4.0, 3 for variance 9.0)
rng(109); %set random seed
% Initialize arrays
y = zeros(N, 1);
ym = zeros(N, 1);
u = 2*(randi([0, 1], N, 1))-1; % Random binary input signal
e = sigma_e * randn(N, 1); % White noise
v = zeros(N, 1); % Colored-noise placeholder (not used in this question)
% System simulation (start from k=3 due to dependencies)
for k = 3:N
y(k) = 1.5*y(k-1) - 0.7*y(k-2) + u(k-1);
ym(k) = y(k) + e(k); % Measurement signal
end
% Design matrix for noise case
Phi_noise = [ym(2:end-1), ym(1:end-2), u(2:end-1)];
Y_noise = ym(3:end);
% Design matrix for clean case
Phi_clean = [y(2:end-1), y(1:end-2), u(2:end-1)];
Y_clean = y(3:end);
% Initialize storage for the estimated parameters and cost values
theta_LS_noise = Phi_noise \ Y_noise; % LS estimate used to start the IV iterations
theta_IV_noise = zeros(3,5);          % one column per IV iteration
cost = zeros(1,5);
J_clean = zeros(1,5);
%% Plot the simulated signals
figure;
tiledlayout(2,1, 'TileSpacing', 'Compact', 'Padding', 'Compact');
t = 3:N;
% Noise-free output
nexttile;
plot(t, y(t), 'LineWidth', 1.5);
title('Noise-Free System Output y');
xlabel('Time Steps');
ylabel('y');
grid on;
% With noise
nexttile;
plot(t, ym(t), 'LineWidth', 1.5);
title('Noisy Measurement Signal ym');
xlabel('Time Steps');
ylabel('ym');
grid on;
% Create a tiled layout for the per-iteration plots
figure;
tiledlayout(5,1, 'TileSpacing', 'Compact', 'Padding', 'Compact');
% Run IV iterations (5 iterations, as reported in the results tables)
for iter = 1:5
if iter == 1
a1 = theta_LS_noise(1);
a2 = theta_LS_noise(2);
b1 = theta_LS_noise(3);
else
a1 = theta_IV_noise(1,iter-1);
a2 = theta_IV_noise(2,iter-1);
b1 = theta_IV_noise(3,iter-1);
end
Xi = zeros(N, 1); % Initialize instrumental variable
for k = 3:N
Xi(k) = a1 * Xi(k-1) + a2 * Xi(k-2) + b1 * u(k-1);
end
% Construct instrumental variable matrix
Z = [Xi(2:end-1), Xi(1:end-2), u(2:end-1)];
% IV estimation
theta_IV_noise(:,iter) = (Z' * Phi_noise) \ (Z' * Y_noise);
% Predict output
y_pred = Phi_clean * theta_IV_noise(:,iter);
% Compute cost function
cost(iter) = (1/(2*N)) * sum((Y_noise - y_pred).^2);
J_clean(iter) = (1/(2*N)) * sum((Y_clean - y_pred).^2);
% Display results
fprintf('Iteration %d: Parameters: [%f, %f, %f], Cost: %f, J_clean: %f\n', ...
iter, theta_IV_noise(:,iter), cost(iter),J_clean(iter));
nexttile;
plot(t, y(t), 'k-', 'LineWidth', 0.1);
hold on;
plot(t, y_pred, 'k--', 'LineWidth', 0.1);
title(sprintf('Iteration %d', iter));
xlabel('Time Steps');
ylabel('Output');
legend('Noise-free y', 'IV prediction', 'Location', 'best');
grid on;
end
Plots (noise variance = 4.0)
Plots (noise variance = 9.0)
Results
- True parameters: $[1.5,\ -0.7,\ 1.0]$
Estimated Parameters for Each Iteration (Noise Case: 4.0 Variance)
Iteration | $a_1$ | $a_2$ | $b_1$ | Cost $J$ | $J_{\text{clean}}$ |
---|---|---|---|---|---|
1 | 2.0779 | -1.3573 | 0.3732 | 2.2825 | 0.6100 |
2 | 1.4199 | -0.6144 | 0.5417 | 1.8268 | 0.1108 |
3 | 1.7092 | -0.9068 | 0.4714 | 1.8785 | 0.1848 |
4 | 1.5979 | -0.7941 | 0.4985 | 1.8377 | 0.1354 |
5 | 1.6385 | -0.8328 | 0.4889 | 1.8490 | 0.1501 |
Estimated Parameters for Each Iteration (Noise Case: 9.0 Variance)
Iteration | $a_1$ | $a_2$ | $b_1$ | Cost $J$ | $J_{\text{clean}}$ |
---|---|---|---|---|---|
1 | 2.9493 | -2.5189 | -0.3164 | 7.7867 | 4.1077 |
2 | 0.9185 | -0.1524 | 0.4939 | 4.3865 | 0.4472 |
3 | 2.3535 | -1.6182 | -0.0457 | 5.0955 | 1.3665 |
4 | 1.3222 | -0.4550 | 0.3596 | 4.1309 | 0.2648 |
5 | 1.9760 | -1.1439 | 0.1104 | 4.3928 | 0.6198 |
Conclusion
Instrumental Variable (IV) estimation effectively reduces noise influence, but its performance depends on noise variance. Under moderate noise (4.0 variance), parameters converge smoothly toward the true values, with minor deviations. However, in high noise conditions (9.0 variance), convergence is slower, more erratic, and introduces bias, especially in $b_1$. The cost function decreases in both cases, but higher noise results in a consistently higher cost, indicating less accurate parameter estimates. Overall, IV estimation performs well under controlled noise but struggles with excessive noise.
Question 2) Steepest Descent, LMS and RLS
1. Problem Setup
The given system is represented in the Z-domain as:
$$ \frac{d(z)}{x(z)} = \frac{0.2}{z - 0.8}. $$
Rearranging:
$$ zd(z) - 0.8d(z) = 0.2x(z). $$
Converting to the time domain:
$$ d(n + 1) = 0.2x(n) + 0.8d(n). $$
Assume that the input signal $x(n)$ is white noise with variance $\text{var}[x(n)] = 1$.
The parameter vector to estimate is:
$$ \mathbf{H} = \begin{bmatrix} a \\ b \end{bmatrix}. $$
2. Wiener--Hopf Solution
To find the optimal parameters, we use the Wiener--Hopf equation:
$$ \mathbf{H}_{\text{optimal}} = \mathbf{R}^{-1} \mathbf{p}. $$
where:
- Cross-correlation vector:
$$ \mathbf{p} = \mathbb{E}[\, d(n+1)\, X(n) \,], $$
where $d(n+1)$ is the desired output and $X(n)$ is the input vector.
- Autocorrelation matrix:
$$ \mathbf{R} = \mathbb{E} [ X(n) X(n)^T ]. $$
For this system:
- The input signals at time $n$ are $x(n)$ and $d(n)$, so the input vector is:
$$ X(n) = \begin{bmatrix} d(n) \\ x(n) \end{bmatrix}. $$
The cross-correlation vector is:
$$ \mathbf{p} = \mathbb{E} \begin{bmatrix} d(n+1) d(n) \\ d(n+1) x(n) \end{bmatrix}. $$
The autocorrelation matrix is:
$$ \mathbf{R} = \begin{bmatrix} \mathbb{E}[ d^2(n) ] & \mathbb{E}[ d(n)x(n) ] \\[4pt] \mathbb{E}[ x(n)d(n) ] & \mathbb{E}[ x^2(n) ] \end{bmatrix}. $$
With these expressions, we can simulate the system and precompute $\mathbf{p}$ and $\mathbf{R}$ for further calculations.
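Before the analytical derivation, a quick numerical sanity check is possible: $\mathbf{R}$ and $\mathbf{p}$ can be estimated by sample averages over a long simulation and compared with the closed-form values derived below (a minimal sketch; the variable names Ns, xs, ds are introduced here only for this check):
% Empirical estimate of R and p from a long simulation (sanity check)
Ns = 1e5;                              % number of samples for the check
xs = randn(Ns,1);                      % white-noise input, variance 1
ds = zeros(Ns,1);
for k = 1:Ns-1
    ds(k+1) = 0.8*ds(k) + 0.2*xs(k);   % true system
end
Xs    = [ds(1:Ns-1), xs(1:Ns-1)];      % rows are X(n)' = [d(n), x(n)]
ds_p1 = ds(2:Ns);                      % "desired" output d(n+1)
R_hat = (Xs.' * Xs) / (Ns-1)           % should approach [1/9 0; 0 1]
p_hat = (Xs.' * ds_p1) / (Ns-1)        % should approach [4/45; 0.2] ~ [0.0889; 0.2]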
Below is a straightforward way to carry out the Wiener--Hopf solution in closed form for the system
$$ d(n+1) \;=\; 0.8\,d(n)\;+\;0.2\,x(n), $$
with $x(n)$ white‐noise input of zero mean and variance $\sigma_x^2=1$. We seek the parameter vector
$$ \mathbf{H} = \begin{bmatrix} a\\[3pt] b \end{bmatrix} $$
that minimizes the mean‐square error between the "desired" $d(n+1)$ and the model output $\hat{d}(n+1) = a\,d(n)+b\,x(n)$.
3. Autocorrelation Matrix $\mathbf{R}$
We define the input vector
$$ X(n) \;=\; \begin{bmatrix} d(n)\\[4pt] x(n) \end{bmatrix}. $$
Then the autocorrelation matrix is
$$ \mathbf{R} \;=\; \mathbb{E}\bigl[\,X(n)\,X(n)^\mathsf{T}\bigr] \;=\; \begin{bmatrix} \mathbb{E}[\,d(n)^2\,] & \mathbb{E}[\,d(n)\,x(n)\,]\\[3pt] \mathbb{E}[\,x(n)\,d(n)\,] & \mathbb{E}[\,x(n)^2\,] \end{bmatrix}. $$
Compute $\mathbb{E}[x(n)^2]$.
Since $x(n)$ is white noise with variance 1,$$ \mathbb{E}[\,x(n)^2\,] \;=\; 1. $$
Compute $\mathbb{E}[\,d(n)\,x(n)\,]$.
Because $d(n)$ depends on $x(n-1), x(n-2), \ldots$, and $x(n)$ is uncorrelated with past values of itself, it follows that $d(n)$ is uncorrelated with $x(n)$. Thus,$$ \mathbb{E}[\,d(n)\,x(n)\,] \;=\; 0. $$
Compute $\mathbb{E}[\,d(n)^2\,]$.
Denote $\sigma_d^2 = \mathbb{E}[\,d(n)^2\,]$. From the difference equation,$$ d(n+1) \;=\; 0.8 \, d(n) \;+\; 0.2 \, x(n). $$
In steady state, using the fact that $d(n)$ is uncorrelated with $x(n)$ and
$\sigma_x^2=1$,
$$ \sigma_d^2 \;=\; \mathbb{E}[\,d(n)^2\,] \;=\; 0.8^2\,\sigma_d^2 \;+\; 0.2^2\,\underbrace{\mathbb{E}[\,x(n)^2\,]}_{1}. $$
Hence
$$ \sigma_d^2 \;=\; 0.64\,\sigma_d^2 \;+\; 0.04, \quad\Longrightarrow\quad 0.36\,\sigma_d^2 \;=\; 0.04 \quad\Longrightarrow\quad \sigma_d^2 \;=\; \frac{0.04}{0.36} \;=\; \tfrac{1}{9}. $$
Therefore,
$$ \mathbb{E}[\,d(n)^2\,] \;=\; \frac{1}{9}. $$
Putting it all together,
$$ \mathbf{R} \;=\; \begin{bmatrix} \frac{1}{9} & 0\\[4pt] 0 & 1 \end{bmatrix}. $$
4. Cross-Correlation Vector $\mathbf{p}$
We need
$$ \mathbf{p} \;=\; \mathbb{E}\bigl[\, d(n+1)\,X(n) \bigr] \;=\; \begin{bmatrix} \mathbb{E}[\,d(n+1)\,d(n)\,]\\[4pt] \mathbb{E}[\,d(n+1)\,x(n)\,] \end{bmatrix}. $$
Use $d(n+1) = 0.8\,d(n) + 0.2\,x(n)$:
First element $\mathbb{E}[\,d(n+1)\,d(n)\,]$:
$$ \mathbb{E}[\,d(n+1)\,d(n)\,] \;=\; \mathbb{E}\bigl[\,(\,0.8\,d(n) + 0.2\,x(n)\,)\,d(n)\bigr] \;=\; 0.8\,\mathbb{E}[\,d(n)^2\,] \;+\; 0.2\,\mathbb{E}[\,x(n)\,d(n)\,]. $$
Since $\mathbb{E}[\,x(n)\,d(n)\,] = 0$, we get
$$ \mathbb{E}[\,d(n+1)\,d(n)\,] \;=\; 0.8 \times \frac{1}{9} \;=\; \frac{0.8}{9} \;=\; \frac{4}{45}. $$
Second element $\mathbb{E}[\,d(n+1)\,x(n)\,]$:
$$ \mathbb{E}[\,d(n+1)\,x(n)\,] \;=\; \mathbb{E}\bigl[\,(0.8\,d(n) + 0.2\,x(n))\,x(n)\bigr] \;=\; 0.8\,\underbrace{\mathbb{E}[\,d(n)\,x(n)\,]}_{0} \;+\; 0.2\,\mathbb{E}[\,x(n)^2\,] \;=\; 0.2 \times 1 \;=\; 0.2. $$
Hence
$$ \mathbf{p} \;=\; \begin{bmatrix} \frac{4}{45}\\[3pt] 0.2 \end{bmatrix}. $$
5. Wiener--Hopf Solution
The optimal parameter vector $\mathbf{H}_{\text{optimal}}$ is given by
$$ \mathbf{H}_{\text{optimal}} \;=\; \mathbf{R}^{-1}\,\mathbf{p}. $$
$$ \mathbf{R} \;=\; \begin{bmatrix} \frac{1}{9} & 0\\[3pt] 0 & 1 \end{bmatrix} \quad\Longrightarrow\quad \mathbf{R}^{-1} \;=\; \begin{bmatrix} 9 & 0\\[3pt] 0 & 1 \end{bmatrix}. $$
Therefore,
$$ \mathbf{H}_{\text{optimal}} \;=\; \begin{bmatrix} 9 & 0\\[3pt] 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{4}{45}\\[3pt] 0.2 \end{bmatrix} \;=\; \begin{bmatrix} 9 \times \frac{4}{45}\\[3pt] 1 \times 0.2 \end{bmatrix} \;=\; \begin{bmatrix} \frac{36}{45}\\[3pt] 0.2 \end{bmatrix} \;=\; \begin{bmatrix} 0.8\\[3pt] 0.2 \end{bmatrix}. $$
Hence the Wiener--Hopf solution for this system is
$$ \boxed{ a \;=\; 0.8, \quad b \;=\; 0.2. } $$
6. Steepest Descent Algorithm
Matlab Code
clear; clc; close all;
%% Generate data
N = 10000; % total samples
x = randn(N,1); % white noise input, var=1
d = zeros(N,1); % desired signal initialization
d(1) = 0; % initial condition
% True system: d(n+1) = 0.8*d(n) + 0.2*x(n)
for k = 1:N-1
d(k+1) = 0.8*d(k) + 0.2*x(k);
end
R = [1/9 0;
     0   1];       % autocorrelation matrix
p = [0.8/9; 0.2];  % cross-correlation vector
d2 = 1/9;          % sigma_d^2 = E[d(n)^2]
H_opt = R \ p;     % Wiener solution -> [0.8 ; 0.2]
%% Initialize Steepest Descent
alpha = 0.01; % step size
H_hat = [0; 0]; % initial guess [a; b]
a_history = zeros(1,N);
b_history = zeros(1,N);
e_array = zeros(1,N); % store instantaneous error
% Store initial values
a_history(1) = H_hat(1);
b_history(1) = H_hat(2);
%% Steepest Descent Loop
for n = 2:N
e_array(n) = d(n) - H_hat(1)*d(n-1) - H_hat(2)*x(n-1);
gradient = -2*p + 2*R*H_hat;
H_hat = H_hat - alpha * gradient;
a_history(n) = H_hat(1);
b_history(n) = H_hat(2);
end
fprintf('Final estimated a = %.4f\n', a_history(end));
fprintf('Final estimated b = %.4f\n', b_history(end));
%% Contour Plot of J(a,b) + Parameter Path
na = 60; % grid size for a-axis
nb = 60; % grid size for b-axis
a_range = linspace(-0.1,1.2,na);
b_range = linspace(-0.1,0.4, nb);
J_grid = zeros(na, nb);
% Compute cost function grid
for i = 1:na
for j = 1:nb
A = [a_range(i); b_range(j)];
J_grid(i,j) = d2 - 2*(A.'*p) + (A.'*R*A);
end
end
figure;
contour(a_range, b_range, J_grid.', 25); hold on; grid on; colorbar;
xlabel('a'); ylabel('b');
title('Contour of J(a,b) + Steepest Descent Trajectory');
% Plot the parameter trajectory
plot(a_history, b_history, 'r-','LineWidth',1.2,'MarkerSize',2);
% Mark final estimate
plot(a_history(end), b_history(end), 'ks','MarkerSize',6,'LineWidth',1.5);
% Mark true Wiener solution
plot(H_opt(1), H_opt(2), 'kx','MarkerSize',8,'LineWidth',2);
legend('Cost contours','Path','Final estimate','True solution','Location','best');
hold off;
%% plot a&b est figure
figure;
tiledlayout(2,1, 'TileSpacing', 'Compact', 'Padding', 'Compact');
% a_est plot
nexttile;
plot(a_history,'-','MarkerSize',2); hold on;
yline(H_opt(1),'r--','LineWidth',1.5);
xlabel('Iteration'); ylabel('a_{est}');
legend('a_{est}','true a=0.8','Location','best');
title('a_{est} vs. Iteration (Full)');
grid on;
% b_est plot
nexttile;
plot(b_history,'-','MarkerSize',2); hold on;
yline(H_opt(2),'r--','LineWidth',1.5);
xlabel('Iteration'); ylabel('b_{est}');
legend('b_{est}','true b=0.2','Location','best');
title('b_{est} vs. Iteration (Full)');
grid on;
7. LMS Algorithm
The parameters are updated using the Least Mean Squares (LMS) rule:
$$ e(n) = d(n) - [a_{\text{est}} d(n-1) + b_{\text{est}} x(n-1)] $$
$$ H_{\text{hat}} = H_{\text{hat}} + \mu \cdot e(n) \cdot \begin{bmatrix} d(n-1) \\ x(n-1) \end{bmatrix} $$
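The loop below is an excerpt and reuses $N$, $d(n)$, and $x(n)$ generated in the steepest-descent script. A minimal initialization sketch; the step size $\mu$ is not shown in the excerpt, so the value here is an assumption:
% Minimal initialization for the LMS excerpt (mu value assumed)
mu = 0.01;              % LMS step size (assumed; keep well below 2/lambda_max(R) = 2)
H_hat = [0; 0];         % initial estimate [a_est; b_est]
a_history = zeros(1,N); % parameter trajectories
b_history = zeros(1,N);
e_array = zeros(1,N);   % instantaneous error
a_history(1) = H_hat(1);
b_history(1) = H_hat(2);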
for n = 2:N
% Input vector at time n (one-step-behind signals)
Xn = [ d(n-1); x(n-1) ];
% Predicted output
y_hat = H_hat.' * Xn; % = a_est*d(n-1) + b_est*x(n-1)
% Error
e_array(n) = d(n) - y_hat;
% LMS parameter update
H_hat = H_hat + mu * e_array(n) * Xn;
% Store
a_history(n) = H_hat(1);
b_history(n) = H_hat(2);
end
Plots
We will plot the following:
- Parameter trajectory: $(a_{\text{est}}, b_{\text{est}})$ over the iterations.
- Cost surface contour: the contour of $J(a,b)$ derived from the Wiener formula, which allows visualization of how LMS moves within the cost surface (see the sketch below).
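A minimal plotting sketch, assuming the contour grid (a_range, b_range, J_grid) and H_opt from the steepest-descent section are still in the workspace and that a_history and b_history now hold the LMS trajectory:
% Overlay the LMS parameter path on the Wiener cost contours
figure;
contour(a_range, b_range, J_grid.', 25); hold on; grid on; colorbar;
plot(a_history, b_history, 'r-', 'LineWidth', 1.2);                % LMS trajectory
plot(a_history(end), b_history(end), 'ks', 'MarkerSize', 6);       % final estimate
plot(H_opt(1), H_opt(2), 'kx', 'MarkerSize', 8, 'LineWidth', 2);   % Wiener solution
xlabel('a'); ylabel('b');
title('Contour of J(a,b) + LMS Trajectory');
legend('Cost contours', 'LMS path', 'Final estimate', 'Wiener solution', 'Location', 'best');
hold off;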
8. Standard Recursive Least Squares Algorithm ($\lambda = 1.0$)
We use the classic RLS recursion:
Input Vector:
$$ X(n) = \begin{bmatrix} d(n-1) \\ x(n-1) \end{bmatrix} $$
Predicted Output:
$$ \hat{y}(n) = H_{\text{hat}}(n-1)^T X(n) $$
Error Calculation:
$$ e(n) = d(n) - \hat{y}(n) $$
Gain Vector:
$$ K(n) = \frac{P(n-1) X(n)}{\lambda + X(n)^T P(n-1) X(n)} $$
Parameter Update:
$$ H_{\text{hat}}(n) = H_{\text{hat}}(n-1) + K(n) e(n) $$
Covariance Matrix Update:
$$ P(n) = \frac{P(n-1) - K(n) X(n)^T P(n-1)}{\lambda} $$
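The RLS loop below is also an excerpt (it reuses $N$, $d(n)$, and $x(n)$ from the earlier script). A minimal initialization sketch; the initial covariance scaling $P(0) = \delta I$ is an assumption, since it is not shown in the excerpt:
%% Minimal initialization for the RLS excerpt (P(0) scaling assumed)
lambda = 1.0;           % forgetting factor (standard RLS)
H_hat = [0; 0];         % initial estimate [a_est; b_est]
P = 1000 * eye(2);      % initial covariance P(0) = delta*I (delta = 1000 assumed)
a_history = zeros(1,N);
b_history = zeros(1,N);
e_array = zeros(1,N);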
%% 4. RLS LOOP
for n = 2:N
% 1) Current regressor: X(n) = [ d(n-1); x(n-1) ]
Xn = [ d(n-1); x(n-1) ];
% 2) Predicted output
y_hat = H_hat.' * Xn;
% 3) Error
e_array(n) = d(n) - y_hat;
% 4) Gain vector
denom = lambda + Xn.' * P * Xn; % scalar
K = (P * Xn) / denom;
% 5) Parameter update
H_hat = H_hat + K * e_array(n);
% 6) Covariance update
P = (P - K * (Xn.' * P)) / lambda;
% Store results
a_history(n) = H_hat(1);
b_history(n) = H_hat(2);
end
Plots
We visualize:
- The parameter trajectory $(a_{\text{est}}, b_{\text{est}})$.
- The cost surface contour $J(a, b)$ from the Wiener solution to compare the behavior of the RLS algorithm.
9. Comments on the Differences
Steepest Descent
- Performance: Steepest descent adjusts the parameters along the negative gradient of the mean-square error, i.e., in the direction in which the error decreases fastest. With the exact $\mathbf{R}$ and $\mathbf{p}$ available, it converges reliably to the Wiener solution and is efficient for problems of moderate size.
- Complexity: Each update requires matrix-vector products with $\mathbf{R}$. For small parameter dimensions this is cheap, but for high-dimensional data the cost of forming, storing, and multiplying by $\mathbf{R}$ grows quickly and can become the bottleneck.
- Implementation issues: The method needs $\mathbf{p}$ and $\mathbf{R}$ (or estimates of them) in advance, which is feasible when the signal is stationary. If the signal statistics change over time or the system is time-varying, the precomputed quantities become outdated and the method cannot track the changes.
LMS (Least Mean Squares)
- Performance: LMS is a stochastic-gradient method that replaces the exact gradient with an instantaneous estimate based on the current error, which is why its update is often described as a "noisy" gradient. This makes it well suited to situations where the statistics cannot be stored in advance, such as nonstationary signals or streaming data.
- Complexity: LMS is computationally very light: each update multiplies the scalar error by the regressor and a fixed step size, with no matrix operations. This makes it efficient and suitable for real-time processing.
- Implementation issues: The main drawback is sensitivity to the step size. If the step size is too small, convergence is slow; if it is too large, the parameters oscillate or diverge (a reference bound is given below).
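For reference, the standard mean-convergence bound on the LMS step size (stated under the usual independence assumptions), evaluated for this problem:
$$ 0 < \mu < \frac{2}{\lambda_{\max}(\mathbf{R})}, \qquad \lambda_{\max}(\mathbf{R}) = \max\!\left(\tfrac{1}{9},\,1\right) = 1 \quad\Longrightarrow\quad 0 < \mu < 2, $$
with much smaller values (e.g., $\mu \approx 0.01$) used in practice to keep the steady-state misadjustment small.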
RLS (Recursive Least Squares)
- Performance: RLS converges much faster than LMS and steepest descent because it uses a recursively updated estimate of the inverse autocorrelation matrix to scale the correction, adapting the parameters more precisely at each step.
- Complexity: The drawback is higher computational cost: each update involves matrix-vector operations on the covariance matrix $P$ (on the order of $M^2$ operations per step for $M$ parameters), which becomes expensive as the data dimension grows.
- Implementation issues: Because of this cost, RLS is harder to implement on resource-limited hardware. It is also more demanding numerically: round-off errors in the covariance update can make $P$ lose symmetry or positive definiteness (covariance "blow-up"), which degrades the algorithm's performance.
Question 3
a)
Matlab Code
% a) Recursive Least Squares (RLS) for system identification
clear; clc; close all;
rng(109);
N = 400;
sigma_e = 2; % Standard deviation of noise
lambda_values = [1.0, 0.95]; % Forgetting factors
theta_true = [1.5; -0.7; 1.0];
y = zeros(N,1);
u = 2 * (rand(N,1) > 0.5) - 1; % Random binary input signal (±1)
% Simulate the system
for k = 3:N
e = sigma_e * randn;
y(k) = 1.5*y(k-1) - 0.7*y(k-2) + u(k-1) + e;
end
% Function to perform RLS estimation
for lambda = lambda_values
theta_est = zeros(3, N);
P = 1000 * eye(3);
trace_P = zeros(N,1);
for k = 3:N
phi = [y(k-1); y(k-2); u(k-1)]; % Regression vector
K = P * phi / (lambda + phi' * P * phi); % Gain
error = y(k) - phi' * theta_est(:, k-1); % Prediction error
theta_est(:, k) = theta_est(:, k-1) + K * error;
P = (P - (K * phi') * P) / lambda; % Update covariance
trace_P(k) = trace(P); % Store trace
end
figure;
subplot(2,1,1);
plot(theta_est(1,:), 'r', 'LineWidth', 1.5); hold on;
plot(theta_est(2,:), 'g', 'LineWidth', 1.5);
plot(theta_est(3,:), 'b', 'LineWidth', 1.5);
yline(1.5, 'r--', 'True \theta_1');
yline(-0.7, 'g--', 'True \theta_2');
yline(1.0, 'b--', 'True \theta_3');
xlabel('Time Step');
ylabel('\theta Estimates');
legend('\theta_1', '\theta_2', '\theta_3', 'Location', 'best');
title(['Parameter Estimates (\lambda = ', num2str(lambda), ')']);
grid on;
subplot(2,1,2);
plot(trace_P, 'k', 'LineWidth', 1.5);
xlabel('Time Step');
ylabel('Trace(P)');
title(['Trace of Covariance Matrix (\lambda = ', num2str(lambda), ')']);
grid on;
end
Is there a difference?
Yes. Comparing the results for $\lambda = 1.0$ and $\lambda = 0.95$ reveals significant differences in parameter stability and in the behaviour of the covariance matrix. With $\lambda = 1.0$, the parameter estimates stabilize quickly and closely track the true values with little fluctuation, and the trace of the covariance matrix converges smoothly, indicating a stable and consistent estimation process. With $\lambda = 0.95$, the estimates fluctuate more because the algorithm weights recent data more heavily and is therefore more sensitive to new observations; the trace of the covariance matrix also fluctuates more, reflecting the increased uncertainty in the estimation process.
b)
%% MATLAB Code for Part (b): RLS with Changing Input Signal
clear all; close all; clc;
rng(109);
N = 400;
sigma_e = 2;
y = zeros(N,1);
u = zeros(N,1);
%% Generate the Input Signal u(k)
% For k = 1 to 99, u(k) is random binary: +1 or -1
u(1:99) = 2*(rand(99,1) > 0.5) - 1;
% For k = 100 to 300, the input is held constant at u(k) = 0
u(100:300) = 0;
% For k = 301 to 400, u(k) returns to random binary: +1 or -1
u(301:400) = 2*(rand(100,1) > 0.5) - 1;
%% Simulate the System
y(1:2) = 0;
for k = 3:N
e = sigma_e * randn;
y(k) = 1.5*y(k-1) - 0.7*y(k-2) + u(k-1) + e;
end
%% RLS Algorithm Setup
lambda = 0.95;
n_params = 3;
theta_est = zeros(n_params, N);
P = 1000 * eye(n_params);
traceP = zeros(N,1);
%% Run the RLS Algorithm
for k = 3:N
phi = [y(k-1); y(k-2); u(k-1)];
K = P * phi / (lambda + phi' * P * phi);
error = y(k) - phi' * theta_est(:, k-1);
theta_est(:, k) = theta_est(:, k-1) + K * error;
P = (P - K * phi' * P) / lambda;
traceP(k) = trace(P);
end
time = 1:N; % Time vector
figure;
subplot(2,1,1);
plot(time, theta_est(1,:), 'r-', 'LineWidth', 2); hold on;
plot(time, theta_est(2,:), 'b-', 'LineWidth', 2);
plot(time, theta_est(3,:), 'g-', 'LineWidth', 2);
xlabel('Time Step k');
ylabel('Parameter Estimates');
legend('\theta_1','\theta_2','\theta_3','Location','Best');
title('RLS Parameter Estimates with \lambda = 0.95');
grid on;
subplot(2,1,2);
plot(time, traceP, 'k-', 'LineWidth', 2);
xlabel('Time Step k');
ylabel('Trace of Covariance Matrix P');
title('Trace of Covariance Matrix P over Time');
grid on;
fprintf('Final parameter estimates:\n');
fprintf('theta1 = %.4f\n', theta_est(1,end));
fprintf('theta2 = %.4f\n', theta_est(2,end));
fprintf('theta3 = %.4f\n', theta_est(3,end));
The RLS algorithm converges quickly, stabilizing the parameter estimates ($\theta_1$, $\theta_2$, and $\theta_3$) early on with the chosen forgetting factor $\lambda = 0.95$. Between $k = 100$ and $k = 300$ the estimates remain stable because the input $u(k)$ is held constant. However, when the input switches back to a random binary signal after $k = 300$, $\theta_3$ shows significant fluctuations, indicating that it is the most sensitive to input variations. The trace of the covariance matrix $P$ rises sharply between $k = 200$ and $k = 300$, reflecting the increased uncertainty caused by the poorly exciting input, and it starts to decrease after $k = 300$, showing that the algorithm adapts to the new input conditions.
c)
%% MATLAB Code for Part (c): Time-Varying Parameters
clear all; close all; clc;
rng(109);
N = 400; % Total number of data points
sigma_e = 2; % Standard deviation of noise (sqrt(variance)=2)
y = zeros(N,1); % Pre-allocate output vector
u = zeros(N,1); % Pre-allocate input vector
%% Generate the Input Signal u(k)
% For k = 1 to 99, u(k) is random binary: +1 or -1
u(1:99) = 2*(rand(99,1) > 0.5) - 1;
% For k = 100 to 300, the input is held constant at u(k) = 0
u(100:300) = 0;
% For k = 301 to 400, u(k) returns to random binary: +1 or -1
u(301:400) = 2*(rand(100,1) > 0.5) - 1;
%% Simulate the Time-Varying System
y(1:2) = 0;
for k = 3:N
e = sigma_e * randn;
if k < 250
% Before time step 250: original parameters
y(k) = 1.5*y(k-1) - 0.7*y(k-2) + u(k-1) + e;
else
% At and after time step 250: changed parameters
y(k) = 1.7*y(k-1) - 0.9*y(k-2) + 5*u(k-1) + e;
end
end
%% RLS Algorithm Setup
lambda = 0.95;
n_params = 3;
theta_est = zeros(n_params, N);
P = 1000 * eye(n_params);
traceP = zeros(N,1);
%% Run the RLS Algorithm
for k = 3:N
phi = [y(k-1); y(k-2); u(k-1)];
K = P * phi / (lambda + phi' * P * phi);
error = y(k) - phi' * theta_est(:, k-1);
theta_est(:, k) = theta_est(:, k-1) + K * error;
P = (P - K * phi' * P) / lambda;
traceP(k) = trace(P);
end
%% Plot the Results
time = 1:N; % Time vector
figure;
subplot(2,1,1);
plot(time, theta_est(1,:), 'r-', 'LineWidth', 2); hold on;
plot(time, theta_est(2,:), 'b-', 'LineWidth', 2);
plot(time, theta_est(3,:), 'g-', 'LineWidth', 2);
xlabel('Time Step k');
ylabel('Parameter Estimates');
legend('\theta_1','\theta_2','\theta_3','Location','Best');
title('RLS Parameter Estimates with a Parameter Change at k = 250');
grid on;
subplot(2,1,2);
plot(time, traceP, 'k-', 'LineWidth', 2);
xlabel('Time Step k');
ylabel('Trace(P)');
title('Trace of the Covariance Matrix P over Time');
grid on;
fprintf('Final parameter estimates:\n');
fprintf('theta1 = %.4f\n', theta_est(1,end));
fprintf('theta2 = %.4f\n', theta_est(2,end));
fprintf('theta3 = %.4f\n', theta_est(3,end));
Observations on the Results:
Parameter Estimates Before $k = 250$:
- The estimated parameters ($\theta_1$, $\theta_2$, and $\theta_3$) are stable and remain relatively constant, showing that the RLS algorithm has successfully converged to an initial solution.
Parameter Change at $k = 250$:
- At $k = 250$, the true system parameters change (to $1.7$, $-0.9$, and $5.0$). However, the estimated parameters take some time to reflect this change.
Excitation at $k = 300$:
- At $k = 300$ the input signal $u(k)$ switches back to a random binary signal, introducing significant variations in the estimated parameters.
- $\theta_3$ (green) is particularly affected, experiencing a sharp increase, indicating that it is more sensitive to changes in the input signal.
- This suggests that $\theta_3$ is closely linked to the input excitation and that the sudden change in signal structure strongly influences it.
Covariance Matrix Trace Behavior:
- The trace of the covariance matrix $P$ increases gradually from $k = 250$ and peaks around $k = 300$, which indicates a period of high uncertainty.
- Once the new parameter values are learned and the algorithm adapts, the trace begins to decrease, showing that confidence in the estimates is being restored.
Key Information:
- The RLS algorithm successfully tracks parameter changes but takes time to adapt.
- The impact of the signal excitation at $k = 300$ is significant, especially for $\theta_3$.
- The covariance trace confirms increased uncertainty after parameter and input changes but stabilizes as the algorithm adapts.
One comment
I can't follow it, but it looks impressive. Nice! ദ്ദി˶•̀֊•́)✧