FDM and CDM pilot (DM-RS) multiplexing among antennas

The link between a mobile device and the network is referred to as the “air interface”. This interface is defined by specification bodies such as 3GPP (Third Generation Partnership Project). The air interface defines the basic rules, or protocols, for transmission and reception between the mobile and the network. These rules include, but are not limited to: which frequency spectrum is used, which modulation method is used, which processing stages the information bits go through before transmission, and which waveform is used for transmission and reception.

The “air interface” waveform for 5G-NR is based on Orthogonal Frequency Division Multiplexing (OFDM). This means that the frequency band over which information is transmitted is first divided into multiple narrow bands. These narrow bands are called sub-carriers and are mutually orthogonal. Therefore, multiple modulated symbols can be transmitted in parallel on different sub-carriers.
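As a small numerical illustration of this orthogonality, the Python sketch below builds two complex-exponential sub-carriers over one symbol and computes their normalised inner product. The FFT size of 64 and the sub-carrier indices 3 and 7 are arbitrary choices for illustration, not values taken from the 5G-NR specification.

```python
import cmath

def subcarrier(k, n_fft):
    """Samples of the k-th sub-carrier over one OFDM symbol of n_fft samples."""
    return [cmath.exp(2j * cmath.pi * k * n / n_fft) for n in range(n_fft)]

def inner_product(a, b):
    """Normalised inner product (1/N) * sum(a[n] * conj(b[n]))."""
    return sum(x * y.conjugate() for x, y in zip(a, b)) / len(a)

N_FFT = 64  # assumed FFT size, for illustration only

# A sub-carrier correlated with itself has unit energy,
# while two different sub-carriers correlate to (numerically) zero.
same = abs(inner_product(subcarrier(3, N_FFT), subcarrier(3, N_FFT)))
different = abs(inner_product(subcarrier(3, N_FFT), subcarrier(7, N_FFT)))
print(same, different)
```

Because the cross term vanishes, a receiver can recover the symbol on one sub-carrier without interference from the others, as long as the orthogonality is preserved by the channel.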

In 5G-NR, the smallest time-frequency resource is known as a resource element (RE) and consists of one subcarrier in one OFDM symbol. Transmissions are scheduled in groups of 12 subcarriers, known as physical resource blocks (PRBs).

A wireless “channel” is the medium over which information is conveyed. This channel is unpredictable because of many factors, such as multipath and shadow fading, Doppler shift, and time dispersion or delay spread. These factors are all related to the variability introduced by the mobility of the user and the wide range of environmental conditions encountered as a result.

For reliable exchange and detection of the information conveyed, it is important that the receiver knows the properties of the communication channel. These properties are estimated at the receiver. To facilitate channel estimation, OFDM systems use reference signals (or pilot symbols) that are embedded in the resource blocks. The reference signals used for demodulation are known as DM-RS (demodulation reference signals).

When M transmit and N receive antennas are deployed, dedicated pilots for each transmit antenna are required to estimate a total of MN channels. Two pilot multiplexing modes are possible:

• FDM (Frequency Division Multiplexing): pilots for different antennas occupy different tones and are thus orthogonal in the frequency domain. Pilot sequences of different antennas can use the same Chu sequence. The length of the Chu sequence, or the number of pilot tones per antenna, is Np/M, where Np is the total number of pilot tones.


• CDM (Code Division Multiplexing): pilots for different antennas occupy the same time/frequency resources but are separated by different codes. The number of pilot tones per block for any transmit antenna is always Np. The orthogonal pilot sequences can be constructed in two different ways: from shifted versions of the same Chu sequence (frequency domain CDM), or from a Chu sequence with additional Walsh code separation across two reference signal blocks (time domain CDM).

  1. Frequency domain CDM: In the first construction, the orthogonality between pilot sequences of different antennas is achieved by exploiting the properties of shifted Chu sequences.
  2. Time domain CDM: In the second approach, each entry of the Chu sequence is further modulated by antenna-dependent Walsh codes.
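The time domain CDM idea can be sketched for two transmit antennas as follows. This is a minimal, noise-free Python illustration under stated assumptions: a length-7 Chu sequence with root 1, length-2 Walsh codes, a channel that stays constant over the two reference signal blocks, and made-up channel values. The function names are hypothetical.

```python
import cmath

def chu(length, root=1):
    """Chu sequence of odd length (assumed root index 1); every entry has unit magnitude."""
    return [cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length) for n in range(length)]

# Length-2 Walsh codes, one per transmit antenna, applied across two RS blocks.
WALSH = [(1, 1), (1, -1)]

def received_blocks(h, pilot):
    """Pilot tones seen by one receive antenna in RS block 0 and RS block 1."""
    return [[sum(h[a][k] * WALSH[a][t] for a in range(2)) * pilot[k]
             for k in range(len(pilot))]
            for t in range(2)]

def separate(blocks, pilot):
    """Despread with each antenna's Walsh code, then remove the known pilot."""
    return [[(WALSH[a][0] * blocks[0][k] + WALSH[a][1] * blocks[1][k]) / (2 * pilot[k])
             for k in range(len(pilot))]
            for a in range(2)]

pilot = chu(7)
h_true = [[0.9 + 0.1j] * 7, [-0.3 + 0.5j] * 7]   # made-up flat channels per antenna
h_est = separate(received_blocks(h_true, pilot), pilot)
```

The separation relies on the channel being the same in both blocks. If the channel changes between the two blocks, the sum and difference no longer cancel the other antenna exactly.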

Each scheme has its own advantages and disadvantages as far as channel estimation performance in various scenarios is concerned. For example, for 4×4 MIMO systems:

  1. MIMO system with CDM pilots consistently outperforms that with FDM pilots.
  2. At higher speed, CDM-f4-t1 scheme outperforms others because it does not rely on the time domain coherence between two long RS to separate the antenna streams.


Parseval’s theorem! Time-domain, frequency domain equivalence.

Parseval’s theorem can be used to show that

Total energy of waveform computed in time domain is equal to the total energy of the waveform’s Fourier Transform computed in the frequency domain.


Total energy of a signal can be calculated by summing power-per-sample across time or spectral power across frequency.

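This is expressed by the equation below, where x(n) is the time-domain signal of length N and X(k) is its discrete Fourier transform (computed without a scaling factor in the forward transform):

\sum_{n=0}^{N-1} |x(n)|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |X(k)|^2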

This is often useful when implementing wireless communication systems: the time and frequency domain equivalence gives the flexibility to calculate the total energy of a waveform in whichever domain is more convenient.

Let’s see this with an example in MATLAB.

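close all;
t = -5:0.01:5; x = exp(-t.^2);   % example signal (assumed; the original definition is missing)
fx = fft(x)/sqrt(length(x));     % DFT scaled by 1/sqrt(N) so that both energies match
E1_timedomain = sum(abs(x).^2), E1_frequdomain = sum(abs(fx).^2)
subplot(1,2,1), area(t,abs(x.^2)), title(' Time Domain');
subplot(1,2,2), area(abs(fx)), title(' Frequency Domain');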

Output of the MATLAB script can be seen below. This shows the equivalence in time and frequency domain.

E1_timedomain =
E1_frequdomain =
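The same equivalence can be cross-checked without MATLAB. The Python sketch below uses a direct (slow) DFT from the standard library and an arbitrary Gaussian-shaped test signal; both the signal and the variable names are assumptions for illustration and are not taken from the referenced script.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) discrete Fourier transform, no scaling in the forward direction."""
    n_len = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_len) for n in range(n_len))
            for k in range(n_len)]

# Arbitrary test signal: a sampled Gaussian pulse (assumed, for illustration only).
x = [math.exp(-((n - 16) / 4.0) ** 2) for n in range(32)]
X = dft(x)

energy_time = sum(abs(v) ** 2 for v in x)
energy_freq = sum(abs(v) ** 2 for v in X) / len(x)   # 1/N factor from Parseval's theorem
print(energy_time, energy_freq)
```

The two printed values agree up to floating-point rounding. The 1/N factor belongs to this particular DFT convention; with a 1/sqrt(N) scaling inside the transform it disappears.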


Youssef Khmou (2021). Parseval’s Theorem : 1D,2D and 3D functions (https://www.mathworks.com/matlabcentral/fileexchange/43041-parseval-s-theorem-1d-2d-and-3d-functions), MATLAB Central File Exchange. Retrieved January 15, 2021.

3GPP Rel-16: 2-step RACH Procedure

A random-access channel (RACH) is so named because it refers to a wireless channel (medium) that may be shared by multiple UEs and used by the UEs to (randomly) access the network for communications. Some of the use cases of RACH in a wireless communication system are given below.

  1. RACH is used for call setup and to access the network for data transmissions.
  2. RACH is used for initial access to a network when the UE switches from radio resource control (RRC) idle mode to RRC connected mode, or when handing over in RRC connected mode. Moreover, RACH may be used for downlink (DL) and/or uplink (UL) data arrival when the UE is in RRC idle or RRC inactive mode, and when re-establishing a connection with the network.
  3. RACH is used to request uplink scheduling if no dedicated scheduling-request resource has been assigned to UE.

3GPP release 15 defines a four step RACH procedure. Given below is an example four-step RACH procedure.

  1. A first message (MSG1) is sent from the UE to gNB on the physical random-access channel (PRACH). MSG1 includes a RACH preamble. This preamble is designed such that it can be detected even when there is lack of accurate timing information.
  2. The gNB responds with a random-access response (RAR) message (MSG2), which includes the identifier (ID) of the detected RACH preamble, a timing advance (TA), an uplink grant, a temporary cell radio network temporary identifier (TC-RNTI), and a back-off indicator. MSG2 is scheduled by control information on the PDCCH and carried on the PDSCH.
  3. In response to MSG2, the UE transmits MSG3 to the gNB on the PUSCH. Depending on what triggered the procedure, MSG3 may include an RRC connection request, a tracking area update, or a scheduling request.
  4. The gNB then responds with MSG4 which includes a contention resolution message.
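The four messages can be summarised as a small data structure. The Python sketch below is only a bookkeeping aid for the list above. The class and field names are hypothetical, and the channel given for MSG4 (PDSCH) is an addition that the list above does not state.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RachMessage:
    name: str      # message label used in the text
    sender: str    # "UE" or "gNB"
    channel: str   # physical channel carrying the message
    content: str   # main payload, abbreviated

FOUR_STEP = (
    RachMessage("MSG1", "UE", "PRACH", "RACH preamble"),
    RachMessage("MSG2", "gNB", "PDSCH (scheduled via PDCCH)", "random-access response"),
    RachMessage("MSG3", "UE", "PUSCH", "e.g. RRC connection request"),
    RachMessage("MSG4", "gNB", "PDSCH", "contention resolution"),  # channel assumed
)

def senders_alternate(messages):
    """True if consecutive messages always come from different sides of the link."""
    return all(a.sender != b.sender for a, b in zip(messages, messages[1:]))

for msg in FOUR_STEP:
    print(f"{msg.name}: {msg.sender} on {msg.channel}: {msg.content}")
```

Each change of sender is one turnaround of the link, so the four-step procedure needs two full UE-to-gNB round trips before contention is resolved.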

In Release 16, 3GPP named the legacy random access procedure the 4-step RA type (Type-1 RA procedure) and the new procedure the 2-step RA type (Type-2 RA procedure). As the name implies, the two-step RACH procedure effectively “collapses” the four messages of the four-step RACH procedure into two messages. An example two-step RACH procedure is given below.

  1. A first enhanced message (msgA) is sent from the UE to the gNB. In this case, msgA includes a RACH preamble for random access and a payload on the PUSCH, which effectively combines MSG1 and MSG3 described above. The msgA payload includes, for example, the UE-ID and other signalling information such as a buffer status report (BSR) or a scheduling request (SR). This message is retransmitted with step-wise increased power until a response (msgB) is received.
  2. The gNB responds with a random access response (RAR) message (msgB), which effectively combines MSG2 and MSG4 described above. For example, msgB includes the ID of the RACH preamble, a timing advance (TA), a back-off indicator, a contention resolution message, a UL/DL grant, and transmit power control (TPC) commands.
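The collapse of four messages into two, and the step-wise power increase of msgA, can be sketched as below. This Python fragment is a simplified illustration: the function name, the linear ramp, and the numeric values (start power, step, cap) are assumptions chosen for the example, not parameters quoted from the 3GPP specifications.

```python
# Which legacy messages each two-step message combines (from the description above).
TWO_STEP_MAPPING = {
    "msgA": ("MSG1", "MSG3"),
    "msgB": ("MSG2", "MSG4"),
}

def msga_tx_powers(initial_dbm, step_db, max_dbm, attempts):
    """Transmit power of each msgA attempt: a linear ramp, capped at max_dbm."""
    return [min(initial_dbm + k * step_db, max_dbm) for k in range(attempts)]

# Hypothetical numbers: start at 0 dBm, ramp by 2 dB per retry, cap at 5 dBm.
powers = msga_tx_powers(initial_dbm=0.0, step_db=2.0, max_dbm=5.0, attempts=5)
print(powers)
```

Every extra attempt repeats the PUSCH payload as well as the preamble, which is why the two-step procedure pays off mainly when msgA is detected on one of the first attempts.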

The benefits of the 2-step RACH procedure are given below.

  1. The benefit of the 2-step RACH procedure is seen when msgA is detected by the gNB quickly, without repeated transmissions. Otherwise, there is additional overhead (in comparison with the 4-step procedure) from retransmitting the msgA payload.
  2. Also, in case of operation in unlicensed spectrum, collapsed RACH procedure implies a reduced number of LBT (Listen Before Talk) operations with a corresponding reduction in overhead and delay.
  3. 2-step RACH requires less UE processing compared to 4-step RACH. Hence it has the benefit of power saving especially if a UE is under the scenario with small data traffic which requires the UE to wake up and transmit data intermittently.


  1. Power Saving Techniques for 5G and Beyond: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9112193



Polyphase filters and non data aided timing synchronization!

In non-data-aided synchronisation methods, the Early/Late Gate clock recovery method is used to estimate the offset between transmit and receive symbol timing. This method uses 3 samples per symbol: one at the optimal sample time, one delayed by one sample, and one advanced by one sample.

“This approach works by setting up two filterbanks; one filterbank contains the signal’s pulse shaping matched filter (such as a root raised cosine filter), where each branch of the filterbank contains a different phase of the filter. The second filterbank contains the derivatives of the filters in the first filterbank. Thinking of this in the time domain, the first filterbank contains filters that have a sinc shape to them. We want to align the output signal to be sampled at exactly the peak of the sinc shape. The derivative of the sinc contains a zero at the maximum point of the sinc (sinc(0) = 1, sinc(0)’ = 0). Furthermore, the region around the zero point is relatively linear. We make use of this fact to generate the error signal. If the signal out of the derivative filters is d_i[n] for the ith filter, and the output of the matched filter is x_i[n], the error is calculated as:

e[n] = (Re{x_i[n]} * Re{d_i[n]} + Im{x_i[n]} * Im{d_i[n]}) / 2.0

This equation averages the error in the real and imaginary parts. There are two reasons we multiply by the signal itself. First, if the symbol could be positive or negative going, but we want the error term to always tell us to go in the same direction depending on which side of the zero point we are on. The sign of x_i[n] adjusts the error term to do this. Second, the magnitude of x_i[n] scales the error term depending on the symbol’s amplitude, so larger signals give us a stronger error term because we have more confidence in that symbol’s value. Using the magnitude of x_i[n] instead of just the sign is especially good for signals with low SNR. The error signal, e[n], gives us a value proportional to how far away from the zero point we are in the derivative signal.”
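The error formula in the quotation translates directly into code. The Python sketch below computes e[n] for a single pair of filter outputs; the function name and the sample values are made up for illustration.

```python
def timing_error(x, d):
    """e[n] = (Re{x} * Re{d} + Im{x} * Im{d}) / 2 for one matched-filter output x
    and the corresponding derivative-filter output d (both complex)."""
    return (x.real * d.real + x.imag * d.imag) / 2.0

# A positive-going symbol sampled slightly off the peak (made-up values).
e_pos = timing_error(1 + 1j, 0.2 + 0.2j)

# The negated symbol: the pulse and its derivative both flip sign, and the
# product keeps the error pointing in the same direction.
e_neg = timing_error(-1 - 1j, -0.2 - 0.2j)
print(e_pos, e_neg)
```

Both cases give the same error, which shows the first reason mentioned in the quotation for multiplying by the signal itself: the sign of the symbol cancels out of the error term.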

1. It is said that “the filters corresponding to the early and late gate filters are trivially the polyphase segments (k – 1) and (k + 1) when testing polyphase segment (k).” What is the theory behind this?

2. Suppose I have a 256:1 decimation filter with an input sampling frequency of 256 kHz and a total of 256*8 taps (or 256 filter banks with 8 taps each). These are linear-phase FIR filters, each filter bank having a data rate of 1 kHz. Consider the following two scenarios:

i) I start with polyphase filter bank index 1, carry on convolving input data, and roll over the polyphase bank index at 256.

ii) I start with polyphase filter bank index 2 and roll over the polyphase bank index at 256. How does this result in a timing adjustment?

There are many parts to the above questions, and it is good to understand them separately.

a) First, understand how the early/late gate works in terms of getting a timing estimate. At this stage, ignore polyphase filters.
So you have a simple system which uses a pulse-shaping filter at the transmitter and a matched filter at the receiver. Assume that both filters are box shaped.
After correlation you have a triangle-like shape. The correct sampling time is the peak, and the two samples to the left and right of it can be used to estimate the error.
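A minimal Python sketch of this triangle picture is given below. It assumes an ideal, noise-free triangle of unit height and unit half-width, and takes the error as the late sample minus the early sample; the spacing of 0.25 and the function names are arbitrary choices for illustration.

```python
def triangle(t, half_width=1.0):
    """Matched-filter output for a box-shaped pulse: a triangle that peaks at t = 0."""
    return max(0.0, 1.0 - abs(t) / half_width)

def early_late_error(sample_time, spacing=0.25):
    """Late sample minus early sample; zero when sample_time sits exactly on the peak."""
    early = triangle(sample_time - spacing)
    late = triangle(sample_time + spacing)
    return late - early

on_time = early_late_error(0.0)     # sampling at the peak
too_late = early_late_error(0.1)    # sampling after the peak
too_early = early_late_error(-0.1)  # sampling before the peak
print(on_time, too_late, too_early)
```

With this sign convention, a negative error means the sampling instant is late and should be pulled back, and a positive error means it is early. Near the peak the error grows linearly with the offset, which is what makes it usable as a loop input.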

b) Next concept to understand: using a regular filter (no polyphase), we can change the index of the filter and get a timing change. To see this, note that the sequence of outputs is normally:

output1 =  x0*f0 + x1*f1 + x2*f2 + x3*f3 + x4*f4  

output2 = x1*f0 + x2*f1 + x3*f2 + x4*f3 + x5*f4 
How can output2 be made almost the same as output1? Change the index of the filter:
 new_output2 = x1*f1 + x2*f2 + x3*f3 + x4*f4 +x5*f0

Only the last term is different from what we wanted in terms of timing change.
This error is small and will only last for the symbol where we made the timing correction.
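The claim can be checked numerically. In the Python sketch below, the sample values x0..x5 and the taps f0..f4 are arbitrary numbers chosen for illustration; the code rotates the tap order exactly as in new_output2 above and compares the result with output1.

```python
x = [0.5, 1.0, -0.3, 0.8, 0.2, -0.6]   # x0..x5, arbitrary input samples
f = [0.1, 0.25, 0.3, 0.25, 0.1]        # f0..f4, arbitrary filter taps

def dot(samples, taps):
    """Sum of element-wise products of equally long sample and tap lists."""
    return sum(s * t for s, t in zip(samples, taps))

output1 = dot(x[0:5], f)               # x0*f0 + x1*f1 + ... + x4*f4
output2 = dot(x[1:6], f)               # x1*f0 + x2*f1 + ... + x5*f4

rotated = f[1:] + f[:1]                # f1, f2, f3, f4, f0
new_output2 = dot(x[1:6], rotated)     # x1*f1 + ... + x4*f4 + x5*f0

# Only one product differs: x5*f0 takes the place of x0*f0.
residual = new_output2 - output1
expected_residual = x[5] * f[0] - x[0] * f[0]
print(residual, expected_residual)
```

The residual depends only on the end tap f0, which is typically small for a well-designed pulse-shaping filter, so the disturbance caused by the index change is small.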

c) Finally, let us look at polyphase filters. A polyphase filter is nothing but a filter; it only has a faster implementation. It is mostly used in decimation/interpolation. This is an extension of the above.

Now extend the example above for timing correction using a regular filter.

Divide the normal filter into polyphase sub-filters. So, the normal filter {f0, f1, f2, f3, f4, f5} is divided into three polyphase sub-filters P0, P1, P2.

P0 = {f0, f3}, P1 = {f1, f4}, P2 = {f2, f5}. Apply the same principle of changing the filter bank index. The result differs from the timing-corrected output only in the contribution of one sub-filter. The question now is: shouldn’t output2 in this example be exactly the same as output1?

Shouldn’t Y(Advanced) be exactly the same as Y(1) in this example?

D1,D2,D3 are three delay lines.

State 1 : At Time t = -1

P0 = {f0, f3}  D1 = {x2, x5}
P1 = {f1, f4}  D2 = {x1, x4}
P2 = {f2, f5}  D3 = {x0, x3}

Output from each sub filter:
y0 = f0*x2 + f3*x5
y1 = f1*x1 + f4*x4
y2 = f2*x0 + f5*x3

Final output:
Y(-1) = y0 + y1 + y2 = f0*x2 + f3*x5 + f1*x1 + f4*x4 + f2*x0 + f5*x3

State 2 : At Time t = 0

Next sample x6 goes into delay line and oldest sample is flushed out of delay line.

P0 = {f0, f3}  D1 = {x1, x4}
P1 = {f1, f4}  D2 = {x0, x3}
P2 = {f2, f5}  D3 = {x6, x2}

Output from each sub filter:
y0 = f0*x1 + f3*x4
y1 = f1*x0 + f4*x3
y2 = f2*x6 + f5*x2

Final output:
Y(0) = y0 + y1 + y2 = f0*x1 + f3*x4 + f1*x0 + f4*x3 + f2*x6 + f5*x2

State 3 : At Time t = 1

Next sample x7 goes into delay line and oldest sample is flushed out of delay line.

P0 = {f0, f3}  D1 = {x0, x3}
P1 = {f1, f4}  D2 = {x6, x2}
P2 = {f2, f5}  D3 = {x7, x1}

Output from each sub filter:
y0 = f0*x0 + f3*x3
y1 = f1*x6 + f4*x2
y2 = f2*x7 + f5*x1

Final output:
Y(1) = y0 + y1 + y2 = f0*x0 + f3*x3 + f1*x6 + f4*x2 + f2*x7 + f5*x1

Change filter bank index

If you change the filter index at State 2, you will have the following configuration.

P0 becomes P2, P1 becomes P0 and P2 becomes P1.

P0 = {f2, f5}  D1 = {x1, x4}
P1 = {f0, f3}  D2 = {x0, x3}
P2 = {f1, f4}  D3 = {x6, x2}

Output from each sub filter:
y0 = f2*x1 + f5*x4
y1 = f0*x0 + f3*x3
y2 = f1*x6 + f4*x2

Final output:
Y(Advanced) = y0 + y1 + y2 = f2*x1 + f5*x4 + f0*x0 + f3*x3 + f1*x6 + f4*x2

Y(Advanced) is the same as Y(1) except for the output of one sub-filter: the f2 and f5 taps multiply x1 and x4 instead of x7 and x1.
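The comparison between Y(1) and Y(Advanced) can be done mechanically by listing which tap multiplies which sample. The Python sketch below encodes the delay-line contents of State 2 and State 3 as index pairs and compares the resulting sets of products; the representation as (tap index, sample index) pairs is a bookkeeping choice made for this illustration.

```python
# Sub-filter number -> indices of its taps: P0 = {f0, f3}, P1 = {f1, f4}, P2 = {f2, f5}.
P = {0: (0, 3), 1: (1, 4), 2: (2, 5)}

# Sample indices held in delay lines D1, D2, D3 (taken from the states above).
D_STATE2 = [(1, 4), (0, 3), (6, 2)]   # t = 0
D_STATE3 = [(0, 3), (6, 2), (7, 1)]   # t = 1

def products(branch_order, delay_lines):
    """Set of (tap index, sample index) pairs that are multiplied and summed."""
    pairs = set()
    for branch, line in zip(branch_order, delay_lines):
        pairs |= set(zip(P[branch], line))
    return pairs

y1 = products([0, 1, 2], D_STATE3)      # Y(1): normal branch order at t = 1
y_adv = products([2, 0, 1], D_STATE2)   # Y(Advanced): rotated branch order at t = 0

common = y1 & y_adv
only_in_y1 = y1 - y_adv
only_in_adv = y_adv - y1
print(sorted(common), sorted(only_in_y1), sorted(only_in_adv))
```

Four of the six products are shared. The two that differ both belong to the sub-filter with taps f2 and f5, so the mismatch is confined to that one branch.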

Now is the time to think about interpolation. Let’s understand this.

You are receiving 10 samples.
After matched filtering you get output samples y0,y1,y2…y9.

When there is a timing error, you want the samples at the shifted instants instead: y0+delta, y1+delta … y9+delta, where “delta” is less than a sample.

What are the methods:
A) Take y0,y1,y2,y3…y9. Upsample by a factor of 100 (say). That is, put 99 zeros between y0 and y1, between y1 and y2, and so on.
Now use a perfect interpolation filter to generate samples between y0 and y1, between y1 and y2, and so on up to y9. So, you have z0,z1,z2… z900. Decimate this output by 100 at the ‘correct’ phase to get the desired output with the timing change.

B) But the operation above is wasteful in that it generates samples we will throw away. That is, we generate z0 to z900 but use only, say, z2,z102,z202 etc. So why not generate only z2,z102,z202 and so on?

Ignore polyphase, how can you do this?

Notice that z2 = some filter operating on y0,y1,y2…y9. The zeros in between do not affect the output. So, z2 = filter2 operating on y0,y1,y2…y9
z3 = filter3 operating on y0,y1,y2…y9
where the coefficients of filter2 and filter3 are different. At this point, there is no glitch.

So what does this mean in terms of an algorithm:
When there is a timing change, use a different set of coefficients to generate the output.

When the timing change is so large that you have to ‘drop’ an input sample or ‘zero stuff’ an input sample, you have a glitch.

C) Finally, you can use the matched Nyquist filter as an interpolation filter and do the matched filtering and interpolation in one go, instead of the two-step process described here. The process of using different filter coefficients is implemented with a polyphase filter.
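Methods A and B can be compared directly in code. The Python sketch below uses an interpolation factor of 4 instead of 100 and a simple linear-interpolation filter instead of a perfect one, both to keep the example short; these values, the input samples, and the function names are assumptions for illustration.

```python
def upsample_and_filter(y, h, factor):
    """Method A: zero-stuff y by `factor`, then run the full FIR filter h."""
    up = [0.0] * (len(y) * factor)
    for n, value in enumerate(y):
        up[n * factor] = value
    z = []
    for m in range(len(up)):
        acc = 0.0
        for j, tap in enumerate(h):
            if m - j >= 0:
                acc += tap * up[m - j]
        z.append(acc)
    return z

def polyphase_branch(y, h, factor, phase):
    """Method B: compute only the outputs of one phase, using the taps h[phase::factor]."""
    branch = h[phase::factor]
    out = []
    for n in range(len(y)):
        acc = 0.0
        for q, tap in enumerate(branch):
            if n - q >= 0:
                acc += tap * y[n - q]
        out.append(acc)
    return out

FACTOR = 4
h = [0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25]   # linear interpolator for factor 4
y = [0.3, -1.2, 0.8, 0.5, -0.7, 1.1, 0.0, -0.4, 0.9, 0.2]

full = upsample_and_filter(y, h, FACTOR)
# Keeping every 4th sample of the full output at a given phase equals that phase's branch.
matches = [full[p::FACTOR] == polyphase_branch(y, h, FACTOR, p) for p in range(FACTOR)]
print(matches)
```

Method B produces the same samples with a fraction of the multiplications, because the products with stuffed zeros are never computed. Changing the timing then amounts to selecting a different `phase`, that is, a different set of coefficients.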

~~ cheers ~~ Dheeraj

ZC sequences and application in LTE

A Zadoff-Chu (ZC) sequence is a polyphase sequence that is widely used in LTE for the primary synchronization signal, PRACH, PUCCH DMRS, PUSCH DMRS and sounding reference signals (SRS). This is because a ZC sequence has the following properties.

  1. A prime-length ZC sequence has ideal cyclic autocorrelation: its correlation with a cyclically shifted version of itself is zero for every non-zero relative shift. In other words, the autocorrelation is non-zero only at the one lag that corresponds to the cyclic shift. This also means that differently shifted versions of the same sequence are orthogonal to each other. Orthogonal sequences are used widely in communication systems, and this property makes them easy to generate: just cyclically shift a ZC sequence.
  2. Another important property of the ZC sequence is its cyclic cross-correlation property. It can be stated as follows: “The absolute value of the cyclic crosscorrelation function between any two ZC sequences is constant and equal to 1/sqrt(N_ZC), if |u1 - u2| is relatively prime with respect to N_ZC.” Here, u1 and u2 are root indices and N_ZC is the sequence length.
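Both properties can be verified numerically. The Python sketch below is a minimal check, assuming the common odd-length definition exp(-j*pi*u*n*(n+1)/N_ZC) and arbitrary example values N_ZC = 31 with roots 3 and 5; it is not one of the Octave scripts mentioned in this post.

```python
import cmath
import math

def zadoff_chu(root, n_zc):
    """ZC sequence of odd length n_zc with root index `root`."""
    return [cmath.exp(-1j * cmath.pi * root * n * (n + 1) / n_zc) for n in range(n_zc)]

def cyclic_correlation(a, b, lag):
    """Normalised cyclic correlation (1/N) * sum(a[(i + lag) mod N] * conj(b[i]))."""
    n = len(a)
    return sum(a[(i + lag) % n] * b[i].conjugate() for i in range(n)) / n

N_ZC = 31                        # example prime length (assumed)
zc_a = zadoff_chu(3, N_ZC)       # example roots 3 and 5 (assumed)
zc_b = zadoff_chu(5, N_ZC)

# Property 1: autocorrelation is 1 at lag 0 and 0 at every other lag.
auto = [abs(cyclic_correlation(zc_a, zc_a, lag)) for lag in range(N_ZC)]

# Property 2: cross-correlation magnitude is constant at 1/sqrt(N_ZC).
cross = [abs(cyclic_correlation(zc_a, zc_b, lag)) for lag in range(N_ZC)]
print(auto[0], max(auto[1:]), min(cross), max(cross), 1 / math.sqrt(N_ZC))
```

The zero autocorrelation at non-zero lags is what allows several orthogonal sequences to be derived from one root by cyclic shifting, while the flat cross-correlation bounds the interference between sequences generated from different roots.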

Attached are octave scripts that illustrate these properties.





Least-Square circle fit and DC-offset estimation


Zero-IF radio receivers generate a DC offset, because of a phenomenon called “self-mixing”, that can be much greater than the desired signal. The undesired effects of DC offset are as follows:

  1. Low level frequency amplifiers can be saturated by large DC offsets before the desired signal is amplified.
  2. The DC offset should be removed before frequency correction of the received baseband signal is performed; otherwise the frequency correction turns the DC offset into a tone, with the AFC loop correcting the DC offset.


Therefore, wireless communication systems that utilize Zero-IF radios have to overcome large DC offsets in the baseband signal.

In wireless communication systems where the modulated symbols lie on a circle, the center of the circle can be seen as the “DC offset”, and the radius of the circle is indicative of the power of the data signal. With no DC offset, the center of the circle lies at the origin. Any DC offset displaces the coordinates of the center of the circle (as can be seen from the figure below).

Therefore, the problem of DC-offset estimation can be seen as finding the coordinates of the center of the displaced circle. Once the coordinates of the center point are obtained, they can be subtracted from the received signal, thereby obtaining the corrected, DC-offset-free signal.

Below is an Octave script that estimates the center of the circle using the least-squares method.

% Create data for a circle + noise
R = 1;                        % true radius (assumed; the original value is missing)
th = linspace(0,2*pi,20)';
sigma = R/10;
x = R*cos(th)+randn(size(th))*sigma;
y = R*sin(th)+randn(size(th))*sigma;
figure, plot(x,y,'o'), title(' measured points')

% Details and derivation of Least square circle fit algorithm
% here https://dtcenter.org/met/users/docs/write_ups/circle_fit.pdf

% coordinates of the barycenter
x_m = mean(x);
y_m = mean(y);

% calculation of the reduced coordinates
u = x - x_m;
v = y - y_m;

% linear system defining the center (uc, vc) in reduced coordinates:
%    Suu * uc +  Suv * vc = (Suuu + Suvv)/2
%    Suv * uc +  Svv * vc = (Suuv + Svvv)/2

uu = u.^2;
vv = v.^2;

Suv  = sum(u.*v);
Suu  = sum(u.^2);
Svv  = sum(v.^2);

Suuv = sum(uu.*v);
Suvv = sum(u.*vv);
Suuu = sum(u.^3);
Svvv = sum(v.^3);

% Solving the linear system
A = [ Suu, Suv; Suv, Svv];
B = [ (Suuu + Suvv)/2; (Svvv + Suuv)/2 ];
[ss] = A\B;

uc = ss(1);
vc = ss(2);
xc_hat = x_m + uc
yc_hat = y_m + vc

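% radius: mean distance of the measured points from the estimated center
R_hat     = mean(sqrt((x-xc_hat).^2 + (y-yc_hat).^2));

% reconstruct circle from data
xe = R_hat.*cos(th)+xc_hat; ye = R_hat.*sin(th)+yc_hat;
figure, plot(x,y,'o', xe,ye,'-', R*cos(th),R*sin(th),'--'), title('measured fitted and true circles')
text(xc_hat, yc_hat, sprintf('center (%g , %g );  R=%g',xc_hat,yc_hat,R_hat))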
xlabel x, ylabel y
axis equal
Result, without noise.
Result, with noise
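As a cross-check of the fit in the DC-offset setting, the Python sketch below applies the same least-squares equations to noise-free points on a circle with a known, made-up offset of (0.3, -0.2) and then subtracts the estimated center. The 2x2 system is solved with Cramer's rule; the function name and the test values are assumptions for illustration.

```python
import math

def fit_circle(points):
    """Least-squares circle fit in reduced coordinates; returns (xc, yc, radius)."""
    n = len(points)
    x_m = sum(p[0] for p in points) / n
    y_m = sum(p[1] for p in points) / n
    u = [p[0] - x_m for p in points]
    v = [p[1] - y_m for p in points]
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    suv = sum(a * b for a, b in zip(u, v))
    suuu = sum(a ** 3 for a in u)
    svvv = sum(b ** 3 for b in v)
    suvv = sum(a * b * b for a, b in zip(u, v))
    suuv = sum(a * a * b for a, b in zip(u, v))
    # Solve [suu suv; suv svv] * [uc; vc] = [(suuu+suvv)/2; (svvv+suuv)/2].
    b1 = (suuu + suvv) / 2.0
    b2 = (svvv + suuv) / 2.0
    det = suu * svv - suv * suv
    uc = (b1 * svv - b2 * suv) / det
    vc = (suu * b2 - suv * b1) / det
    xc, yc = x_m + uc, y_m + vc
    radius = sum(math.hypot(p[0] - xc, p[1] - yc) for p in points) / n
    return xc, yc, radius

# Constant-modulus samples with a made-up DC offset of (0.3, -0.2) and radius 1.5.
angles = [2 * math.pi * k / 16 for k in range(16)]
received = [(0.3 + 1.5 * math.cos(a), -0.2 + 1.5 * math.sin(a)) for a in angles]

xc_hat, yc_hat, r_hat = fit_circle(received)
corrected = [(px - xc_hat, py - yc_hat) for px, py in received]  # DC offset removed
print(xc_hat, yc_hat, r_hat)
```

Without noise the fit recovers the offset and radius to within rounding error, and the corrected samples are centered on the origin. With noise the estimate is approximate, and its accuracy depends on how many points are used and how well they cover the circle.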