# Crystal clean PCM 8bit samples on the poor PSG

Page 1/14
| 2 | 3 | 4 | 5 | 6

Ok this is time to share a nice routine for playing 8bit PCM samples on the PSG without the ugly spikes that are generated by the previous conventional techniques that play sample by sample each PCM sample using a 3 channels table.

Go to http://map.tni.nl/articles/psg_sample.php for details on the “standard” solution.

In a word, the standard solution associates each 8bit sample to a triplet of channel levels suitably chosen in order to minimize the error of the PSG overall output level with respect to the given sample. As the sum of the 3 channels leads to 608 disjoint values, you would expect a the output quality of a 9bit quantizer, but, unfortunately, the resulting levels are oddly spaced, so the theorical quality corresponds in dB to about 48dB of SNR, which is turn is equivalent to a 8bit uniform quantizer.

Problem:
The “standard” solution has a very annoying problem: the z80 cannot change the 3 PSG channels at the same time, this implies that, during the transition between two successive samples, the PSG passes trough undefined states where some channels are from the previous triplet and some channels are from the triplet corresponding to the next sample.

The result is that, at each transition between two successive samples, the PSG can produce 1 or 2 spikes of short duration. This effect, repeated at each sample transition, becomes a very annoying wideband noise that sensibly reduce the overall quality of the output, far below the optimal 48dB of SNR.

Solution:
The sole solution is to control the PSG trough every transition.

As the z80 can change a single channel at time and each channel has 16 values, the PSG can be modelled at any time as a state machine with 256 states, 16 inputs, and 608 outputs.
The 256 states are the 16*16 values of the old channels (those that don’t change at the current time), the 16 inputs are the allowed values of the current channel, the 608 outputs are the sum of the volume values from the “state” and of the volume value from the current input.
Each sample transition corresponds to three steps of the state machine.
Assume ABC the current values playing and A’B’C’ the new values.
The transitions are:

ABC -> ABC’
ABC’-> AB’C’
AB’C’-> A’B’C’

So, between the level ABC corresponding to the previous sample and the level A’B’C’ corresponding to the next sample, we get :
ABC
ABC’
AB’C’
A’B’C’

Assume x be the sequence of input samples, we want to minimize the mean square error between the x and the output sequence of levels in any single transition, this implies we need to develop a search algorithm that explores all the sequences of state transitions of the PSG corresponding to the allowed outputs, and finds the best sequence of channel levels that minimize the mean square error.

The algorithm is already existing, and is the classical Viterbi algorithm, I do not run into details here, but in case I will go back on this classical topic.

The output of such an algorithm is the sequence of levels for each channel that encodes the input sequence x.

Now the new problem: each triplet is 4*3= 12 bit : a huge waste of space!
Solution run length encoding.
The best solution should be a RLE based on nibbles, as some "detail channels" can have very long bursts of non equal symbols. Due to lack of time and of energy I did as you can see in the following.
I know that the implementation could be improved, but I cannot fit any a better solution in 447 t cycles (8KHz of PCM player).
Enjoy

```
Fs          equ 8192  ;     alternative 8000 or 8192

;------------------
MACRO  PsgW reg,value
ld a,reg    ; select AY channel A volume register ; 7+1
out (#A0),a                                       ; 11+1
ld  a,value                                       ; 7+1
out (#A1),a ; and write back to the AY            ; 11+1
ENDM                                              ; total 36+4 = 40T
;------------------

MACRO WaitXX
if (Fs==8000)
.3 nop          ; 3*5 = 15
else
if  (Fs==8192)
nop             ; 5
else
assert 0
endif
endif
ENDM            ; total 15 or 5

MACRO Wait68
.2 ex (sp),hl   ; 2*20 =40
.4 nop          ; 4*5 = 20
cp 0            ; 8
ENDM            ; total 68

;------------------

MACRO next  reg
ld  a,(hl)      ; 8
.4 rrca         ; 4*5
and 15          ; 8
ld  reg,a       ; 5
ld  a,(hl)      ; 8
and 15          ; 8
exx             ; 5
ld  reg,a       ; 5
exx             ; 5
inc hl          ; 7
ENDM            ; tot = 79

;------------------

OUTPUT wavplay.com

ORG 100h

START:
di
PsgW 0,255
PsgW 1,255
PsgW 2,255
PsgW 3,255
PsgW 4,255
PsgW 5,255
PsgW 6,255
PsgW 7,10111111B

;-------------------------------------
; IN   HL - Encoded sample start address
;      HL'- Sample length (#pcm saples)
;-------------------------------------

ld  hl, SAMPLE+2
ld  bc,#0101
ld  de,bc
exx
ld  hl, (SAMPLE)
ld  c,#A1

.LOOP:
exx             ; 5
dec b           ; 5
ld  a,15         ; 8
and b            ; 5
jp nz,waitA     ; 11
next  b         ; 79
1       dec d           ; 5
ld  a,15         ; 8
and d            ; 5
jp nz,waitB     ; 11
next  d         ; 79
2       dec e           ; 5
ld  a,15         ; 8
and e            ; 5
jp nz,waitC     ; 11
next  e         ; 79
3       exx             ; 5

ld  a,8         ; 8
out (#a0),a     ; 12
inc a           ; 5
out (c),b       ; 14
out (#a0),a     ; 12
inc a           ; 5
out (c),d       ; 14
out (#a0),a     ; 12
out (c),e       ; 14

WaitXX          ; 15 or 5

dec hl          ; 7
ld  a,h         ; 5
or  l           ; 5
jp  nz,.LOOP    ; 11
; tot  = 447T or 437T
ret

waitA   Wait68          ; 68
jp   1B         ; 11
; tot = 79

waitB   Wait68          ; 68
jp   2B         ; 11
; tot = 79

waitC   Wait68          ; 68
jp   3B         ; 11
; tot = 79

;       b7b6b5b4|b3b2b1b0
;       NumTick |LevVolum

SAMPLE:
include "C:\MATLAB6p5\work\out.txt"

FINISH:
```

To whom is interested in going deeper in the way the Viterbi optimization is performed

```[y,Fs,nbit,opt]=wavread('bluetooth_input.wav');

y = (y-min(y)) /(max(y)-min(y)) * 1.15;                % input tra 0 e 1.15

f = 3579545;
fpcm = Fs;

tp = f/fpcm; % 447 = 8000Hz, 437 = 8192Hz
dt1 = (12+5+14)/tp;
dt2 = (12+14)/tp;
dt3 = 1-dt1-dt2;

N = 3*length(y);

x = zeros(1,N);
x(1:3:end) = y;
x(2:3:end-3) = y(1:end-1)*(1-dt1)+y(2:end)*dt1;
x(3:3:end-3) = y(1:end-1)*(1-dt1-dt2)+y(2:end)*(dt1+dt2);
x(end-2:end)= y(end);

n=[0:15]';
vol=2.^(n/2)/2^7.5;
vol(1)=0;

nxtS = uint8(zeros(256,16));
for i=0:15
for j=0:15
for in=0:15
nxtS(i*16+j+1,in+1) = uint8(j*16 + in);
end
end
end

curV = zeros(256,16);
for i=0:15
for j=0:15
for in=0:15
curV(i*16+j+1,in+1) = vol(i+1)+vol(j+1)+vol(in+1);
end
end
end

Stt = uint8(zeros(256,N));
Itt = uint8(zeros(256,N));

L  = zeros(256,1);
St = uint8(ones(256,1));
It = uint8(ones(256,1));

for t =1:N
Ln = inf*ones(256,1);
if mod(t-1,1000)==0
proc = fix(t/N *100)
end
for cs=0:255
for in=0:15
cv = curV(cs+1,in+1);
ns = double(nxtS(cs+1,in+1));

switch mod(t-1,3)
case 0
Ltst = L(cs+1)+dt1*abs(x(t)-cv)^2;
case 1
Ltst = L(cs+1)+dt2*abs(x(t)-cv)^2;
case 2
Ltst = L(cs+1)+dt3*abs(x(t)-cv)^2;
end

if  Ln(ns+1) >= Ltst
Ln(ns+1) = Ltst;
St(ns+1) = cs;
It(ns+1) = in;
end
end
end
L = Ln;
Stt(:,t) = St;
Itt(:,t) = It;
end

[l,i] = min(L);

P = uint8(zeros(1,N));
I = uint8(zeros(1,N));

P(N) = Stt(i,N);
I(N) = Itt(i,N);
for t = (N-1):-1:1
P(t) = Stt(double(P(t+1))+1,t);
I(t) = Itt(double(P(t+1))+1,t);
end

V = zeros(1,N);
for t = 1:N
V(t) = curV(double(P(t))+1,double(I(t))+1);
end

er = sum(abs(x(1:3:end)-V(1:3:end)).^2)*dt1+...
sum(abs(x(2:3:end)-V(2:3:end)).^2)*dt2+...
sum(abs(x(3:3:end)-V(3:3:end)).^2)*dt3

en = sum(abs(x(1:3:end)).^2)*dt1+...
sum(abs(x(2:3:end)).^2)*dt2+...
sum(abs(x(3:3:end)).^2)*dt3

disp('SNR db=')
SNR = 10*log10(en/er)

j=1;
A = 0;
B = 0;
C = 0;
la = 1;
lb = 1;
lc = 1;
ja=1;
jb=2;
jc=2;
s = [];

for i=1: length (I)
switch mod(i-1,3)
case 0
if A==I(i)
la=la+1;
if la==17
s(ja)=16*16+double(A);
la=1;
ja=j;
j=j+1;
end
else
s(ja)=16*la+double(A);
la=1;
ja=j;
j=j+1;
A=I(i);
end
case 1
if B==I(i)
lb=lb+1;
if lb==17
s(jb)=16*16+double(B);
lb=1;
jb=j;
j=j+1;
end
else
s(jb)=16*lb+double(B);
lb=1;
jb=j;
j=j+1;
B=I(i);
end
case 2
if C==I(i)
lc=lc+1;
if lc==17
s(jc)=16*16+double(C);
lc=1;
jc=j;
j=j+1;
end
else
s(jc)=16*lc+double(C);
lc=1;
jc=j;
j=j+1;
C=I(i);
end
end

end

disp('actual len=')
disp(length(s))

I0 = I(1:3:end);
I1 = I(2:3:end);
I2 = I(3:3:end);

i0 = find(diff(double(I0)));
str0=[];
for j=0:(length(i0)-1)
if j==0
t = I0(1:i0(j+1));
else
t = I0(i0(j)+1:i0(j+1));
end
str0(j+1,1) = length(t);
str0(j+1,2) = I0(i0(j+1));
end

i1 = find(diff(double(I1)));
str1=[];
for j=0:(length(i1)-1)
if j==0
t = I1(1:i1(j+1));
else
t = I1(i1(j)+1:i1(j+1));
end
str1(j+1,1) = length(t);
str1(j+1,2) = I1(i1(j+1));
end

i2 = find(diff(double(I2)));
str2=[];
for j=0:(length(i2)-1)
if j==0
t = I2(1:i2(j+1));
else
t = I2(i2(j)+1:i2(j+1));
end
str2(j+1,1) = length(t);
str2(j+1,2) = I2(i2(j+1));
end
disp('best len=')
disp(length(str0)+length(str1)+length(str2))
disp ('ratio =')
disp(length(s)/(length(str0)+length(str1)+length(str2)))

% i=20 ; [  bitand(s(1:i)',240)/16 bitand(s(1:i)',15) ]
% [ str0(1:10,:) str1(1:10,:) str2(1:10,:) ]

t = zeros(N,1);

t(1:3:end) = 0   +       [1:N/3];
t(2:3:end) = dt1 +       [1:N/3];
t(3:3:end) = dt1 + dt2 + [1:N/3];

figure(1)
subplot(2,1,1)
plot(t,V,t,x,'r',t,x,'.g')
subplot(2,1,2)
plot(t,V-x)

np = fix(sum(fix(s/16))/3)-6;

[fid, MESSAGE] = fopen('out.txt','W');
fprintf(fid,'   dw    %6d\n',np);
fprintf(fid,'   db    %3d\n',s);
fclose(fid);

% ASM replayer simulator

Vp=[];t=1;
ChA=0;ChB=0;ChC=0;

b=1;d=1;e=1;
hl=1;
hlp = np;
while (hlp>0)
b=b-1;
if b==0
b = fix(s(hl)/16);
bp = bitand(s(hl),15);
hl=hl+1;
end
d=d-1;
if  d==0
d = fix(s(hl)/16);
dp = bitand(s(hl),15);
hl=hl+1;
end
e=e-1;
if  e==0
e = fix(s(hl)/16);
ep = bitand(s(hl),15);
hl=hl+1;
end

ChA = bp;
Vp(t) = vol(ChA+1)+vol(ChB+1)+vol(ChC+1); t=t+1;
ChB = dp;
Vp(t) = vol(ChA+1)+vol(ChB+1)+vol(ChC+1); t=t+1;
ChC = ep;
Vp(t) = vol(ChA+1)+vol(ChB+1)+vol(ChC+1); t=t+1;
hlp = hlp-1;
end

figure(2)
plot(V)
hold on;
plot(Vp,'.r')
```

kewl, I think. 'ave to test it. but sound kewl. thanx Atrag

and I am preparing some others short encodedd wavs

looking forward 2 it!

I have done a good number of tests and I have found that the main problem
of this approach is the size of the data.

My RLE imlementation is very poor as my data (at least some channels) can
be very irregular.
Sometimes the result, wrt an 8bit input, is 1.6 times the input, i.e. it is worst than
not encoding at all, that results 1.5 the original lenght....

I have found also that if I trade off SNR with # transition the size can drastically
decrease to 0.9 times the original sive but the SND tends to reduce of 2-3dB.

I will post soon the new algorithm for trading size with SNR.

Sorry for the delay on putting the files online, it have been extremely busy days... as soon as I recover from my new-years hangover, I'll put 'em in our downloads section! With jltursan we discovered that the msdos pcmenc.exe could not work on all the PC
due to the need of a .dll from matlab.
Probably (also for increasing the processing speed) it would be required a new version
of pcmenc.exe developed directly in C (I compiled the matlab program).

Hi ARTRAG,

Is the ideas in your new PSG sample player something totally new or a refinement of what was discussed in http://www.msx.org/forumtopic663p105.html ? It sounds like its somewhat new in the sense you time it quite well and then using an adaptive way of changing samples. I'm really looking forward to hear the result.

This is a further development more that a refinement.
The approach is completely different from the one we developed (that one was a "sample by sample" encoding,
this one is a "sequence" encoding) but I have planned this solution (the fact that the sequence of PSG levels can
be found with a Viterbi algorithm) when we discovered the "spike" problem.

If you are interested send me an email and I'll send you some stuff to have a
taste of the whole thing.

Page 1/14
| 2 | 3 | 4 | 5 | 6