Fill memory with a 16-bit value

By aoineko

Paragon (1135)

06-03-2023, 09:18

What is the quickest way to fill memory with a 16-bit value?
We can't use the usual ldir trick directly here, and I can't think of anything other than a simple loop with:

ld (HL),C
inc HL
ld (HL),B
inc HL

Any other way?
Self-modifying code is interesting, but it is out of scope for my use case.


By Micha

Expert (110)

06-03-2023, 09:36

You could try something like:

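ld [OldSP],sp   ; save the real stack pointer
ld sp,hl        ; HL = address just past the end of the area to fill
push bc         ; each push writes 2 bytes and moves SP down by 2
push bc
push bc
push bc

ld sp,[OldSP]   ; restore the stack pointer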

but remember the pushes will fill memory from back to front.
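
For reference, here is a minimal sketch of the push approach as a complete routine. The labels (FillWords, OldSP) and the register convention are made up for illustration. It assumes the area is a multiple of 4 words, that an 8-bit chunk counter is enough, and that interrupts may stay disabled for the whole fill (it unconditionally re-enables them on exit).

```asm
; FillWords: fill memory with a 16-bit value using PUSH (hypothetical routine).
; In:  HL = address just past the END of the area to fill
;      DE = 16-bit value (E ends up at the lower address, D at the higher)
;      B  = number of 4-word (8-byte) chunks, 1..255 (0 means 256)
; Out: B = 0, interrupts enabled
FillWords:
    di                  ; an interrupt would push its return address into the data
    ld (OldSP),sp       ; save the real stack pointer
    ld sp,hl            ; SP = end of the area; PUSH writes downwards
FillWords_Loop:
    push de             ; 4 pushes = 8 bytes per iteration
    push de
    push de
    push de
    djnz FillWords_Loop
    ld sp,(OldSP)       ; restore the real stack pointer
    ei
    ret

OldSP:
    dw 0                ; scratch word, must live in RAM
```

The value is kept in DE instead of BC here only because djnz needs B as the loop counter.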

By bore

Master (182)

06-03-2023, 09:39

Why can't you use ldir?
Just place DE at HL+2 and go?

By Metalion

Paragon (1628)

06-03-2023, 10:46

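You can still use ldir:

ld hl,address
ld de,value
push hl          ; keep the start address
ld (hl),e        ; write the first word by hand
inc hl
ld (hl),d
inc hl
ex de,hl         ; DE = address+2 (destination)
pop hl           ; HL = address (source)
ld bc,length-2   ; length in bytes, minus the word already written
ldir             ; overlapping copy repeats the word through the area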

By aoineko

Paragon (1135)

06-03-2023, 12:51

Cool. Many thanks!

By santiontanon

Paragon (1831)

06-03-2023, 15:53

Micha's push solution is the fastest, btw. It's also useful when you need to fill memory with an 8-bit value, where you would normally use ldir: the fastest solution is to load the 8-bit value into both halves of a 16-bit register and use "push" repeatedly. This was a classic trick used in many games in the past to clear/write memory fast!
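
A rough sketch of that 8-bit variant, for a fixed 256-byte area. The labels (Fill256, SavedSP) are invented for the example, and it assumes interrupts can stay off during the fill.

```asm
; Fill256: fill 256 bytes with the 8-bit value in A using PUSH (hypothetical routine).
; In:  HL = address just past the END of the 256-byte area
;      A  = 8-bit fill value
; Out: B = 0, interrupts enabled
Fill256:
    ld d,a
    ld e,a              ; DE = the 8-bit value in both halves
    ld b,256/8          ; 32 iterations of 8 bytes each
    di                  ; SP is about to point into the data
    ld (SavedSP),sp     ; save the real stack pointer
    ld sp,hl
Fill256_Loop:
    push de
    push de
    push de
    push de
    djnz Fill256_Loop
    ld sp,(SavedSP)     ; restore the real stack pointer
    ei
    ret

SavedSP:
    dw 0                ; scratch word, must live in RAM
```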

By ro

Scribe (5059)

06-03-2023, 16:07

You know, I've seen that "trick" with the stack pointer before. And just like other tricks, it has a downside: the code becomes harder to read. Sure, if you're a very experienced coder it'll be just fine.

Coding the Z80 is beautiful, and you want to make fast and efficient code, but at the cost of letting dirt in. I am a big fan of clean code. Clean code isn't always the fastest, but it's very efficient when you read the lines back and want to understand what is happening. No need for extra comments explaining what happens.

So, using the SP trick to push stuff into RAM is not my fave way. Having said that, it's an old trick that works very well :)

2 cents.

By Grauw

Ascended (10820)

06-03-2023, 17:07

And remember, you can make LDIR significantly faster for large blocks by unrolling it into LDI instructions (the FastLDIR routine).

If the block size is not a fixed amount, so that you would want to use FastLDIR, its self-modifying code could be eliminated (at some extra set-up cost) like so:

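    xor a
    sub c                         ; A = -C (low byte of the count, negated)
    and 16 - 1                    ; A = (-C) mod 16
    add a,a                       ; times 2, since each LDI is 2 bytes long
    push hl                       ; save HL
    add a,FastLDIR_Loop & 0xFF    ; low byte of the entry address
    ld l,a
    ld a,0                        ; not "xor a", which would clear the carry
    adc a,FastLDIR_Loop >> 8      ; high byte plus carry
    ld h,a
    ex (sp),hl                    ; HL restored, entry address now on the stack
    ; ...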

By bore

Master (182)

06-03-2023, 18:25

I don't think the SP-variant is particularly unreadable compared to any other assembly code.
Just put it in a subroutine with some comments and you're fine.

The problem with it is that you are hogging SP, so any interrupt will use your destination as its stack if you don't disable interrupts.
Whether it is usable for you depends on whether you can accept interrupts being disabled for the duration of the fill.

OTOH you save something like 12 cycles per byte by using push instead of ldi, so you can probably restore SP and enable interrupts after every 8 bytes and still save cycles with the push method.
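
Something along these lines, as an untested sketch. The labels (FillWordsEI, RealSP) are invented for the example, and it assumes the interrupt handler preserves all registers it uses.

```asm
; FillWordsEI: PUSH fill that lets interrupts in after every 8 bytes (hypothetical routine).
; In:  HL = address just past the END of the area to fill
;      DE = 16-bit value
;      B  = number of 8-byte chunks, 1..255 (0 means 256)
; Out: HL = start of the filled area, B = 0, interrupts enabled
FillWordsEI:
    ld (RealSP),sp      ; the real SP is the same on every pass
FillWordsEI_Loop:
    di
    ld sp,hl            ; SP = end of the current chunk
    push de
    push de
    push de
    push de             ; 8 bytes written, SP = HL - 8
    ld hl,0
    add hl,sp           ; HL = start of this chunk = end of the next one
    ld sp,(RealSP)      ; back on the real stack
    ei                  ; takes effect after the next instruction (djnz)
    djnz FillWordsEI_Loop
    ret

RealSP:
    dw 0                ; scratch word, must live in RAM
```

By my count that loop is 112 T-states per 8 bytes on a plain Z80 (ignoring MSX wait states), so about 14 per byte, against 16 for ldi and 21 for ldir. Pending interrupts get serviced right after the djnz, when SP points at the real stack again.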

By Prodatron

Paragon (1857)

06-03-2023, 19:04

@Grauw, very nice! I just wonder why you need a DI/EI here?

ld (FastLDIR_jumpOffset),a

By theNestruo

Champion (429)

06-03-2023, 19:44

Prodatron wrote:

@Grauw, very nice! I just wonder why you need a DI/EI here?
ld (FastLDIR_jumpOffset),a

This is a wild guess (it is not my code), but: ei enables interrupts only after the next instruction, so the code within the interrupts-disabled section is actually:

    ld (FastLDIR_jumpOffset),a
    jr nz,$  ; self modifying code

If interrupts were enabled, another call to this routine (e.g. from the interrupt handler) with a different value would overwrite the offset just written, making the first call copy an incorrect number of bytes.
