Fill memory with a 16-bit value

Page 1/2

By aoineko

Paragon (1144)

06-03-2023, 09:18

What is the quickest way to fill memory with a 16-bit value?
We can't use the usual ldir trick as-is here, and I can't think of anything other than a simple loop with:

ld (HL),C
inc HL
ld (HL),B
inc HL

Is there any other way?
Self-modifying code is interesting, but it is out of scope for my use case.
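
For reference, the simple loop described above could be wrapped into a complete routine roughly as follows. This is only a sketch: the label `Fill16_Loop` and the register interface (word count in `de`) are assumptions made for illustration, not something specified in the post.

```asm
; Hypothetical baseline routine (name and interface are assumptions).
; in:  hl = start address
;      bc = 16-bit value (c = low byte, b = high byte)
;      de = number of 16-bit words to write (must be > 0)
; out: hl = address just past the filled area; a, de modified
Fill16_Loop:
    ld (hl),c           ; low byte first (little-endian order)
    inc hl
    ld (hl),b           ; then the high byte
    inc hl
    dec de              ; a 16-bit dec does not update the flags...
    ld a,d
    or e                ; ...so test de for zero explicitly
    jr nz,Fill16_Loop
    ret
```

The per-word loop overhead (`dec de`, `ld a,d`, `or e`, `jr nz`) is what the faster variants discussed in the replies try to avoid.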

By Micha

Expert (111)

06-03-2023, 09:36

You could try something like:

ld [OldSP],sp
ld sp,hl
push bc
push bc
push bc
push bc

ld sp,[OldSP]

But remember that the pushes fill memory from back to front, so hl must point just past the end of the area to fill.
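
A fuller version of this idea might look like the sketch below. The routine name, the `Fill16_OldSP` variable, and the register interface are assumptions for illustration. It also assumes interrupts were enabled on entry and that it is acceptable to keep them disabled for the whole fill. It uses `(nn)` parentheses rather than brackets; adjust to your assembler's syntax.

```asm
; Hypothetical routine (names and interface are assumptions).
; in:  hl = address one past the END of the area to fill
;      bc = 16-bit value (c = low byte, b = high byte)
;      a  = number of 8-byte groups (1..255; 0 is treated as 256)
Fill16_Push:
    di                      ; an interrupt would push onto the area being filled
    ld (Fill16_OldSP),sp    ; save the real stack pointer
    ld sp,hl                ; sp = end of the area
Fill16_PushLoop:
    push bc                 ; stores b at sp-1, c at sp-2, then sp = sp-2
    push bc
    push bc
    push bc                 ; 8 bytes written per iteration
    dec a
    jr nz,Fill16_PushLoop
    ld sp,(Fill16_OldSP)    ; restore the real stack pointer
    ei                      ; assumption: interrupts were enabled on entry
    ret
Fill16_OldSP:
    dw 0                    ; must be located in RAM
```

Because `push` stores the high byte at the higher address, the resulting byte order in memory is the same as with the `ld (hl),c` / `ld (hl),b` loop.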

By bore

Master (182)

06-03-2023, 09:39

Why can't you use ldir?
Just place DE at HL+2 and go?

By Metalion

Paragon (1629)

06-03-2023, 10:46

You can still use ldir:

ld hl,address
ld de,value
push hl
ld (hl),e
inc hl
ld (hl),d
inc hl
ex de,hl
pop hl
ld bc,length-2
ldir

By aoineko

Paragon (1144)

06-03-2023, 12:51

Cool. Many thanks!

By santiontanon

Paragon (1833)

06-03-2023, 15:53

Micha's push solution is the fastest, by the way. It is also useful when you need to fill memory with an 8-bit value, where you would normally use ldir: the fastest solution is to load the 8-bit value into both halves of a 16-bit register and use "push" repeatedly. This was a classic trick used in many games in the past to clear or write memory fast!
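
As an illustration of the 8-bit case, a sketch could look like this. The names `Fill8_Push` and `Fill8_OldSP` and the register interface are assumptions, and the routine assumes interrupts were enabled on entry.

```asm
; Hypothetical routine (names and interface are assumptions).
; in:  hl = address one past the END of the area to fill
;      a  = 8-bit fill value
;      b  = number of 16-byte groups (1..255; 0 is treated as 256)
Fill8_Push:
    ld d,a
    ld e,a                  ; de = fill value in both halves
    di                      ; sp is about to point into the data area
    ld (Fill8_OldSP),sp
    ld sp,hl
Fill8_PushLoop:
    push de                 ; 8 pushes = 16 bytes per iteration
    push de
    push de
    push de
    push de
    push de
    push de
    push de
    djnz Fill8_PushLoop
    ld sp,(Fill8_OldSP)
    ei                      ; assumption: interrupts were enabled on entry
    ret
Fill8_OldSP:
    dw 0                    ; must be located in RAM
```

Unrolling more pushes per iteration reduces the relative cost of `djnz`, at the price of a coarser size granularity.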

By ro

Scribe (5064)

06-03-2023, 16:07

You know, I've seen that "trick" with the stack pointer before. And just like other tricks, it has a downside: the code becomes harder to read. Sure, if you're a very experienced coder it'll be just fine.

Coding the Z80 is beautiful, and you want to write fast and efficient code, but at the cost of letting dirt in. I am a big fan of clean code. Clean code isn't always the fastest, but it is super efficient when you read the lines back and want to understand what is happening. No need for extra comments explaining what happens.

So, using the SP trick to push stuff into RAM is not my favourite way. Having said that, it's an old trick that works very well :)

2 cents.

By Grauw

Ascended (10823)

06-03-2023, 17:07

And remember, you can make LDIR significantly faster for large blocks like so.

If the block size is not a fixed amount, so that you would want to use FastLDIR, its self-modifying code could be eliminated (at some extra set-up cost) like so:

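    xor a
    sub c                       ; a = -c (negated low byte of the count)
    and 16 - 1                  ; a = (-c) mod 16
    add a,a                     ; times 2
    push hl                     ; save hl on the stack
    add a,FastLDIR_Loop & 0xFF  ; low byte of FastLDIR_Loop + offset
    ld l,a
    ld a,0                      ; ld leaves the carry flag intact
    adc a,FastLDIR_Loop >> 8    ; high byte, including the carry
    ld h,a                      ; hl = FastLDIR_Loop + offset
    ex (sp),hl                  ; hl restored; computed address now on the stack
    ; ...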
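
To show the general idea of unrolling applied to the 16-bit fill from this thread, here is a simplified sketch. It is not Grauw's FastLDIR routine and does not complete the fragment above: it has no computed entry point, so it assumes that length - 2 is a non-zero multiple of 16. The name `Fill16_UnrolledLDI` and the register interface are assumptions.

```asm
; Hypothetical routine (name and interface are assumptions).
; Assumes (length - 2) is a non-zero multiple of 16.
; in:  hl = start address, de = 16-bit value, bc = length in bytes
Fill16_UnrolledLDI:
    ld (hl),e           ; seed the first word: low byte
    inc hl
    ld (hl),d           ; high byte
    dec hl              ; hl = source = start address
    ld d,h
    ld e,l
    inc de
    inc de              ; de = destination = start address + 2
    dec bc
    dec bc              ; bc = length - 2 bytes left to copy
Fill16_LDILoop:
    ldi                 ; 16 x ldi per pass
    ldi
    ldi
    ldi
    ldi
    ldi
    ldi
    ldi
    ldi
    ldi
    ldi
    ldi
    ldi
    ldi
    ldi
    ldi
    jp pe,Fill16_LDILoop ; ldi leaves P/V set while bc is not 0
    ret
```

Each `ldi` copies a byte from two positions earlier, so the seeded word propagates through the whole block, exactly as with the overlapping `ldir`.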

By bore

Master (182)

06-03-2023, 18:25

I don't think the SP-variant is particularly unreadable compared to any other assembly code.
Just put it in a subroutine with some comments and you're fine.

The problem with it is that you are hogging SP, so if you don't disable interrupts, any interrupt will use your destination as its stack.
Whether it is usable for you depends on whether you can accept interrupts being disabled for the duration of the fill.

OTOH you save something like 12 cycles per byte by using push instead of ldi so you can probably restore SP and enable interrupts every 8th byte and still save cycles by using the push-method.
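
The chunked approach described above might be sketched as follows. All names and the register interface are assumptions, and no cycle counts are claimed here: with only 8 bytes per chunk, the bookkeeping eats a large part of the gain, so a larger chunk size may be a better trade-off in practice.

```asm
; Hypothetical routine (names and interface are assumptions).
; in:  hl = address one past the END of the area to fill
;      de = 16-bit value
;      b  = number of 8-byte chunks (1..255; 0 is treated as 256)
; Interrupts are enabled on exit.
Fill16_PushChunks:
    ld (Fill16_SavedSP),sp  ; sp is still the real stack here
Fill16_Chunk:
    di                      ; no interrupts while sp points into the data
    ld sp,hl                ; sp = top of the not-yet-filled part
    push de
    push de
    push de
    push de                 ; 8 bytes written
    ld hl,0
    add hl,sp               ; hl = new top of the not-yet-filled part
    ld sp,(Fill16_SavedSP)  ; back to the real stack
    ei                      ; takes effect after the next instruction,
    djnz Fill16_Chunk       ; so pending interrupts are served before the di
    ret
Fill16_SavedSP:
    dw 0                    ; must be located in RAM
```

Since `ei` only takes effect after the following instruction, an interrupt can be accepted once `djnz` has executed, at a point where SP holds the real stack again.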

By Prodatron

Paragon (1857)

06-03-2023, 19:04

@Grauw, very nice! I just wonder why you need a DI/EI here?

ld (FastLDIR_jumpOffset),a

By theNestruo

Champion (431)

06-03-2023, 19:44

Prodatron wrote:

@Grauw, very nice! I just wonder why you need a DI/EI here?
ld (FastLDIR_jumpOffset),a

This is a wild guess (it is not my code) but: ei enables interrupts after the next instruction, so the code within the interrupts disabled section is actually:

    ld (FastLDIR_jumpOffset),a
    jr nz,$  ; self modifying code

If interrupts were enabled, another call to this routine (e.g. from the interrupt handler) with a different value could overwrite the offset just written, making the first call copy an incorrect number of bytes.
