performance

By rolandve

Champion (372)


28-01-2023, 09:31

This is, I suspect, a micro, micro performance question.
Which is faster: storing data as a 'char' and casting it to 'int' when required, or accepting that an 'int' is twice the size of a 'char' and passing that int around?
If a cast requires additional CPU instructions, that might be a good reason to accept the added memory usage.
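
Roughly the two options in C (the names here are made up, purely to illustrate the question):

    /* Option 1: store as char and widen/cast whenever a 16-bit value is needed.
       Option 2: store as int and pass it around as-is.                          */
    #include <stdio.h>

    static unsigned char count_as_char = 200;   /* 1 byte of storage                    */
    static unsigned int  count_as_int  = 200;   /* 2 bytes on a 16-bit-int target (Z80) */

    static void use_value(unsigned int v)       /* callee wants an int-sized value */
    {
        printf("%u\n", v);
    }

    int main(void)
    {
        use_value((unsigned int)count_as_char); /* cast/widening happens here */
        use_value(count_as_int);                /* no conversion needed       */
        return 0;
    }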

In old articles I read that Xelasoft once made an optimiser for the ASCII C compiler. Where can I find it? It's referred to, but you can't download it anywhere. I understood that ASCII C only uses 8088 assembly, while the Z80 has more intelligent solutions for some 8088 instruction combinations.


By st1mpy

Paladin (947)


28-01-2023, 11:52

Is it supposed to be 8080?

By ducasp

Paladin (712)


28-01-2023, 13:15

A good, optimizing C compiler / linker won't add instructions for a type cast; that would make no sense. You are not converting the data, you are just telling the compiler to treat the memory in that area as a given type instead of its declared type. So if you cast an array of 2 chars to int, it will work the same as if it had been declared as int.
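
A minimal sketch of that idea (names invented; it uses memcpy for the reinterpretation to keep it well-defined C, sizes the array with sizeof so it also runs on hosts where int is wider than the Z80's 16 bits, and the result depends on the target being little-endian like the Z80):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        unsigned char raw[sizeof(unsigned int)] = { 0x34, 0x12 };  /* low byte first, rest 0 */
        unsigned int value;

        memcpy(&value, raw, sizeof value);  /* reinterpret the bytes, no data conversion */
        printf("0x%X\n", value);            /* prints 0x1234 on a little-endian target   */
        return 0;
    }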

By rolandve

Champion (372)


28-01-2023, 16:03

ducasp wrote:

A good, optimizing C compiler / linker won't add instructions for a type cast; that would make no sense. You are not converting the data, you are just telling the compiler to treat the memory in that area as a given type instead of its declared type. So if you cast an array of 2 chars to int, it will work the same as if it had been declared as int.

So when the value I work with is cast to int because the receiving function expects an int, the processor doesn't also clear another register (HL?) to satisfy the 16-bit requirement, regardless of the fact that A alone is enough?

By Grauw

Ascended (10821)


28-01-2023, 17:03

I don’t know if I understand the original question 100% correctly, but:

Casts are not (always) free. For example, a conversion between signed primitive types of different bit widths requires sign extension.

E.g. when loading a value from a char type global field into an int type local variable, this is what that would look like:

    ld a,(my_value)   ; 14 cycles
    ld l,a            ;  5 cycles
    add a,a           ;  5 cycles     sign
    sbc a,a           ;  5 cycles   extension
    ld h,a            ;  5 cycles

Note that this is efficient hand-written assembly; the C compiler may generate something less efficient.

Whereas a value stored in a global with a matching int type will need just one instruction:

    ld hl,(my_value)  ; 17 cycles

Also for unsigned types, it will probably still be faster to match the types. E.g. when loading a value from an unsigned char type global field into an unsigned int type local variable:

    ld a,(my_value)   ; 14 cycles
    ld l,a            ;  5 cycles
    ld h,0            ;  8 cycles

Whereas a value stored in a global with a matching unsigned int type will need just one instruction:

    ld hl,(my_value)  ; 17 cycles

The casts that are probably free are casts from signed to unsigned types of the same bit width, and possibly truncations too (casts from 16-bit to 8-bit values), depending on how freely the compiler allocates registers.
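
For reference, the C source behind the global-variable snippets above looks roughly like this (only the my_value name appears in the assembly; the surrounding declarations are assumed):

    signed char   my_value_c;    /* char global: load + sign extension      */
    int           my_value_i;    /* int global: single 16-bit load          */
    unsigned char my_value_uc;   /* unsigned char global: load + ld h,0     */
    unsigned int  my_value_ui;   /* unsigned int global: single 16-bit load */

    void example(void)
    {
        int          a = my_value_c;    /* sign-extended on load */
        int          b = my_value_i;    /* direct 16-bit load    */
        unsigned int c = my_value_uc;   /* zero-extended on load */
        unsigned int d = my_value_ui;   /* direct 16-bit load    */
        (void)a; (void)b; (void)c; (void)d;
    }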

A case where the unsigned char variant could be faster is when using indirect memory access, that is, dereferencing a pointer to a value rather than reading a global variable. In those cases the assembly instructions that load from memory can only transfer 8 bits at a time, so a 16-bit value needs two of them.

For example loading unsigned int from an unsigned char*:

    ld e,(hl)         ; 8 cycles
    ld d,0            ; 8 cycles

Vs loading unsigned int from an unsigned int*:

    ld e,(hl)         ; 8 cycles
    inc hl            ; 7 cycles
    ld d,(hl)         ; 8 cycles
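
In C terms the difference between those two loads is only the pointee type (the function names here are assumed):

    unsigned int load_via_char_ptr(const unsigned char *p)
    {
        return *p;      /* one byte read, high byte becomes 0      */
    }

    unsigned int load_via_int_ptr(const unsigned int *p)
    {
        return *p;      /* two byte reads through the same pointer */
    }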

Anyway, a lot also depends on the peculiarities of the compiler, so the best way to know for sure what happens is to generate an assembly listing file for the compilation output and inspect it manually.

rolandve wrote:

So when the value I work with is cast to int because the receiving function expects an int, the processor doesn't also clear another register (HL?) to satisfy the 16-bit requirement, regardless of the fact that A alone is enough?

If you pass a char to a function that expects an int, the compiler will move the value to a 16-bit register and sign-extend it if needed before making the call, because it is part of the function signature that the argument is received in hl (or on the stack, depending on the ABI of the compiler).
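
A small sketch of that call-site widening (names invented; the exact register or stack placement depends on the compiler):

    #include <stdio.h>

    static void takes_int(int value)    /* per the signature, the argument arrives as a full int */
    {
        printf("%d\n", value);
    }

    int main(void)
    {
        signed char small = -5;
        takes_int(small);               /* the compiler sign-extends to 16 bits at the call site,
                                           then passes the value the way its ABI dictates         */
        return 0;
    }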