Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I am re-learning assembler which I used on very old MS-DOS machines!!!

This is my understanding of what that function should look like. It compiles but crashes with a SIGSEGV when trying to put 0xffffffff in ecx.

The code is run in a VM with 32-bit Debian 9. Any help would be appreciated.

    int getStringLength(const char *pStr){

        int len = 0;
        char *Ptr = pStr;

        __asm__  (
            "movl %1, %%edi\n"
            "xor %%al, %%al\n"
            "movl 0xffffffff, %%ecx\n"
            "repne scasb\n"
            "subl %%ecx,%%eax\n"
            "movl %%eax,%0"
            :"=r" (len)     /*Output*/
            :"r"(len)       /*Input*/
            :"%eax"         /*Clobbered register*/
        );

        return len;
    }

1 Answer

The problem with using GCC's inline asm to learn assembly is that you spend half your time learning how gcc's inline assembly works instead of actually learning assembly. For example, here's how I might write this same code:

    #include <stdio.h>

    int getStringLength(const char *pStr){

        int len;

        __asm__  (
            "repne scasb\n"
            "not %%ecx\n"
            "dec %%ecx"
            :"=c" (len), "+D"(pStr)     /*Outputs*/
            :"c"(-1), "a"(0)            /*Inputs*/
            /* tell the compiler we read the memory pointed to by pStr,
               with a dummy input so we don't need a "memory" clobber */
            , "m" (*(const struct {char a; char x[];} *) pStr)

        );

        return len;
    }

See the compiler's asm output on the Godbolt compiler explorer. The dummy memory input is the tricky part: see the discussion in the comments and on the gcc mailing list for the most efficient way to do this that is still safe.

Comparing this with your example

  1. I don't initialize len, since the asm declares it as an output (=c).
  2. There's no need to copy pStr since it is a local variable. By spec, we're already allowed to change it (although since it is a pointer to const, we shouldn't modify the data it points to).
  3. There's no reason to tell the inline asm to put Ptr in eax, only to have your asm move it to edi. I just put the value in edi in the first place. Note that since the value in edi is changing, we can't just declare it as an 'input' (by spec, inline asm must not change the value of inputs). Changing it to a read/write output solves this problem.
  4. There's no need to have the asm zero eax, since you can have the constraints do it for you. As a side benefit, gcc will 'know' that it has 0 in the eax register, and (in optimized builds) it can re-use it (think: checking the length of 2 strings).
  5. I can use the constraints to initialize ecx too. As mentioned, changing the value of inputs is not allowed. But since I define ecx as an output, gcc already knows that I'm changing it.
  6. Since the contents of ecx, eax and edi are all explicitly specified, there's no need to clobber anything anymore.

All of which makes for (slightly) shorter and more efficient code.
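For contrast, here is a sketch of a smaller patch to the question's version. It keeps the question's approach of loading eax and ecx inside the asm and listing them as clobbers, but lets a "+D" constraint place the pointer in edi. The name `getStringLengthPatched` is made up for this illustration. The SIGSEGV comes from AT&T syntax: `movl 0xffffffff, %%ecx` without a `$` is a load from memory address 0xffffffff, not an immediate. The sketch assumes gcc on x86 (32-bit or x86-64) and uses a "memory" clobber instead of the dummy memory input.

```c
/* Hypothetical name: a patched copy of the question's function.
   Assumes gcc (or clang) targeting x86 or x86-64. */
int getStringLengthPatched(const char *pStr) {
    int len;

    __asm__ (
        "xorl %%eax, %%eax\n\t"        /* al = 0, the byte scasb searches for */
        "movl $0xffffffff, %%ecx\n\t"  /* '$' makes this an immediate value */
        "repne scasb\n\t"              /* scan forward until the NUL is found */
        "notl %%ecx\n\t"               /* ecx = bytes scanned, NUL included */
        "decl %%ecx\n\t"               /* drop the NUL from the count */
        "movl %%ecx, %0"
        : "=r"(len), "+D"(pStr)        /* edi is read and modified */
        :
        : "eax", "ecx", "cc", "memory" /* registers, flags and memory we touch */
    );

    return len;
}
```

It works, but the compiler no longer knows that eax holds 0 or that ecx holds the result, which is what the constraint-based version above gains.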

But this is ridiculous. How the heck (can I say 'heck' on SO?) are you supposed to know all that?

If the goal is to learn asm, using inline asm is not your best approach (in fact I'd say that inline asm is a bad idea in most cases). I'd recommend that you declare getStringLength as an extern, write it completely in asm, and then link it with your C code.

That way you learn about parameter passing, return values, preserving registers (along with learning which registers must be preserved and which you can safely use as scratch), stack frames, how to link asm with C, etc, etc, etc. All of which is more useful to know than this gobbledygook for inline asm.
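A rough sketch of that approach follows. To keep it to one file, the assembly is placed in a file-scope `__asm__` block; normally it would live in its own `.s` file that is assembled and linked separately. The sketch assumes gcc on Linux (ELF) and the usual calling conventions: cdecl on 32-bit x86 (argument on the stack, edi callee-saved) and System V on x86-64 (argument in rdi). Treat it as an illustration, not a tuned strlen.

```c
/* The C side only sees this prototype; the body is written in assembly. */
extern int getStringLength(const char *pStr);

#if defined(__i386__)
/* 32-bit cdecl: pStr is on the stack, the result goes in eax. */
__asm__(
    ".pushsection .text\n"
    ".globl getStringLength\n"
    ".type getStringLength, @function\n"
    "getStringLength:\n"
    "    pushl %edi\n"            /* edi must be preserved for the caller */
    "    movl 8(%esp), %edi\n"    /* pStr: past the saved edi and return address */
    "    xorl %eax, %eax\n"       /* al = 0, the byte to search for */
    "    movl $-1, %ecx\n"        /* maximum count */
    "    repne scasb\n"           /* scan until the NUL is found */
    "    notl %ecx\n"             /* ecx = bytes scanned, NUL included */
    "    decl %ecx\n"             /* drop the NUL from the count */
    "    movl %ecx, %eax\n"       /* return value */
    "    popl %edi\n"
    "    ret\n"
    ".size getStringLength, .-getStringLength\n"
    ".popsection\n");
#elif defined(__x86_64__)
/* x86-64 System V: pStr arrives in rdi, which is a scratch register. */
__asm__(
    ".pushsection .text\n"
    ".globl getStringLength\n"
    ".type getStringLength, @function\n"
    "getStringLength:\n"
    "    xorl %eax, %eax\n"       /* al = 0, the byte to search for */
    "    movq $-1, %rcx\n"        /* maximum count */
    "    repne scasb\n"           /* scan until the NUL is found */
    "    notq %rcx\n"             /* rcx = bytes scanned, NUL included */
    "    decq %rcx\n"             /* drop the NUL from the count */
    "    movl %ecx, %eax\n"       /* return the low 32 bits as int */
    "    ret\n"
    ".size getStringLength, .-getStringLength\n"
    ".popsection\n");
#else
#error "This sketch only covers x86 and x86-64."
#endif
```

Calling `getStringLength("hello")` from C then goes through the normal calling convention, and the 32-bit variant shows the register-preservation and stack-offset details that inline asm hides.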

