gcc garbage
Marco Radaelli

I compiled this simple code (small.c)

int main()
{
 unsigned char x;

 x = 10;
 
 return 0;
}

with 'gcc small.c -osmall', expecting to find the compiled code somewhere in the binary (opening it with a hex editor).

But surprisingly, when I scrolled to offset 0x400 (where the code in a PE should start) I found a lot of code, much more than I expected. Plus, after it, there's a 'space' of zeroed bytes, then

Quote:

-LIBGCCW32-EH-2-SJLJ-GTHR-MINGW32 w32_sharedptr->size == sizeof(W32_EH_SHARED) %s:%u: failed assertion `%s'
../../gcc/gcc/config/i386/w32-shared-ptr.c GetAtomNameA (atom, s, sizeof(s)) != 0

and again, more forward in the file

Quote:

AddAtomA › ExitProcess ¯ FindAtomA Ü GetAtomNameA ßSetUnhandledExceptionFilter ' __getmainargs < __p__environ > __p__fmode P __set_app_type y _cexit é _iob ^_onexit „_setmode abort atexit 0fflush 9fprintf ?free rmalloc signal P P P P P KERNEL32.dll P P P P P P P P P P P P P P P msvcrt.dll

Quote:

small.c _main  .text  5 .data .bss .file * þÿ gCRTglob.c .text Ð .data .bss .file 2 þÿ gCRTfmode.c .text Ð .data

Quote:

__gnu_exception_handler@4 ___mingw_CRTStartup _mainCRTStartup _WinMainCRTStartup ___do_sjlj_init __pei386_runtime_relocator __fpreset _initialized ___do_global_dtors ___do_global_ctors pseudo-reloc-list.c _w32_atom_suffix _

How do I get rid of them? I tried adding -static but nothing changes.
I also tried some target-specific switches (obtained with --target-help), but if I use, e.g., '--disable-auto-import' I get

gcc said:

cc1.exe: error: unrecognized command line option "-fdisable-auto-import"

Where does that 'f' come from?
I'm on Windows using gcc 3.4.2

Evert
Quote:

I found a lot of code, much more than I expected.

Yup.
MinGW needs to do a lot of trickery, for instance with commandline options (wildcard expansion), before it transfers control to your main. I don't think you can get around that.

Quote:

Where does that 'f' come from?

--longargumentname is equivalent to -flongargumentname.

gillius

Yes, the starting point for all programs in Win32 is a function with the same calling conventions as WinMain (for extremely pedantic purposes, this function need not be named WinMain, but must have the same form). In order for MinGW to simulate an "ANSI environment" -- meaning an "int main" environment -- it has to generate a stub function of the WinMain form and then convert the command line arguments into the argc/argv format before calling main. Also, compilers usually generate code to initialize and deinitialize their C/C++ runtime libraries whether or not you use them (unless you use some rarely used options to remove this), so you might be seeing the results of this process.
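To make the argc/argv conversion concrete, here is a rough sketch in C of the kind of splitting a startup stub has to do. It is not MinGW's actual code: split_args is a hypothetical name, and quoting and wildcard expansion are left out on purpose.

```c
#include <ctype.h>

/* Hypothetical helper, not MinGW's real startup code: split `cmdline`
   in place on whitespace and store up to `max_args` pointers in `argv`.
   Returns argc. Quoting and wildcard expansion are deliberately omitted. */
int split_args(char *cmdline, char **argv, int max_args)
{
    int argc = 0;
    char *p = cmdline;

    while (*p != '\0' && argc < max_args) {
        while (isspace((unsigned char)*p))
            p++;                      /* skip blanks before an argument */
        if (*p == '\0')
            break;
        argv[argc++] = p;             /* remember where the argument starts */
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
        if (*p != '\0')
            *p++ = '\0';              /* terminate the argument in place */
    }
    return argc;
}
```

A stub would call something like this on the raw command line and then pass the resulting count and array on to main.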

If you optimized your code, likely you would see absolutely nothing generated for your main function because you never use x, so the compiler can remove it entirely as the code is useless (it defines useless as any code that does not have side-effects).
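A small sketch of that point, using hypothetical function names: the first function has a store nobody observes, so an optimizing build may drop it; the second marks x as volatile, which makes every access count as a side effect. Compiling both with something like 'gcc -O2 -S' and reading the assembly should show the difference.

```c
/* Plain local: with optimization on, the compiler may drop the store
   entirely because nothing observes it. */
int plain_store(void)
{
    unsigned char x;

    x = 10;
    (void)x;          /* silences "set but not used" warnings */
    return 0;
}

/* volatile local: every access counts as a side effect, so the store
   has to be emitted even when optimizing. */
int volatile_store(void)
{
    volatile unsigned char x;

    x = 10;
    return x;
}
```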

Peter Hull
Quote:

I don't think you can get around that.

You can, I think, if you write your own crt0.o and use -fno-builtin, -nostdinc, -nostdlib and some other options which I can't remember offhand.

Pete

[edit] For starters, you should link with -s to strip symbols from the exe!

Marco Radaelli
Quote:

MinGW needs to do a lot of trickery, for instance with commandline options (wildcard expansion), before it transfers control to your main. I don't think you can get around that

Comparing the object file with the generated executable, all of that comes after the actual instructions, so I think it is all added when linking.

There are so many switches... none could help? :-/

Quote:

--longargumentname is equivalent to -flongargumentname.

So why is it an unrecognized option?

[edit] Pete, you posted while I was writing mine :)

-fno-builtin didn't work, the garbage is still there. I'm not using any function call, so how can gcc be so stupid as to link with any library at all?

Evert
Quote:

So why is it an unrecognized option?

Are you sure it exists? I can't find it in my manpage (different platform, but still)...

Marco Radaelli
Quote:

Are you sure it exists? I can't find it in my manpage (different platform, but still)...

I attached the help. That switch is in the last lines.

Peter Hull

Have a look at this for extreme examples
http://www.ubergeek.org/~breadbox/software/tiny/home.html

Pete

[edit] That option is a linker option, try -Wl,--disable-auto-import
(no space between ',' and '-' )
.. might work :-/

gillius

Peter is right, you can do it. We used GCC as part of a class assignment of writing a machine and OS emulator of some old architecture (I think it might have been VAX or PDP or something). So because we were using Solaris-based GCC to generate the code, we had to include our own crt0.o and set a bunch of flags (we couldn't use any C library calls since we were writing our own OS-kernel, so we had to call those functions manually in ASM).

Anyway, the point is that it is possible.

Now, if your only goal is to see generated ASM output from your program, you do know that there is an option to do this in GCC?

Marco Radaelli
gillius said:

Now, if your only goal is to see generated ASM output from your program, you do know that there is an option to do this in GCC?

Yes I do ;)

These days I'm playing with boot sector code (I don't know if you saw my other thread about some troubles I had). It would be nice if I could write simple programs in C, compile them and then load them directly into memory to be executed. This will need a custom I/O library to link against when I use I/O functions (any suggestion/link is welcome, of course :)).

I'd like to not use assembly for all this stuff, it will make things easier.
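As one possible starting point for such an I/O library, here is a sketch of a console writer in plain C. It assumes the PC colour text mode layout, where the screen buffer starts at physical address 0xB8000 and each cell is a character byte followed by an attribute byte. text_puts and its parameters are hypothetical names, and the buffer is passed in as a pointer so the routine can be tried against an ordinary array first.

```c
#include <stddef.h>
#include <stdint.h>

#define TEXT_COLS 80
#define TEXT_ROWS 25

/* Hypothetical minimal console writer. `cells` is the text buffer: on a PC
   in colour text mode it would be (volatile uint16_t *)0xB8000; here it is
   a parameter so the routine can be exercised on a normal array. */
size_t text_puts(volatile uint16_t *cells, size_t pos, const char *s,
                 uint8_t attr)
{
    while (*s != '\0' && pos < (size_t)TEXT_COLS * TEXT_ROWS) {
        /* low byte = character, high byte = attribute (0x07 = grey on black) */
        cells[pos++] = (uint16_t)(((uint16_t)attr << 8) | (unsigned char)*s++);
    }
    return pos;       /* index of the next free cell */
}
```

On real hardware this only works if the code can address 0xB8000 directly, which depends on the mode and segment setup; that part is not shown.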

[edit]

Pete said:

[edit] That option is a linker option, try -Wl,--disable-auto-import
(no space between ',' and '-' )

Yes! Thanks a lot! It didn't remove the garbage but now I can try all those switches :)

[edit2]

I'm surfing Gcc documentation. This is interesting:

gcc docs said:

-masm=dialect
Output asm instructions using selected dialect. Supported choices are intel or att (the default one).

Looks like we can use Intel syntax with gcc :D

I think I got it:
gcc small.c -osmall -nostartfiles -nodefaultlibs -nostdlib
They could be more than I need, but that's not a problem.

This instead is one: the above compile command gives this:

gcc small.c -osmall -nostartfiles -nodefaultlibs -nostdlib said:

C:\Mingw\bin\..\lib\gcc\mingw32\3.4.2\..\..\..\..\mingw32\bin\ld.exe: warning: c
annot find entry symbol _mainCRTStartup; defaulting to 00401000
C:\DOCUME~1\root\IMPOST~1\Temp/ccOsbaaa.o(.text+0x21):small.c: undefined reference to `_alloca'
C:\DOCUME~1\root\IMPOST~1\Temp/ccOsbaaa.o(.text+0x26):small.c: undefined reference to `__main'
collect2: ld returned 1 exit status

While the first is a warning and I think I could provide an _alloca function (I do not know how to atm, but I'll figure it out), what about the undefined '__main'?

[edit3]

I got rid of the undefined '_alloca' and '__main':

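void _alloca(void)
{
 /* stub: only here to satisfy the linker */
 return;
}

void __main(void)
{
 /* gcc already calls __main at the top of main, so calling main()
    from here would recurse forever; leave it empty */
 return;
}

int main(void)
{
 unsigned char x;

 x = 10;

 return 0;
}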

Am I on the right path?

Hrvoje Ban
MSDN said:

void *_alloca(size_t size);
Allocates memory on the stack.
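For illustration, a short sketch of stack allocation in use. It relies on GCC's __builtin_alloca rather than the _alloca name from the MSDN page, and sum_via_stack is a made-up function name.

```c
#include <stddef.h>
#include <string.h>

/* Copies `len` bytes into a scratch buffer taken from the current stack
   frame and returns their sum. __builtin_alloca is GCC's builtin form;
   MSVC-style runtimes spell it _alloca. The buffer goes away when the
   function returns, so it must never be returned or freed. */
unsigned sum_via_stack(const unsigned char *src, size_t len)
{
    unsigned char *tmp = __builtin_alloca(len ? len : 1);
    unsigned total = 0;
    size_t i;

    memcpy(tmp, src, len);            /* work on the stack copy */
    for (i = 0; i < len; i++)
        total += tmp[i];
    return total;
}
```

This also shows why an empty stub called _alloca only satisfies the linker: the real routine is expected to actually make room on the stack.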

GNU Mailing List said:

__main
On some systems, gcc inserts a call to __main() at the start of the
code that it generates for main(). __main() is called to execute
initialization code, in particular constructors for C++ objects with
global or namespace scope.

On systems which use the ELF object file format, there are special
sections in the object file for registering initialization code, and
gcc will use these. The program loader (and/or the C runtime start-up
code) will then ensure that the initialization code gets run, perhaps
even before main() is entered, so there's no need for gcc to call __main()
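A rough sketch of the job described above, with hypothetical names throughout: a table of initializer functions that gets walked once at startup. The real __main lives in libgcc and takes its list from the linker; the table, the order and the function name here are only illustrative.

```c
typedef void (*init_fn)(void);

static int init_calls;   /* records which initializers ran */

static void init_a(void) { init_calls += 1; }
static void init_b(void) { init_calls += 10; }

/* Hypothetical stand-in for the constructor list the linker collects;
   a null pointer marks the end. */
static init_fn const init_table[] = { init_a, init_b, 0 };

/* Sketch of what __main is for: call every registered initializer.
   Named run_initializers so it cannot clash with the real __main. */
int run_initializers(void)
{
    const init_fn *fn;

    for (fn = init_table; *fn != 0; fn++)
        (*fn)();
    return init_calls;
}
```

In a C program with no global constructors the list is empty, which is why an empty __main stub is enough when linking without the runtime.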

I was googling for _mainCRTStartup and someone said that you should link with:

-e _mainCRTStartup

Marco Radaelli

Thanks.

Actually I'm trying to make code that will work when directly loaded in RAM (with no OS at all, no libc & friends), in real or protected mode. So, the real issue is to put the right parameters (along with proper instructions) to make gcc produce the right code.

I resolved those undefined troubles, when I'll be home I'll post the code.

I have trouble with the output: using strip I removed all the headers, symbols etc., so now I have just the code (which I still need to understand how to relocate, but that's another problem) with a lot of 0x00 at the end (and some 0xFF).

While this could be worked around with a 'jmp $' (or 'jmp .' in as, if that's the same thing :)) it is not nice.

I just remembered another issue: I was fighting with gcc over the '--oformat' parameter, which wants a BFD format. I didn't understand how to use it; I want it to output a binary file. 'objdump --info' says 'binary' is supported, and it works fine with strip ('strip --output-target=binary').

Any idea?

[edit]

Here's the code that doesn't produce the warnings about undefined references

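/* The asm below is in Intel syntax, so this assumes building with -masm=intel */
asm(".psize 0\n"); /* Avoid page breaks in the listing file */

int __main(int argc, char *argv[]); /* declared before its first use */

void mainCRTStartup(void) { __main(0, 0); return; }
void _alloca(void) { return; }

int __main(int argc, char *argv[])
{
 asm(

 "mov ax, 0x0000\n"
 "mov al,'H'\n" /* character to print */

 "mov bh, 0x00\n" /* display page 0 */
 "mov bl, 0x07\n" /* foreground colour (graphics modes) */
 "mov ah, 0x0e\n" /* BIOS teletype output */

 "int 0x10\n" /* call the video BIOS */
 "jmp .\n" /* hang here */
 );

 return 0;
}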

Thread #489307. Printed from Allegro.cc