Allegro.cc - Online Community

Allegro.cc Forums » The Depot » LibGC: a Garbage Collection Lib for C++

This thread is locked; no one can reply to it. rss feed Print
LibGC: a Garbage Collection Lib for C++
axilmar
Member #1,204
April 2001

I've made a small garbage collection lib for C++. I will be more than happy if anyone tests it and reports to me if it works (i've tried some tests, but I have to be sure). It is very easy to use. It uses the mark and sweep algorithm (it does not use reference counting).

You can find it here. ;D

gillius
Member #119
April 2000

I took a good look through all of the code. I at least skimmed over all of it if not read it in depth. The parts I'm fuzzy on is how roots work, marking after objects are deleted, and the business with trying to magically detect contained GCPointers in GCObjects.

One thing you mentioned is that reference counting is slow. I'm not sure I totally agree with that. You are managing all of these linked lists, and in addition to that running mark and sweep operations, which can cause page faults. I do agree that copying GCPointers are faster than reference counted pointers (the implementation I am thinking of when I say reference counted pointers is boost::shared_ptr). I believe that allocations of GCObjects is much slower than allocation with ref counting.

I believe that a lot of times, pointers are not copied that much outside of the objects that contain them. I suppose what I am saying is that I'm not sure the quick copy of the GCObject beats the quick creation of the reference counted pointer, because copies of managed pointers is not overly often, and using an interlocked increment routine is not as slow as you might think.

Also true garbage collection has shown in some cases to be faster in mutli-thread environments (than explicit management), but I belive this breaks down when the GC heap is greater than the size of memory allocated to the process. In this case, traversing all of the objects will cause a lot of page faulting. In reference counting, destruction is deterministic, and more often occurs to objects as they reside in memory, but this is simply a guess. Regardless, I'd expect the full GC to be more suceptible to collections pauses than reference counting.

Of course, the GC library you have handles cyclic references. If one has circular references in their design, this library of course beats ref counting hands down.

Another comment: in the operator -> for GCPointer, you throw if it is a NULL pointer. I disagree with this, at least when it is not customizeable. Why do I say this? Throwing an exception makes it harder to debug -- the stack trace is handled and the debugger didn't pick it up, unless you've happened to customize your debugger to stop on GCNullPointerException for that run of the application. In a release mode, throwing an exception can be a better thing as it will call destructors on the way out were it not caught.

Another thing is that it increases overhead by adding a check before every access of the pointer, when this check is much better and faster done in hardware. In Windows, structured exceptions are converted to C++ exceptions in MSVC, not sure about GCC. In UNIX, you can catch SIGSEV, although I'm not sure what you can do to the thread when you get that signal, but if you are just recording the error, you can do it there.

That's not to say that the exception throwing is pointless, but perhaps it is better done with a preprocessor parameter, so that one can use SEH or SEH->C++ conversion if their environment supports it if they want to avoid the overhead of the check, or they can disable it if that makes it easier to work with the debugger.

Gillius
Gillius's Programming -- http://gillius.org/

axilmar
Member #1,204
April 2001

Quote:

I took a good look through all of the code

Very many thanks, Gillius.

Quote:

The parts I'm fuzzy on is how roots work, marking after objects are deleted, and the business with trying to magically detect contained GCPointers in GCObjects.

About the root set: the root set is the set of pointers that are global or on the stack and that are not members of any object. Scanning begins from this set. Every object not reachable from this set is discarded.

About marking: there is a flag on each object that indicates if it belongs in the current garbage collection cycle, and also there is the global garbage collection cycle flag. When the tree of objects is scanned, the global flag is copied to the flag of each object, and then all objects with the wrong value in the flag are deleted.

About detecting pointers that belong to objects: when a GCObject is created, the memory region that is allocated is kept in global variables. Then, the GCPointer looks if it is within that memory region, and if it does, it registers itself with the allocated GCObject or root set.

Quote:

One thing you mentioned is that reference counting is slow.

I've read a study somewhere that ref counting is generally slower than garbage collection (here is a quick link I googled out). Of course my solution could be worse than reference counting. I've done the following test: I first create a certain number of objects on the heap, all non releasable since they are referenced on the stack, then run a manual garbage collection. The results were (seconds that it took for garbage collection to finish):

10000 objects: 1 second
100000 objects: 10 seconds
1000000 objects: 23 seconds

On my Athlon 650 with SDRAM.

Of course I still can't say if this is faster or slower than ref counting, but ref counting has serious problems that far outweight any benefits.

LibGC does a list insertion at the end of the list each time a GCObject is created and each time a GCPointer is created. As long as not many objects are created continuously, it will not be a problem. Fortunately, and unlike Java, C++ permits stack allocated objects, so there is no need to constantly allocate very small objects that represent primitives.

Quote:

I believe that a lot of times, pointers are not copied that much outside of the objects that contain them. I suppose what I am saying is that I'm not sure the quick copy of the GCObject beats the quick creation of the reference counted pointer, because copies of managed pointers is not overly often, and using an interlocked increment routine is not as slow as you might think

In a single threaded environment, it is better to use ref counting only for member pointers. But on multithreaded environments, it is a problem: a thread other than the current one might delete an object or objects because of using shared ptrs because the current thread did not use shared ptrs on the stack. But if you use shared ptrs on the stack, then the number of assignments is extremely large, and then reference counting is certainly slower.

Quote:

Another comment: in the operator -> for GCPointer, you throw if it is a NULL pointer. I disagree with this, at least when it is not customizeable.

Feel free to change it as it suits your needs!

Quote:

Another thing is that it increases overhead by adding a check before every access of the pointer

Sure it does, but are you sure that the pointed object is not in some not swapped out memory? A great source of trouble comes from accessing memory that it already has been deleted but it has not been swapped out. There is not gonna be an exception then, bringing chaos to the system.

Quote:

That's not to say that the exception throwing is pointless, but perhaps it is better done with a preprocessor parameter

Since there are only two source code files to use, the user of this small lib can make any changes he/she likes. I feel that there is no point doing any further work from my part. My intent is to use this mechanism for doing a good gui lib. I've run in many troubles using reference counting anyway.

Thanks for the discussion. It was really valuable.

gillius
Member #119
April 2000

For some reason, those times sound slow to me when I am thinking about garbage collection in Java. I work on embedded systems running Java, and garbage collection is important to us.

Modern Java uses generational garbage collection, which works on the assumption that objects typically die young or die very old. Meaning that recently created objects are very likely to die, and objects that are old are likely to live. This gets around most of the problem of temporary objects used in the method.

I won't go into more detail about generational garbage collection in case you already know about it, but you can research it or ask me if you want to know more.

But the point is that for our environment, we've set the youngest generation size to be 3MB. Our system is a 300mhz Geode (x86 clone) with 128 megs of SDRAM. Minor garbage collects happen constantly, so this handles 3 megs of data, most of it which is thrown out. I estimate 100,000 objects (average object size of 30, a guess) in this 3MB range, and our collections happen fast enough we cannot discern them on our CPU graphs, so they are taking probably 100ms or less.

I'm not sure how our JVM does this, but it must as we see around 3 megs of memory freed up and the system is doing all of its work.

In your example, 100k objects took 10 seconds to collect. 100k objects sound like a lot, but 30 bytes per object doesn't sound all that crazy, either (esp in Java when you consider primitive objects), and 3MBs of memory is not all that much memory. 10 seconds sounds like a lot to collect 3 megs of memory.

What I don't get though, is why your test takes so long. If you are making x number of objects, then dumping ALL references, then running a GC, then the graph of traversable nodes has a size of 0 meaning it should take you 0 time to determine that all objects are dead, and all you have to do is delete the objects. Is that correct? Does it really take 10 seconds to delete 100k objects? What compiler/settings are you using? If you are running debug, I know some systems, in particular I know MSVC, does a lot of memory/bounds checking when compiling in debug mode.

Gillius
Gillius's Programming -- http://gillius.org/

axilmar
Member #1,204
April 2001

Quote:

in case you already know about it

I've read all there is about garbage collection, including generational GCs (and of course the Java one). Java is forced to use a generational GC, since it does not have templates and every little primitive must be allocated as an Object.

It does a very good work though. In the last defense application that I worked on, although the system has thousands of targets, the processing time is very good. Swing is slow, though.

Quote:

In your example, 100k objects took 10 seconds to collect. 100k objects sound like a lot, but 30 bytes per object doesn't sound all that crazy, either (esp in Java when you consider primitive objects), and 3MBs of memory is not all that much memory. 10 seconds sounds like a lot to collect 3 megs of memory.

As I explain in the docs, LibGC does not work like a traditional GC. In a traditional GC, memory is allocated by incrementing an allocated memory counter; if that exceeds a certain amount, the garbage collector kicks in, copies the marked objects to new memory, and discards all allocated memory at once. Thus it can discard 3MB in a very small amount of time.

My GC is not like that at all. It does not have a separate thread, nor it does copying. It simply maintains a list of objects and pointers, and when allocated memory is doubled, it marks the objects (through the pointers) and deletes unmarked objects. It uses the C++ memory manager, so it takes 10 secs to free 100K objects, because it calls the stdlib free() function 100K times.

It's not very sophisticated, but what do you expect in 200 lines of code ? :-)

I found it tremendously helpful though. I am making a Model-View-Controller framework, and one part of it is a signals and slots library. Without LibGC it is very complicated. With LibGC, it is very easy, as I forget about memory management at all (the problem is that when an object that it is deleted has slots pointing to it, the slots must be deleted, but slots should also be shared between signals, and also not be deleted when there is a stack allocated object with the slot in it and other similar problems).

The MVC framework also progresses much more quickly since I don't have to catter for deleting objects any more.

So, to summarize: it takes that long because it calls 'free()' so many times. But I don't mind, because for the applications I want to write, object creation will not be that often. The usual that there will be is a few hundred objects, at most.

CGamesPlay
Member #2,559
July 2002
avatar

Being as that I haven't read your code at all, could you explain this to me?

A 'circular reference' constitutes any two objects both having pointers to each other? (Node has children and parent pointers) Your GC handles that?

What order is that on your tests? It looks less than n, but I can't do any math :P

--
Tomasu: Every time you read this: hugging!

Ryan Patterson - <http://cgamesplay.com/>

axilmar
Member #1,204
April 2001

Quote:

A 'circular reference' constitutes any two objects both having pointers to each other? (Node has children and parent pointers)

Yes, a circular reference is exactly that. Or object A points to B, object B points to C and object C points to object A.

Quote:

Your GC handles that?

Yes it does.

Quote:

What order is that on your tests?

I haven't done any tests with circularly referenced objects, but I don't think it plays any significant role, because it is the same having N objects that are referenced from the root set with having N/2 objects referenced from the root set and the other half referenced by those objects. The mark routine works recursively, but it never touches an object twice.

I've just run a test with 10K objects, referenced from the stack and having a pointer to the previous object, but I don't see any difference, visually, with the simple test.

EDIT: I already finished my MVC framework, in just one day, thanks to garbage collection.

CGamesPlay
Member #2,559
July 2002
avatar

And I haven't seen any reference to it, but what kind of license is there? My project would use the zlib/libpng license (Allegro license + you must acredit where you got the code from); would they be compatible?

[edit]
Oh, and does "1 second" mean "about one second" or "more than one second but less than two seconds" or "1 second" or "less than one second"?

--
Tomasu: Every time you read this: hugging!

Ryan Patterson - <http://cgamesplay.com/>

gillius
Member #119
April 2000

The header in his source claims the code to be LGPL.

According to the big snafu with XFree86, licensing requiring credit is not (L)GPL compatible.

In general, for GUIs, garbage collection is really nice. Actually it's the MVC pattern where GC is very nice -- that goes beyond GUIs actually. In fact that's why I switched to refcounting in my network library -- in an MT environment with objects and listener registration, memory management was insane. I was spending 50% of my time on it. I got rid of most of that by switching to boost::shared_ptr. I'm not sure if it is worth switching to full GC.

axilmar: do you have any plans on trying to implement copying GC or other GC algorithms? Just out of curiousity.

Gillius
Gillius's Programming -- http://gillius.org/

axilmar
Member #1,204
April 2001

Quote:

what kind of license is there?

Quote:

According to the big snafu with XFree86, licensing requiring credit is not (L)GPL compatible.

LGPL. But that's typical. If you want to use LibGC, go ahead and use it. Just don't claim it to be yours (if you use my code, that is). It is not something that you can't do yourself. If you thing that you can't use it due to the licence, let me tell you how to do it yourself.

Quote:

Oh, and does "1 second" mean "about one second" or "more than one second but less than two seconds" or "1 second" or "less than one second"?

Less than one second.

Quote:

do you have any plans on trying to implement copying GC or other GC algorithms? Just out of curiousity

I don't, for the moment.

EDIT:

Gilius, I found a way to do an efficient GC (copying) for all types of data in C++, without linked lists, without affecting the byte size of a pointer and without classes having to inherit from special class. The only requirement is to use the template ptr class instead of *. For example:

ptr<int> data = new int[10];
ptr<Foo> foo1 = new Foo;

But I have another problem: I don't know, in Windows, how to initially allocate the GC memory pool. Which is the way to go ? do I use 'mmap'?

What I specifically want to do is to initially allocate a small amount of memory, and then be able to grow it but without the memory to be relocated. Within this memory, I will implement a copying collector.

EDIT2: never mind, I found it. First you reserve pages with VirtualAlloc and then, as memory grows, you commit the pages with VirtualAlloc again.

Now the only problem that remains is how to call destructors without having classes inherit from a special class. Is that even necessary though ?

Go to: