Allegro.cc - Online Community

Allegro.cc Forums » Programming Questions » basic dx_ball project(like breakout)

This thread is locked; no one can reply to it.
basic dx_ball project(like breakout)
Raidho36
Member #14,628
October 2012
avatar

>It's a standard technique.
Yes, I know that; what I meant is that it is unnecessary if the size is only increased by one unit rather than by a factor of two.
>Don't know what you're talking about.
Maybe I misused a term or even made one up. I meant that the size grows not by one each time, but by some larger value which itself grows every iteration, too. See this:

for (i = old_size; i < new_size; i++) {
    _al_vector_alloc_back(&queue->events);
}

And then this:
char *new_items = al_realloc(vec->_items, 2 * vec->_size * vec->_itemsize);
Assuming that the vector is supposed to double in size, this makes for multiple size doublings whenever the new size exceeds 2. Say, for a new size of 4, allocation would be called 2 times; each call doubles the vector size, effectively increasing its size 8 times, which would be 16 units. The next iteration calls allocation 8 times, a 256-fold size increase, which makes 4096 units. And so on.
>It allocates a slot for the caller to fill.
Ok, but it does not increase this counter when it allocates an event. Appending the array would increase it, but that isn't called anywhere. What's the use, then?
>the average length of an event queue is zero
I believe that peak length is what matters.

Peter Wang
Member #23
April 2000

Raidho36 said:

Yes, I know that; what I meant is that it is unnecessary if the size is only increased by one unit rather than by a factor of two.

You don't know ahead of time how many will be eventually required. Of course you can choose to grow the vector by only one each time, if you are trying to minimise space (never mind roundup applied by the memory allocator or increased chance of heap fragmentation).

Quote:

for (i = old_size; i < new_size; i++) {
    _al_vector_alloc_back(&queue->events);
}

This code is a bit silly, but all it does is grow the vector by (new_size - old_size) slots. I think I was lazy to add a separate function to allocate multiple at once. I doubt it would make any difference in practice.

Raidho36 said:

char *new_items = al_realloc(vec->_items, 2 * vec->_size * vec->_itemsize);

You missed the if (vec->_unused == 0) condition directly above it.

Raidho36 said:

Ok, but it does not increase this counter when it allocates an event.

What counter? I think you're missing the fact that the event queue is implemented as a circular array on top of an underlying _AL_VECTOR.

Quote:

I believe that peak length is what matters.

The point is, event queues are typically very short. A single image or sound sample will generally take more space than any event queue.

Raidho36
Member #14,628
October 2012
avatar

>never mind roundup applied by the memory allocator or increased chance of heap fragmentation
Actually, those two can be avoided in another way than reallocating to twice the size. Reallocation is only necessary at all if you want random access by offset, where you just calculate the address and dereference it. Instead, you could return the address where the data was pushed and do no calculation at all when fetching it; when the size is exceeded, another data store is created. Access by ID wouldn't be significantly slower than true random access, since it only requires an extra couple of very simple operations, and you don't use it much anyway because you have direct persistent pointers, as explained above.
>event queues are typically very short
Obviously. Even a huge load of events won't exceed the space required to store a 32bpp 64x64 image. But that's not the point.
>You missed the if (vec->_unused == 0) condition directly above it.
Wait, I think I've got it. Whenever the count of unused slots reaches zero, the vector size is doubled. Then unused is the inverse of an array_size field, if there were one, and the size is there to track how much is occupied. However, each allocation "increases" the vector length by just one, so in the event queue code, to actually double the queue's size, allocation is called old_size times. I think it makes some sense now, but really, what were you smoking when you came up with such a vague and awkward design? And couldn't an event queue actually be a queue? That would make more sense than emulating a queue through a vector, and it would probably work faster for being more specific. There could even be a special queue, events_queue_queue, specifically to handle events in the fastest way possible. But okay, whatever. I would probably just suggest using my data structures library once I finish it. It could also be an addon; that would be a handy Allegro feature.

Peter Wang
Member #23
April 2000

Raidho36 said:

Actually, those two can be avoided in another way than reallocating to twice the size. Reallocation is only necessary at all if you want random access by offset, where you just calculate the address and dereference it. Instead, you could return the address where the data was pushed and do no calculation at all when fetching it; when the size is exceeded, another data store is created. Access by ID wouldn't be significantly slower than true random access, since it only requires an extra couple of very simple operations, and you don't use it much anyway because you have direct persistent pointers, as explained above.

I think you're thinking of a segmented array? That's way more complicated than we need and has a higher constant factor.

Quote:

But that's not the point

Then what is? You agree that the space is negligible, and by your initial (incorrect) assertion that memory is allocated per event, I take it you agree that event queues should be optimised for speed.

Quote:

And couldn't an events queue be actually a queue?

A queue is an abstract data type, which still needs to be backed by a concrete data structure. A circular array is a common choice (another being a linked list, which we previously used). A generic dynamically-sized vector is nothing strange either. Maybe the specifics aren't to your taste but that's all.

Raidho36
Member #14,628
October 2012
avatar

That's way more complicated than we need and has a higher constant factor.

It is not any more complicated than a vector. It won't provide random access; instead it provides a persistent pointer, which is faster to fetch, cache miss or not. You can do this with regular arrays too, but the ability to maintain physical addresses on resize gives a huge benefit. Unless you use such an array against its purpose, of course, which is storing data when address persistence is critical. For queues you're not interested in locations; you only care about the head and tail. Therefore you can throw away any indexing whatsoever and only use head and tail pointers. I don't know about you, but I think it's simpler and more straightforward than the current implementation of a queue on top of a vector.

Quote:

You agree that the space is negligible

That doesn't mean it can simply be wasted. Now I see it isn't actually wasted; there is just a rather indistinct way of allocating the space for another event.

Quote:

initial (incorrect) assertion

Sorry about that; it's just that a) the first time, I was reading the code the way I read an advertisement booklet on an elevator wall, and b) there are no comments to explain such a knotted-up method of completing the task. But yes, speed is important.

Quote:

A queue is an abstract data type, which still needs to backed by a concrete data structure.

A queue is not a data type, for that matter. The only backing it needs is a block of memory.

Quote:

A circular array is a common choice (another being a linked list, which we previously used)

Linked lists are superfluous for that task, and flawed in themselves. A circular vector is already better, but it could still be improved; features could be traded for performance. It could be an optimized queue, as I suggested before. It is actually not so different from a linked list. My vision of it is that the base structure would contain a couple of pointers and basic data about suggested sizes (or these could be hard-coded with type-specific optimizations). Individual cells have no next/previous pointers (only the actual data) and are allocated in large chunks, each of which has a single next pointer to the next chunk; the last chunk points back to the first, naturally looping them. It is not possible to perform insertions with such an approach, but resizing goes super easily, overflow detection and resolution overhead is next to zero, and it does not require large contiguous space to work under extreme data load.

I also thought of coding in a function table that would handle specific data sizes (1, 2, 4, 8, 16) natively and therefore faster, but I haven't actually tested it, and since it's not a direct function call it cannot really be optimized, so that way it could even be slower than using memcpy for everything. Honestly, I haven't thought this one through very far; it's just the best I've come up with so far. But even as it is, it's already much more straightforward: just push in an event and then pop it (or peek, or discard, or whatever).

gnolam
Member #2,030
March 2002
avatar

FFS, learn to use <quote></quote> already. >:(

--
Move to the Democratic People's Republic of Vivendi Universal (formerly known as Sweden) - officially democracy- and privacy-free since 2008-06-18!

Raidho36
Member #14,628
October 2012
avatar

The formatting quick help says I can use right angle brackets for quoting. It doesn't say, though, that the following whitespace is mandatory, which it isn't in most other places.

Either way, I'm not pushing anything. I realize that even with a large speed increase, something like 50% or more, it's not that significant for the game anyway, so there's no reason to rewrite it unless there are no more important tasks to accomplish. I'm just having a discussion.

Peter Wang
Member #23
April 2000

Raidho36 said:

It is not any more complicated than a vector.

I'm not sure, but I think in your last paragraph you were describing a linked list where each list node contains multiple elements. (a "fat list", but Google does not attest to that terminology.) I can see how it would work, but I don't agree it's any simpler or would be any faster.

Quote:

A queue is not a data type, for that matter.

Abstract data type.

Raidho36
Member #14,628
October 2012
avatar

I'm not sure, but I think in your last paragraph you were describing a linked list where each list node contains multiple elements.

A list that stored multiple elements in a single cell would have enormous computational problems to solve on random insertion, and space waste on random deletion. That's not how a well-designed data structure works. Though yes, using it as a base it is easy to create a queue with many benefits and no performance loss.

Quote:

I don't agree it's any simpler or would be any faster

It will simplify and clarify the events.c code, as the implementation would be in another unit. About speed: it actually will be faster. In your "queue" you use the reference function often. All it does is give you a tail pointer, which you could have simply stored in the data structure header. Please don't tell me that calculating a pointer from an offset multiplied by the unit size is any faster than just keeping that pointer.

Also, about memcpy: it's really not needed for anything that has a solid type; a simple macro with a GNU extension will do perfectly. Something like this, I guess:
#define push(q,v) (*((__typeof__(v)*)(q)->head)=(v),_push_void(q))
#define pop(q,v) (_pop_void(q),(v)=*((__typeof__(v)*)(q)->tail))

The linked list in the misc folder doesn't actually contain any data (even though it's supposed to), only pointers, so for everything you push in you're supposed to preallocate some sufficiently long-lived space. Am I getting that right?

Another thing is bstrlib. All it does is add fool-proofing on top of the standard C string library. Oh, and it lets you know the string length in one operation; that's really handy. There are some other handy features too, which I appreciate, but it also does a lot of work that isn't needed, which is basically dictated by the language the library is written in. Say, in C one would check data validity before using it rather than hope that things go all right on their own. In fact, C dictates a style of programming where, in a good program, it is safe to assume that incoming data is always valid, with the exception of arbitrary external input. It also adds inconsistency: some functions accept a ustr and others just char*. And even when using Unicode, why would it be UTF-8 and not UTF-16/32, simply using wide characters? UTF-8 does save some space (50% for ASCII-only text and about 20% most of the time), but it introduces ridiculously huge processing overhead that cannot be avoided by design, for every single string operation except a couple that affect the entire string at once rather than a part of it. Neither UTF-16 nor UTF-32 suffers from this; they're just like regular C strings, only with wider characters. I mean seriously, if the header stores the string length and you could just fetch it, why would you even write the code below?

size_t al_ustr_length(const ALLEGRO_USTR *us)
{
   int pos = 0;
   int c = 0;

   while (al_ustr_next(us, &pos))
      c++;

   return c;
}

That seems very inconsiderate to me. Though having a built-in function to convert strings to UTF-8 in order to write them to UTF-8 encoded text files is nice.

Luiji99
Member #12,254
September 2010

Wouldn't cubic progression mean that the size would be cubed on every expansion? It seems more like square progression, if such a term exists.

Programming should be fun. That's why I hate Java.

Peter Wang
Member #23
April 2000

raidho36: Sorry, I really find your writing very unclear. If you want to discuss the data structure further I think you need to provide code.

A couple of things:

Raidho36 said:

Also, about memcpy: it's really not needed for anything that has a solid type

Sure, as we do in copy_event. It's not that clear-cut though, as modern compilers will specialise certain memcpy calls where the size argument is a constant. Or the other way around: a structure copy may be implemented by a memcpy call.

Quote:

And even when using Unicode, why would it be UTF-8 and not UTF-16/32, simply using wide characters?

Compatibility with C strings.

Quote:

introduces ridiculously huge processing overhead

I don't find it so. UTF-16 is variable width anyway, even if many programs don't handle it properly, and often you end up having to convert to/from UTF-8. UTF-32 is too wasteful.

(The Allegro UTF-8 routines are not very optimised so don't take that as the benchmark. It's possible to do better.)

Quote:

I mean seriously, if the header stores the string length and you could just fetch it, why would you even write the code below?

You need that for UTF-16 as well.

I find the ustr functions useful sometimes. If you don't like it, don't use it.

By the way:

Quote:

UTF-8 does save some space (50% for ASCII-only text and about 20% most of the time)

ASCII really dominates. You'd think a page full of Chinese characters would be more efficiently represented in UTF-16 than UTF-8. Here's the numbers from http://tw.yahoo.com/index.html

290062 bytes (UTF-8)
574982 bytes (UTF-16)

Turns out the content of the page is tiny compared to the HTML markup and Javascript.

Raidho36
Member #14,628
October 2012
avatar

Sorry, I really find your writing very unclear.

If you're speaking of my describing the mentioned fat-list problems, it was my misunderstanding of the concept of using a large chunk of memory to store nodes. Supposedly they're stored just like regular nodes, but instead of allocating space per node, space is preallocated for multiple nodes, whereas my thought at the moment was a "list" that doesn't use next/previous pointers and stores data in a plain array fashion. Allocating more space at once is the most obvious optimization to the first obvious implementation of a linked list. With such an approach you will need to index your memory cell usage, though.

Quote:

modern compilers will specialise certain memcpy calls where the size argument is a constant

In your list, it is a variable. Just because a variable doesn't change, that doesn't make it a constant. The const keyword doesn't force anything to be constant either; it just forbids direct value modification, and variables declared as constant don't count as actually constant. Not that you don't already know all of this. Anyway, in your code, copying a variable into the list doesn't involve any unit-size values explicitly. But really, for an abstract list you wouldn't write "copy this" and "copy that" functions for every single type you might store in it, right? Though that is one acceptable solution. What are your thoughts on the suggested macro? I mean, apart from the lack of thread safety (which is only a couple of extra functions) and the obvious remark that it won't work with many compilers. It could always be extended by writing the variable type explicitly as a third argument to avoid using `typeof`, which I find exceptionally handy. Though using expressions as a typeof argument tends to produce `int`s and `double`s, when passing a variable it works perfectly.

Quote:

Compatibility with C strings.

All of Unicode is backwards compatible with C strings (unless you plain copy a C string into a Unicode string without doing a conversion, of course, which is simply silly) and none of Unicode is forward-compatible with them, but wide characters are constant-sized while UTF-8 is variable-sized, so the former are handled better. I mean, yes, UTF-8 is the de facto standard encoding for Unicode text files, but for internal data storage it's better to use already decoded strings. It's not even compatible with system Unicode functions and needs to be decoded to a constant-size encoding before use (AFAIK).

Quote:

UTF-16 is variable width anyway

Oh, now I remember. Yes, true. There's just a bit of confusion with internal Unicode storage, which occupies 4 bytes on Linux but only 2 bytes on Windows. That gave me the (apparently wrong) idea that Windows' internal Unicode string storage is equal to the UTF-16 encoding. Now that that is cleared up, I'd like to say that all that time I actually meant simply wide-character strings representing Unicode characters. UTF encoding has nothing to do with it, except that a 4-byte wide-character string precisely matches a UTF-32 encoded string. Sorry for confusing you.

Quote:

UTF-32 is too wasteful

Oh, come on: using up to 3 extra bytes of memory per character is wasteful, and executing loads of code on every single string operation is not? Besides, just a few posts earlier you said that such sizes (similar to what a set of strings would take) are negligible in comparison to what a texture or a sound takes. You don't store a compressed PNG file in memory as-is and decompress it each time you access its pixel data just because a plain bitmap in the native pixel format is "wasteful", right? Why even consider the opposite approach with strings, then? And the size difference isn't even that significant unless you encode, like, the entire English Wikipedia. Plus, 2 bytes is the most common size of a UTF-8 character (when used for multicultural text) anyway, which still covers a character range several magnitudes (of two) smaller than a wide character. That effectively equalizes a 3-byte UTF-8 character with a 2-byte wide character. Of course, on Linux a wide character occupies 4 bytes no matter what, so a wide-character string would never consume less space than a UTF-8 string in practice. Moreover, the two highest bytes of a wide character would probably never be used.

Still, wide-character strings let you operate on strings with something as simple as wcsncat or even memcpy, which is a perfectly good trade of memory for performance. You know, the memory/processing scarcity balance was extremely biased towards memory back in the day, but now it is completely the opposite, so nothing is wrong with giving away some memory that could instead have been saved at the cost of lots of computation. Seriously, modern hardware offers 8 GB of RAM as a common value, and several gigabytes of swap on top of that. Processing power is large too, on the order of tens of gigahertz in total, but there's always a more meaningful use for it than constantly re-grinding the same data whenever you need to perform an operation on it.

Quote:

I find the ustr functions useful sometimes.

I already mentioned there are useful and handy ustr functions. It's just that the strings themselves don't have to be encoded for internal storage. Using wide-character strings for that is the optimal choice.

Quote:

ASCII really dominates
Turns out the content of the page is tiny compared to the HTML markup and Javascript.

That's exactly why your argument is invalid: you show statistics for a web page's source, not for the text it actually contains. You could just as well have suggested a plain XML file to form such statistics and efficiency conclusions. And even if you actually use XML, it's nevertheless faster to modify if it's represented in plain fashion rather than variable-width encoded, and it wouldn't hurt anybody if the data were internally stored as a wide-character string and only converted to UTF-8 just before flushing. For decreasing data stream bandwidth, you should pay attention to data compression rather than layout optimization. The former typically achieves many times better compression for text files, especially those that largely repeat themselves, such as XML.

Most of the UTF-8 handling code doesn't exploit UTF-8 handling techniques. That's odd to see right next to a fairly smooth vector data structure implementation.

Once again, just discussion.

Peter Wang
Member #23
April 2000

Raidho36 said:

What are your thoughts on the suggested macro?

I'm not sure where you want to use it, but it's probably unnecessary. Like I said, the compiler specialises memcpy with a constant size argument so you can write it portably and efficiently as memcpy(dst, src, sizeof(TYPE)); or memcpy(dst, src, sizeof(*src));

Quote:

All of Unicode is backwards compatible with C strings

Specifically, we are talking about byte-oriented strings terminated by a NUL byte (no interior zeroes). You can pass pointers to UTF-8 encoded strings to functions expecting 8-bit strings and usually get something sensible, e.g. fprintf to write out strings to a file, strlen to count the size, strchr/strrchr to search for ASCII characters or UTF-8 code units. And UTF-8 is fully general, so if you use that for everything you don't need to provide both 8- and 16-bit versions of functions with string arguments.

Quote:

It's not even compatible with system Unicode functions and needs to be decoded to a constant-size encoding before use (AFAIK).

Depends on the system. I guess you mean file system paths and the like.

The Unix world got Unicode support late and uses UTF-8 encoded strings pretty much everywhere. Microsoft and Java got Unicode earlier (defined it, I believe) when it was still a 16-bit code... until it wasn't, and they ended up with a variable-width encoding anyway. If they could, I wonder if they would choose UTF-16 again? Not sure.

And you generally don't need to decode to a constant size encoding before use.

Quote:

Oh, come on: using up to 3 extra bytes of memory per character is wasteful, and executing loads of code on every single string operation is not? Besides, just a few posts earlier you said that such sizes

You come on. Strings get pretty big, and you have a lot of them. Just that single web page was 290k, and over 1MB in UTF-32. Whereas a typical Allegro program has a handful of event queues and a peak length of, what, four elements each?

Are there any systems that use UTF-32 exclusively? The only one that I know of that tried was Python3 (on Unix), which has now adopted flexible string representations.

Executing loads of code: not true. Quite often the algorithm is exactly the same as for ASCII, or you end up taking the ASCII fast path because all the interesting characters are in the ASCII range.

Quote:

That's exactly why your argument is invalid: you show statistics for a web page's source, not for the text it actually contains.

Just an illustration. Strings in the real world are "marked up", and the markup is ASCII.

Anyway, that's it from me. You can find opinions for UTF-8 and UTF-16 both all over the net, but few supporters of UTF-32.

Raidho36
Member #14,628
October 2012
avatar

I'm not sure where you want to use it

Everywhere outside the definition unit, to push data into the queue. So you just pass the macro your queue and your variable, and the macro stores the data into the queue, naturally taking into account the size of the variable supplied.

Quote:

memcpy(dst, src, sizeof(TYPE))
sizeof(*src)

The only problem is that a variable's type is a virtual entity: you can't store it or even compare it to anything, and functions substitute argument types, so this trick can only be used within such a macro. It is moot whether a variable-sized memcpy is slower than a constant-sized one. A little research shows that memcpy will most likely be as fast as direct assignment anyway, and probably even faster under certain circumstances.

Quote:

You come on.
suggests another XML example
Strings in the real world are "marked up"

No, you come on with XML. I already told you that it's not a valid argument, because game texts are not XML; they are nothing even close to marked-up text, with the exception of some rare and specific cases. Where have you seen a game in which over 90% of the text data was a huge heap of system-related or other keywords? I've seen that 90% of game text data is text to display as-is, which is the case with Allegro, given that it doesn't support formatting anyway and doing it manually isn't what many do. I mean, literally, on that web page you mentioned, the ratio of meaningful text to formatting data was right next to zero. I suggest you cut that out and be realistic about in-game string data. Another thing is that a game is (typically) not a novel, so it won't contain more than 512 kilobytes of text total. I'm not talking about storing levels in text files; they aren't stored in RAM as text any longer than is necessary to load a level. More likely there's just a single string from the file in memory at any given moment. Nor are other configuration files stored entirely in memory for long; they are loaded, read (or written), and immediately discarded. Not that one-pass text is even relevant to this topic, but still.

Well, even so. Let us assume that in some program I want to display 256 HTML pages simultaneously, like, in a 16x16 grid, and they're all animated and everything, so I can't just render them and discard them into a hard disk cache. And not just any pages, but ones that include so much formatting that each would be about 1-2 MB as 4-byte wide-character strings. That would take up something like 512 MB. 512 vs ~130 is considerable, that's true. But then imagine there's a script on a page that modifies the DOM. Say it replaces some footer DIV's innerHTML. With UTF-8 you can't just jump there and do your thing; instead, you grind through 250,000 characters before you even find the right place to start. Is the space saving worth perpetual slowness? Of course, a real browser stores data differently: there will be a tree of DOM elements, each containing little to no text data, which is generated in a single pass anyway, so the unavailability of random access isn't very relevant. But you get the idea. Again with sprites: you don't hack the video driver over wasted space because loading a 34x5 sprite leaves you with a 64x64 surface, or because you only use 3 bits per texel while the texture internally has 32 bits per texel anyway. Consider how much space could have been saved there: a lot, right? So you'd be best off with software rendering only; screw slowness and screw hardware acceleration, better off without it entirely because it won't provide an optimized enough data layout.

Once again, a real game won't store this much data in RAM; at any given moment most of the data will be on the hard drive, where it can easily be both UTF-8 and compressed, and voila, no problem. But in game memory it's best to store data plainly accessible. I don't mean UTF-32 or anything; I mean wide-character strings.

Quote:

Quite often the algorithm is exactly the same as for ASCII

That implies operations that involve counting characters one by one. Appending a string to the end of another string does not involve that with C strings, wide or standard. Neither does copying a section of a string, not to mention string comparison. Speaking of hardware acceleration, there are assembly instructions for that, and I highly doubt strcpy in the stdlib is compiled with them.

Quote:

You can find opinions for UTF-8 and UTF-16 both all over the net, but few supporters of UTF-32.

I do not claim that there's something wrong with UTF-8, nor that UTF-32 is the single most epically awesome encoding. I say that internally keeping data encoded when you could store it unencoded for best performance is bad practice, and that using wide-character strings to store Unicode strings is a subject worth considering, which apparently was thrown out without such consideration.
