Editing a file
William Labbett

hi,

Got a question about files. Say I open a file in binary mode and it's an array of structs.

If I wanted to delete the last struct so that file is smaller, how would I go about it ?

My struct is F_REP so the whole file is

sizeof(int) + sizeof(F_REP) * n /*** EDIT ****/ the int is the value of n

The only way I can think of is to create a temporary array, read all the structs into it, then reopen the file in write mode so the current contents are discarded, and write back all but the last one.

Thomas Fjellstrom

POSIX includes the ftruncate function (declared in <unistd.h>; it is not part of stdio).

Quote:

4.4BSD, SVr4 (these function calls first appeared in BSD 4.2). POSIX 1003.1-1996 has ftruncate. POSIX 1003.1-2001 also has truncate, as an XSI extension.

It lets you reduce the size of a file.

William Labbett

Thanks Thomas.

How do I get the handle for the file ?

I've looked in K&R. It says there's an fd member in the FILE struct but I tried it and got a compiler error.

torhu

I don't think there's a portable way to do this, except for reading the file and writing it out again like you suggested yourself.
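A minimal sketch of that portable route, using only standard C. The helper name copy_prefix is made up for illustration, and copying into a second file (instead of a temporary array) is an assumption about how you might structure it, not the only way.

```c
#include <stdio.h>

/* Hypothetical helper: copy the first `keep` bytes of src_path into dst_path.
   Returns 0 on success, -1 on failure (including a source shorter than keep). */
int copy_prefix(const char *src_path, const char *dst_path, long keep)
{
    FILE *in = fopen(src_path, "rb");
    if (!in)
        return -1;

    FILE *out = fopen(dst_path, "wb");
    if (!out) {
        fclose(in);
        return -1;
    }

    char buf[4096];
    int ok = 1;
    while (ok && keep > 0) {
        /* Never ask for more than what is left to copy. */
        size_t want = keep < (long)sizeof buf ? (size_t)keep : sizeof buf;
        size_t got = fread(buf, 1, want, in);
        if (got == 0) {
            ok = 0;                       /* source ended early or read error */
            break;
        }
        if (fwrite(buf, 1, got, out) != got)
            ok = 0;
        keep -= (long)got;
    }

    fclose(in);
    if (fclose(out) != 0)                 /* fclose flushes; it can fail too */
        ok = 0;
    return ok ? 0 : -1;
}
```

For the layout in the first post, keep would be sizeof(int) + (n - 1) * sizeof(F_REP). After a successful copy you would remove() the original and rename() the copy into place. The count stored at the start still holds the old n, so it has to be rewritten as well.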

William Labbett

Alrighty. Thanks Torhu, I'll try that.

Thomas Fjellstrom

How do I get the handle for the file ?

fileno :)
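A minimal sketch of how fileno and ftruncate fit together on a POSIX system. The helper name drop_last_record is made up for illustration. It assumes the layout from the first post (an int count followed by that many fixed-size records) and a file opened in "r+b" mode.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: shrink a file laid out as [int n][n records] by one
   record. Returns 0 on success, -1 on failure. POSIX only. */
int drop_last_record(FILE *fp, size_t record_size)
{
    int n;

    /* Read the current record count from the start of the file. */
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    if (fread(&n, sizeof n, 1, fp) != 1 || n <= 0)
        return -1;

    /* Write the decremented count back over the old one. */
    n--;
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    if (fwrite(&n, sizeof n, 1, fp) != 1)
        return -1;

    /* Flush stdio's buffer before touching the underlying descriptor. */
    if (fflush(fp) != 0)
        return -1;

    off_t new_size = (off_t)sizeof n + (off_t)n * (off_t)record_size;
    return ftruncate(fileno(fp), new_size);
}
```

Called as drop_last_record(fp, sizeof(F_REP)), this leaves the remaining records untouched and only cuts the tail off the file.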

William Labbett

Excellent - got it working. Thanks.

Matthew Leverton

Note that this is not portable:

MY_STRUCT foo;
al_fwrite(fp, &foo, sizeof(MY_STRUCT));

Different architectures could have different sizes of integers, etc. So while you can write and read back on the same machine, if you go from 32-bit to 64-bit or PPC to Intel, it might break.

William Labbett

I suppose the same applies to fwrite and fread.

Does that mean it might be an idea to check sizeof(int) et al at the beginning of the prog so as to abort if the architecture isn't suitable ?

Thomas Fjellstrom

No, just write known sized integers to the file, in a known endian format.
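A minimal sketch of what "known sized, known endian" can look like with plain stdio. The names write_u32_le and read_u32_le are made up for illustration. They pick little-endian, but any fixed order works as long as reader and writer agree.

```c
#include <stdint.h>
#include <stdio.h>

/* Write a 32-bit value as exactly four bytes, least significant byte first.
   Returns 0 on success, -1 on failure. */
int write_u32_le(FILE *fp, uint32_t v)
{
    unsigned char b[4];
    b[0] = (unsigned char)(v & 0xFF);
    b[1] = (unsigned char)((v >> 8) & 0xFF);
    b[2] = (unsigned char)((v >> 16) & 0xFF);
    b[3] = (unsigned char)((v >> 24) & 0xFF);
    return fwrite(b, 1, 4, fp) == 4 ? 0 : -1;
}

/* Read four bytes and reassemble them, whatever the host byte order is.
   Returns 0 on success, -1 on a short read. */
int read_u32_le(FILE *fp, uint32_t *out)
{
    unsigned char b[4];
    if (fread(b, 1, 4, fp) != 4)
        return -1;
    *out = (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
    return 0;
}
```

Because the bytes are built with shifts instead of being copied out of memory, the file contents are the same on every machine.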

Matthew Leverton

If you are using Allegro, it has things like al_fwrite32le() to help.

William Labbett

So maybe I should switch to using ALLEGRO_FILE ? Hhhhmmmm. Just as I was getting to the end of coding my editor. Guess I'll have to go that extra mile.

Matthew Leverton

Yes, and you should save each field independently:

for (i = 0; i < n; ++i)
{
  al_fwrite32le(fp, foo[i]->int1);
  al_fwrite32le(fp, foo[i]->int2);
// etc...
}

Then when you read, do it in the same order, using the read function.

Another advantage of this is that you can change your struct easily without invalidating all of your saved files. (Because your loader could check a version number stored in the file and adjust itself accordingly.)

William Labbett

Okay. Thanks :)

BTW, do I need that single integer at the beginning of the file which stores the number of structures in the file, or can I get that by getting the size of the file somehow and dividing it by the size of the structure ?

Matthew Leverton

You could either have a number at the beginning or you could check for al_feof() and stop when you reach it.

William Labbett

The idea of the program being able to cope with different versions of the file sounds worth pursuing (I had that problem as this is the second version of the program I'm writing).

I suppose I'd read the file differently for different versions.

Then I'd store a variable in a config file which determines which version of the file to save. That way I could load older versions, and convert them to the up-to-date version.

Does that sound sensible ?

You could either have a number at the beginning or you could check for al_feof() and stop when you reach it.

Would I still be able to get the number of structures in the file that way ?

Matthew Leverton

Would I still be able to get the number of structures in the file that way ?

If you needed to know up front (to create the memory), then it would be simplest to store the size as an integer at the beginning.

Quote:

Then I'd store a variable in a config file which determines which version of the file to save.

You really don't have to keep any details on what version to save. The program could always save the latest version. So the writing code could be simple.

However, the reading code may need to do something slightly different depending on which version it is reading. But normally it looks something like:

if (version > 10) {
  foo->x = al_fread32le(fp);
  foo->y = al_fread32le(fp);
}
else {
  foo->x = 0;
  foo->y = 0;
}
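A self-contained sketch of the same idea using plain stdio, so it can be tried without Allegro. The record type point_rec, the helper get32le, and the assumption that x and y were added in version 11 are all made up for illustration. With Allegro you would call al_fread32le instead of the helper.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record: id has always been stored, x and y were added in
   file version 11. */
struct point_rec {
    int32_t id;
    int32_t x;
    int32_t y;
};

/* Stand-in for al_fread32le: read four little-endian bytes. */
static int get32le(FILE *fp, int32_t *out)
{
    unsigned char b[4];
    if (fread(b, 1, 4, fp) != 4)
        return -1;
    uint32_t u = (uint32_t)b[0] | ((uint32_t)b[1] << 8)
               | ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    *out = (int32_t)u;   /* assumes the usual two's complement conversion */
    return 0;
}

/* Load one record written by the given file version.
   Returns 0 on success, -1 on a short read. */
int load_point(FILE *fp, int32_t version, struct point_rec *r)
{
    if (get32le(fp, &r->id) != 0)
        return -1;

    if (version > 10) {
        /* Newer files carry the extra fields. */
        if (get32le(fp, &r->x) != 0 || get32le(fp, &r->y) != 0)
            return -1;
    } else {
        /* Older files do not, so fall back to defaults. */
        r->x = 0;
        r->y = 0;
    }
    return 0;
}
```

The version number itself would be read once from the start of the file and passed down to each record loader.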

William Labbett

Right. Thanks for all of that. It should keep me busy for a good while.

OnlineCop

How frequently do you plan to add/remove data from the file?

If infrequent, keep up what you're doing. If you think, however, that reading/rewriting will be a major portion of it, you could also weigh the computational hit of writing a kind of byte index at the beginning of your file.

custom_record
 1  3        // 3 records follow
 2  5        // offset of 1st record
 3  9        // offset of 2nd record
 4  11       // offset of 3rd record
 5  data1a
 6  data1b
 7  data1c
 8  data1d
 9  data2a
10  data2b
11  data3a
12  data3b
13  data3c
14  data3d
15  data3e

Then you could jump immediately to the data you need. This is a lot more complex, and if you don't do it right, you'll corrupt the entire file ("FAT table? I don't need this; let's erase it and see what... hey, what the..?"), so proceed carefully if you do want to do this.

Advantages: Faster initial seek time to get to the record you want
Disadvantages: Remove a record in the middle, and you've essentially got to load the entire index, modify it (correctly) after the record is removed, and write that back out to the file on save (where you MAY have to shift ALL of your elements up several bytes if you remove 1+ indices). But reads would be faster...
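A rough sketch of looking a record up through such an index. The layout here is an assumption for illustration: a 32-bit little-endian record count, then one 32-bit little-endian byte offset per record, then the record data. The example above counts in lines, this counts in bytes. The function names are made up.

```c
#include <stdint.h>
#include <stdio.h>

/* Read four little-endian bytes from the current file position. */
static int index_get32le(FILE *fp, uint32_t *out)
{
    unsigned char b[4];
    if (fread(b, 1, 4, fp) != 4)
        return -1;
    *out = (uint32_t)b[0] | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    return 0;
}

/* Return the byte offset of record k (0-based) as stored in the index,
   or -1 if k is out of range or the index cannot be read. */
long lookup_record_offset(FILE *fp, uint32_t k)
{
    uint32_t count, offset;

    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    if (index_get32le(fp, &count) != 0)
        return -1;
    if (k >= count)
        return -1;

    /* Skip the 4-byte count plus k earlier 4-byte index entries. */
    if (fseek(fp, 4L + 4L * (long)k, SEEK_SET) != 0)
        return -1;
    if (index_get32le(fp, &offset) != 0)
        return -1;
    return (long)offset;
}
```

The caller would then fseek(fp, offset, SEEK_SET) and read the record. Only two small reads are needed no matter how many records come before it.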

William Labbett

I know I'll be reading, writing and rewriting a lot, but I think I'll be okay without the byte index since all the structures are the same size, so I just need to al_fseek to the start of the struct when reading/writing it.

Does that sound okay ?
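A small sketch of the fixed-size arithmetic, written with stdio's fseek. The Allegro call would be al_fseek with the same offset. The helper names are made up, and record_size is assumed to be the number of bytes each record occupies on disk, which is not necessarily sizeof the struct once fields are written one by one.

```c
#include <stdio.h>

/* Byte position of record `index` (0-based) in a file that starts with a
   header of header_size bytes followed by equally sized records. */
long record_pos(long header_size, long record_size, long index)
{
    return header_size + record_size * index;
}

/* Move the file position to the start of record `index`.
   Returns 0 on success, non-zero on failure (same as fseek). */
int seek_to_record(FILE *fp, long header_size, long record_size, long index)
{
    return fseek(fp, record_pos(header_size, record_size, index), SEEK_SET);
}
```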

/*** EDIT *****/

I've got another query hopefully someone can help with :

I was thinking that since I'm only going to be using my file editor on my machine it won't matter if the size of types is different on another, so I thought maybe I didn't need to change the code.

Then I thought, what about when the game runs on another machine ?

My struct looks like this :

struct F_REP
{
    char png_filename[ MAX_LENGTH_FOR_FRD_STRING ];

    int type;

    int squares_wide_on_map;
    int squares_high_on_map;

    int x_offset; /* the x offset from the top left square indicating where the feature is positioned */
    int y_offset;

    int num_layers_or_frames;

    int ticks_per_frame;

    int line_in_animation_specs_file;

    int altitude;

    int feature_has_shadow;

    int shadow_x_offset;
    int shadow_y_offset;

    int feature_has_collision_map;

    int c_x_offset;
    int c_y_offset;

    int has_mask;
};

Could this potentially cause problems when

I do

fread( *array_of_F_REP, sizeof(F_REP), num_reps, file);

?

So what I'm really asking is Do I really need to rewrite the code ?

bamccaig

The size of the int data type varies between compilers and architectures. That means that sizeof(F_REP) will also vary. If the files are written on a machine where int is 4 bytes, but read on a machine where int is 8 bytes, then you'll read incorrectly and the data will all be wrong. Additionally, the byte order of data types varies from architecture to architecture. On some machines, bytes appear as you would expect: ABCD. On others, the bytes are reversed: DCBA. Again, if you write on one and read on another then the data will be wrong.

This is the same problem seen in network programming. If you write a networked program that you want to be able to speak to any computer of any architecture, then you need to define a protocol for communicating multi-byte data. The common convention is to send multi-byte values as big-endian (most significant byte first), which on a little-endian machine means the last byte in memory goes out first. This is referred to as network byte order and is a good way to store data in the file system as well. The sender/writer needs to convert multi-byte data from host byte order (whichever byte order the machine uses) to network byte order. The receiver/reader needs to convert from network byte order to host byte order. There is a set of functions in Berkeley Sockets and WinSock to do this (htons, htonl, ntohs, ntohl). Presumably, Allegro's functions do something similar (I imagine that they are just wrappers over the network library calls).
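A minimal sketch of the conversion itself, done with shifts so it does not depend on the sockets headers. The names store_be32 and load_be32 are made up for illustration. htonl and ntohl do the equivalent job on whole 32-bit values.

```c
#include <stdint.h>

/* Store v into out[0..3] in network byte order (most significant byte
   first), regardless of the host's own byte order. */
void store_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFF);
    out[1] = (unsigned char)((v >> 16) & 0xFF);
    out[2] = (unsigned char)((v >> 8) & 0xFF);
    out[3] = (unsigned char)(v & 0xFF);
}

/* Rebuild a host-order value from four big-endian bytes. */
uint32_t load_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24)
         | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)
         |  (uint32_t)in[3];
}
```

Storing 0x0A0B0C0D this way always produces the bytes 0A 0B 0C 0D in that order, which is the "ABCD" layout described above.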

Note that if the files are only ever created and read on the same machine (never shared between machines) then you don't technically need to care about storage format because the format won't change for a given machine (unless you change something about the machine, like the compiler or operating system architecture).

Matthew Leverton

So what I'm really asking is Do I really need to rewrite the code ?

Yes, you do. (Or at least, it would be easier to do that than to manage different data files for 32-bit and 64-bit.)

It should be pretty simple, since you mostly have integer data. Just use the functions I already mentioned. You can use al_fwrite to write out png_filename.

Oscar Giner

bamccaig said:

Note that if the files are only ever created and read on the same machine (never shared between machines) then you don't technically need to care about storage format because the format won't change for a given machine (unless you change something about the machine, like the compiler or operating system architecture).

Actually it's a compiler-only thing, not OS. So distributing an executable wouldn't be any problem since it will do the same on any computer able to run it. The problem comes when you distribute the source code and people compile it with different compilers (or compiler versions) and/or for different architectures.

bamccaig

:-X

Matthew Leverton

Actually it's a compiler-only thing, not OS.

Correct ... and if you have any plans on ever distributing your program, you might as well just do it properly from the beginning. It's not that much more work, and it saves problems of having incompatible data files in the future. And as already mentioned, saving the entire struct as-is makes it more difficult to change the struct.

William Labbett

Thanks for the help guys.

Just wondering what I should do instead of using ftruncate now. I guess I'll have to rewrite the whole file when I want to change its size ?

OnlineCop

void save(struct F_REP* f_rep, PACKFILE* pf)
{
    long len_filename;

    assert(f_rep && "f_rep was passed in as a NULL pointer!");
    assert(pf && "pf was passed in as a NULL pointer!");

    /* first, write the # of actually-used bytes in this string, then the bytes themselves */
    len_filename = (long)strlen(f_rep->png_filename);
    pack_iputl(len_filename, pf);
    pack_fwrite(f_rep->png_filename, len_filename, pf);

    pack_iputl(f_rep->type, pf);
    pack_iputl(f_rep->squares_wide_on_map, pf);
    ...
    pack_iputl(f_rep->has_mask, pf);
}


void read(struct F_REP* f_rep, PACKFILE* pf)
{
    long len_filename = 0;

    assert(f_rep && "f_rep was passed in as a NULL pointer!");
    assert(pf && "pf was passed in as a NULL pointer!");

    /* first, read in the number of bytes in the string and make sure it fits */
    len_filename = pack_igetl(pf);
    if (len_filename < 0 || len_filename >= MAX_LENGTH_FOR_FRD_STRING)
        exit_with_error("Invalid length for `png_filename'!\n");

    /* then read the bytes and terminate the string ourselves */
    if (pack_fread(f_rep->png_filename, len_filename, pf) != len_filename)
        exit_with_error("EOF reached while reading in `png_filename'!\n");
    f_rep->png_filename[len_filename] = '\0';

    f_rep->type = pack_igetl(pf);
    f_rep->squares_wide_on_map = pack_igetl(pf);
    ...
    f_rep->has_mask = pack_igetl(pf);
}

Thread #605966. Printed from Allegro.cc