Allegro.cc - Online Community


Credits go to Thomas Fjellstrom and torhu for helping out!
editing a file
William Labbett
Member #4,486
March 2004

hi,

Got a question about files. Say I open a file in binary mode and it's an array of structs.

If I wanted to delete the last struct so that the file is smaller, how would I go about it ?

My struct is F_REP so the whole file is

sizeof(int) + sizeof(F_REP) * n /*** EDIT ****/ the int is the value of n

The only way I can think of is to create a temporary array, read all the structs into it, then open the file in write mode so the current contents are discarded, and write back all but the last struct.

Thomas Fjellstrom
Member #476
June 2000

POSIX includes the ftruncate function (it's declared in <unistd.h>, it isn't part of standard C stdio).

Quote:

4.4BSD, SVr4 (these function calls first appeared in BSD 4.2). POSIX 1003.1-1996 has ftruncate. POSIX 1003.1-2001 also has truncate, as an XSI extension.

It lets you reduce the size of a file.

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

William Labbett
Member #4,486
March 2004

Thanks Thomas.

How do I get the handle for the file ?

I've looked in K&R. It says there's an fd member in the FILE struct but I tried it and got a compiler error.

torhu
Member #2,727
September 2002

I don't think there's a portable way to do this. Except for reading the file and writing it out again like you suggested yourself.

William Labbett
Member #4,486
March 2004

Alrighty. Thanks Torhu, I'll try that.

Thomas Fjellstrom
Member #476
June 2000

How do I get the handle for the file ?

fileno :)
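For anyone finding this later, here is a minimal sketch of how fileno and ftruncate fit together on a POSIX system, assuming the layout from the first post (an int count followed by n equal-sized records). drop_last_record and record_size are names made up for illustration, not code from this thread. On Windows the rough equivalents are _fileno and _chsize.

```c
#define _POSIX_C_SOURCE 200809L /* exposes fileno() and ftruncate() under strict -std=c99 */
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: removes the last record from a file laid out as
 * [int n][record 0]...[record n-1]. The file must be open for update ("r+b").
 * Returns 0 on success, -1 on failure. */
int drop_last_record(FILE *fp, size_t record_size)
{
    int n;

    assert(fp != NULL);

    /* Read the current record count from the header. */
    if (fseek(fp, 0L, SEEK_SET) != 0 || fread(&n, sizeof n, 1, fp) != 1 || n <= 0)
        return -1;
    n--;

    /* Rewrite the header with the new count. */
    if (fseek(fp, 0L, SEEK_SET) != 0 || fwrite(&n, sizeof n, 1, fp) != 1)
        return -1;

    /* Flush stdio's buffer before touching the descriptor directly. */
    if (fflush(fp) != 0)
        return -1;

    /* fileno() gives the descriptor behind the FILE*; ftruncate() cuts the file. */
    return ftruncate(fileno(fp), (off_t)(sizeof n + (size_t)n * record_size));
}
```

Called as drop_last_record(fp, sizeof(F_REP)) on a file opened with fopen(name, "r+b"), this would shrink the file by one struct, under the assumptions above.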


William Labbett
Member #4,486
March 2004

Excellent - got it working. Thanks.

Matthew Leverton
Supreme Loser
January 1999

Note that this is not portable:

MY_STRUCT foo;
al_fwrite(fp, &foo, sizeof(MY_STRUCT));

Different architectures could have different sizes of integers, etc. So while you can write and read back on the same machine, if you go from 32-bit to 64-bit or PPC to Intel, it might break.

William Labbett
Member #4,486
March 2004

I suppose the same applies to fwrite and fread.

Does that mean it might be an idea to check sizeof(int) et al at the beginning of the prog so as to abort if the architecture isn't suitable ?

Thomas Fjellstrom
Member #476
June 2000

No, just write integers of a known size to the file, in a known byte order.


Matthew Leverton
Supreme Loser
January 1999

If you are using Allegro, it has things like al_fwrite32le() to help.

William Labbett
Member #4,486
March 2004

So maybe I should switch to using ALLEGRO_FILE ? Hhhhmmmm. Just as I was getting to the end of coding my editor. Guess I'll have to go that extra mile.

Matthew Leverton
Supreme Loser
January 1999

Yes, and you should save each field independently:

for (i = 0; i < n; ++i)
{
  al_fwrite32le(fp, foo[i]->int1);
  al_fwrite32le(fp, foo[i]->int2);
// etc...
}

Then when you read, do it in the same order, using the read function.
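To make the "same order" point concrete, here is a rough sketch of what 32-bit little-endian helpers of this kind do, written against plain stdio so it stands on its own. write_u32le, read_u32le, POINT_REC and its fields are names invented for this example; with Allegro you would call al_fwrite32le() and al_fread32le() instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative record; not the struct from this thread. */
typedef struct { int32_t int1; int32_t int2; } POINT_REC;

/* Write v as 4 bytes, least significant first, whatever the host byte order. */
static int write_u32le(FILE *fp, uint32_t v)
{
    unsigned char b[4];
    b[0] = (unsigned char)(v & 0xFF);
    b[1] = (unsigned char)((v >> 8) & 0xFF);
    b[2] = (unsigned char)((v >> 16) & 0xFF);
    b[3] = (unsigned char)((v >> 24) & 0xFF);
    return fwrite(b, 1, 4, fp) == 4 ? 0 : -1;
}

/* Read 4 little-endian bytes back into a host integer. */
static int read_u32le(FILE *fp, uint32_t *out)
{
    unsigned char b[4];
    if (fread(b, 1, 4, fp) != 4)
        return -1;
    *out = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    return 0;
}

/* Save each field on its own, in a fixed order. */
int save_rec(FILE *fp, const POINT_REC *r)
{
    assert(fp != NULL && r != NULL);
    if (write_u32le(fp, (uint32_t)r->int1) != 0)
        return -1;
    return write_u32le(fp, (uint32_t)r->int2);
}

/* Load the fields in exactly the order they were saved. */
int load_rec(FILE *fp, POINT_REC *r)
{
    uint32_t a, b;
    assert(fp != NULL && r != NULL);
    if (read_u32le(fp, &a) != 0 || read_u32le(fp, &b) != 0)
        return -1;
    r->int1 = (int32_t)a;
    r->int2 = (int32_t)b;
    return 0;
}
```

Because every field goes through a fixed-width, fixed-byte-order helper, the bytes on disk no longer depend on sizeof(int), struct padding, or the host's endianness.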

Another advantage of this is that you can change your struct easily without invalidating all of your saved files. (Because your loader could check a version number stored in the file and adjust itself accordingly.)

William Labbett
Member #4,486
March 2004

Okay. Thanks :)

BTW, do I need that single integer at the beginning of the file which stores the number of structures in the file, or can I get that by getting the size of the file somehow and dividing it by the size of the structure ?

Matthew Leverton
Supreme Loser
January 1999

You could either have a number at the beginning or you could check for al_feof() and stop when you reach it.

William Labbett
Member #4,486
March 2004

The idea of the program being able to cope with different versions of the file sounds worth pursuing (I had that problem as this is the second version of the program I'm writing).

I suppose I'd read the file differently for different versions.

Then I'd store a variable in a config file which determines which version of the file to save. That way I could load older versions, and convert them to the up-to-date version.

Does that sound sensible ?

You could either have a number at the beginning or you could check for al_feof() and stop when you reach it.

Would I still be able to get the number of structures in the file that way ?

Matthew Leverton
Supreme Loser
January 1999

Would I still be able to get the number of structures in the file that way ?

If you needed to know up front (to create the memory), then it would be simplest to store the size as an integer at the beginning.

Quote:

Then I'd store a variable in a config file which determines which version of the file to save.

You really don't have to keep any details on what version to save. The program could always save the latest version. So the writing code could be simple.

However, the reading code may need to do something slightly different depending on which version it is reading. But normally it looks something like:

if (version > 10) {
  foo->x = al_fread32le(fp);
  foo->y = al_fread32le(fp);
}
else {
  foo->x = 0;
  foo->y = 0;
}
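Spelled out a bit more, a version-aware loader might look like the sketch below. The file layout (a version number, then an id, then x and y from version 11 onward), the THING struct and read_i32le are all assumptions invented for the example, and plain stdio stands in for al_fread32le() so it can be compiled on its own.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative record; x and y are assumed to have been added in version 11. */
typedef struct { int32_t id; int32_t x; int32_t y; } THING;

/* Read one 32-bit little-endian integer; returns 0 on success. */
static int read_i32le(FILE *fp, int32_t *out)
{
    unsigned char b[4];
    uint32_t v;
    if (fread(b, 1, 4, fp) != 4)
        return -1;
    v = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
        ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    *out = (int32_t)v;
    return 0;
}

/* Hypothetical layout: [version][id] and, for version > 10, [x][y]. */
int load_thing(FILE *fp, THING *t)
{
    int32_t version;

    assert(fp != NULL && t != NULL);

    if (read_i32le(fp, &version) != 0 || read_i32le(fp, &t->id) != 0)
        return -1;

    if (version > 10) {
        /* Newer files carry the extra fields. */
        if (read_i32le(fp, &t->x) != 0 || read_i32le(fp, &t->y) != 0)
            return -1;
    } else {
        /* Older files predate them, so fall back to defaults. */
        t->x = 0;
        t->y = 0;
    }
    return 0;
}
```

The writer never needs this branching: it always writes the newest version number and the full set of fields.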

William Labbett
Member #4,486
March 2004

Right. Thanks for all of that. It should keep me busy for a good while.

OnlineCop
Member #7,919
October 2006

How frequently do you plan to add/remove data from the file?

If infrequent, keep up what you're doing. If you think, however, that reading/rewriting will be a major portion of it, you could also weigh the computational hit of writing a kind of byte index at the beginning of your file.

custom_record
 1  3       // 3 records follow
 2  5       // offset of 1st record
 3  9       // offset of 2nd record
 4  11      // offset of 3rd record
 5  data1a
 6  data1b
 7  data1c
 8  data1d
 9  data2a
10  data2b
11  data3a
12  data3b
13  data3c
14  data3d
15  data3e

Then you could jump immediately to the data you need. This is a lot more complex, and if you don't do it right, you'll corrupt the entire file ("FAT table? I don't need this; let's erase it and see what... hey, what the..?"), so proceed carefully if you do want to do this.

Advantages: Faster initial seek time to get to the record you want
Disadvantages: Remove a record in the middle, and you've essentially got to load the entire index, modify it (correctly) after the record is removed, and write it back out to the file on save (where you MAY have to shift ALL of your elements up several bytes if you remove 1+ indices). But reads would be faster...

William Labbett
Member #4,486
March 2004

I know I'll be reading, writing and rewriting a lot, but I think I'll be okay without the byte index since all the structures are the same size, so I just need to al_fseek to the start of the struct when reading/writing it.

Does that sound okay ?
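For reference, the seek arithmetic for equal-sized records is sketched below with plain stdio. record_offset and seek_to_record are hypothetical names; with Allegro the same offset would be passed to al_fseek(fp, offset, ALLEGRO_SEEK_SET). Note that record_size has to be the size of a record as stored on disk, which is only sizeof(F_REP) while whole structs are being written as-is.

```c
#include <assert.h>
#include <stdio.h>

/* Byte offset of record `index` in a file laid out as
 * [header_size bytes][record 0][record 1]... with equal-sized records. */
long record_offset(long header_size, long record_size, long index)
{
    assert(header_size >= 0 && record_size > 0 && index >= 0);
    return header_size + index * record_size;
}

/* Position fp at the start of record `index`; returns 0 on success. */
int seek_to_record(FILE *fp, long header_size, long record_size, long index)
{
    assert(fp != NULL);
    return fseek(fp, record_offset(header_size, record_size, index), SEEK_SET);
}
```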

/*** EDIT *****/

I've got another query hopefully someone can help with :

I was thinking that since I'm only going to be using my file editor on my machine it won't matter if the size of types is different on another, so I thought maybe I didn't need to change the code.

Then I thought, what about when the game runs on another machine ?

My struct looks like this :

struct F_REP
{
  char png_filename[ MAX_LENGTH_FOR_FRD_STRING ];

  int type;

  int squares_wide_on_map;
  int squares_high_on_map;

  int x_offset; /* the x offset from the top left square indicating where the feature is positioned */
  int y_offset;

  int num_layers_or_frames;

  int ticks_per_frame;

  int line_in_animation_specs_file;

  int altitude;

  int feature_has_shadow;

  int shadow_x_offset;
  int shadow_y_offset;

  int feature_has_collision_map;

  int c_x_offset;
  int c_y_offset;

  int has_mask;
};

Could this potentially cause problems when

I do

fread( *array_of_F_REP, sizeof(F_REP), num_reps, file);

?

So what I'm really asking is Do I really need to rewrite the code ?

bamccaig
Member #7,536
July 2006

The size of the int data type varies between compilers and architectures. That means that sizeof(F_REP) will also vary. If the files are written on a machine where int is 4 bytes, but read on a machine where int is 8 bytes, then you'll read incorrectly and the data will all be wrong. Additionally, the byte order of multi-byte types varies from architecture to architecture. On some machines, bytes appear as you would expect: ABCD. On others, the bytes are reversed: DCBA. Again, if you write on one and read on another then the data will be wrong.

This is the same problem seen with network programming. If you write a networked program that you want to be able to speak to any computer of any architecture then you need to define a protocol to communicate multi-byte data with. The common convention is to send multi-byte values as Big Endian (most significant byte first), which on a little-endian machine means the byte stored last in memory goes out first. This is referred to as network byte order and is a good way to store data in the file system as well. The sender/writer needs to convert multi-byte data from host byte order (whichever byte order the machine uses) to network byte order. The receiver/reader needs to convert from network byte order to host byte order. There is a set of functions in Berkeley Sockets and WinSock to do this (htons, htonl, ntohs and ntohl). Presumably, that is exactly what Allegro's functions do as well (I imagine that they are just wrappers over the network library calls).
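As a small illustration of the host/network conversion described above, here is a sketch using htonl and ntohl from the POSIX sockets headers (WinSock declares the same functions in <winsock2.h>). put_u32_net and get_u32_net are names invented for this example. Nothing in this thread confirms how Allegro implements its own helpers, and al_fwrite32le() stores little-endian rather than network order, so treat this as the sockets-side equivalent only.

```c
#include <arpa/inet.h> /* htonl, ntohl */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store v into out[0..3] in network byte order (most significant byte first). */
void put_u32_net(unsigned char out[4], uint32_t v)
{
    uint32_t net = htonl(v); /* host order -> network order */
    assert(out != NULL);
    memcpy(out, &net, sizeof net);
}

/* Rebuild a host-order value from 4 bytes stored in network byte order. */
uint32_t get_u32_net(const unsigned char in[4])
{
    uint32_t net;
    assert(in != NULL);
    memcpy(&net, in, sizeof net);
    return ntohl(net); /* network order -> host order */
}
```

The byte buffer produced this way is the same on a big-endian and a little-endian host, which is what makes it safe to write to a shared file or socket.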

Note that if the files are only ever created and read on the same machine (never shared between machines) then you don't technically need to care about storage format because the format won't change for a given machine (unless you change something about the machine, like the compiler or operating system architecture).

Matthew Leverton
Supreme Loser
January 1999

So what I'm really asking is Do I really need to rewrite the code ?

Yes, you do. (Or at least, it would be easier to do that than to manage different data files for 32-bit and 64-bit.)

It should be pretty simple, since you mostly have integer data. Just use the functions I already mentioned. You can use al_fwrite to write out png_filename.

Oscar Giner
Member #2,207
April 2002

bamccaig said:

Note that if the files are only ever created and read on the same machine (never shared between machines) then you don't technically need to care about storage format because the format won't change for a given machine (unless you change something about the machine, like the compiler or operating system architecture).

Actually it's a compiler-only thing, not OS. So distributing an executable wouldn't be any problem since it will do the same on any computer able to run it. The problem comes when you distribute the source code and people compile it with different compilers (or compiler versions) and/or for different architectures.

bamccaig
Member #7,536
July 2006

Matthew Leverton
Supreme Loser
January 1999

Actually it's a compiler-only thing, not OS.

Correct ... and if you have any plans to ever distribute your program, you might as well just do it properly from the beginning. It's not that much more work, and it avoids the problem of incompatible data files in the future. And as already mentioned, saving the entire struct as-is makes it more difficult to change the struct.
