Allegro.cc - Online Community

PDF check
Thomas Harte
Member #33
April 2000

A slightly tenuous entry into the realm of "General discussion for any programming topic", my question is: is my PDF writing code working correctly? There are lots of little caveats to the PDF file format (at least when you start dumping typefaces into it), any one of which I may have messed up.

Sadly, there seems to be no PDF equivalent to the W3C HTML validator. I'm therefore curious to know how your PDF renderer displays my file.

The attached .zip file is 688 KB. The PDF it contains is 35.2 MB. I'm currently using an extremely verbose method to describe my fonts, and I'm not using any of the various options for compressing PDF data streams. I'm not that bothered about that for now; it's a lower-level problem than I'm currently concerned with. I'm also failing to add a record that explains how the glyphs in my typefaces map onto Unicode, so copy and paste almost certainly won't work.
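For anyone wondering what the stream compression I'm skipping would involve, here is a minimal sketch, assuming the standard /FlateDecode filter (plain zlib/deflate). The helper name `flate_stream` is made up for illustration and is not what my writer actually does.

```python
import zlib


def flate_stream(payload: bytes) -> bytes:
    """Wrap raw content-stream bytes as a Flate-compressed PDF stream body.

    Hypothetical helper: returns the stream dictionary plus the
    'stream ... endstream' section, without the enclosing 'N 0 obj' wrapper.
    """
    compressed = zlib.compress(payload, 9)
    # /Length must be the size of the compressed data, not the original.
    header = b"<< /Length %d /Filter /FlateDecode >>\n" % len(compressed)
    return header + b"stream\n" + compressed + b"\nendstream"


if __name__ == "__main__":
    # Highly repetitive operator text, as a dot-per-path font would produce.
    raw = b"q 1 0 0 1 0 0 cm /Dot Do Q\n" * 1000
    print(len(raw), "bytes raw ->", len(flate_stream(raw)), "bytes as a stream")
```

Repetitive operator text like mine should shrink a lot this way, which is consistent with the 35.2 MB file zipping down to 688 KB.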

So far I've only tested the built-in PDF support of OS X, since it's all I have access to here. If people with access to other renderers (Acrobat Reader, Foxit, KPDF, whatever) would be willing to download my PDF and check that it matches the following two hastily-grabbed reference renderings then I'll be very grateful when next I awake.

Thanks in advance to anyone that helps out!

http://www.allegro.cc/files/attachment/596424
http://www.allegro.cc/files/attachment/596423

lambik
Member #899
January 2001

Looks like this for me in Reader (45% zoom to fit 1 page on screen for the screenshot). I'd say it's pretty much the same.
http://www.allegro.cc/files/attachment/596426
http://www.allegro.cc/files/attachment/596427

SiegeLord
Member #7,827
October 2006

It looks just like the reference rendering in Evince, pixel for pixel.

"For in much wisdom is much grief: and he that increases knowledge increases sorrow."-Ecclesiastes 1:18
[SiegeLord's Abode][Codes]:[DAllegro5]:[RustAllegro]

Thomas Fjellstrom
Member #476
June 2000

It won't open in KPDF at all, but the new kpdf (Okular) gives me:

http://www.allegro.cc/files/attachment/596428

http://www.allegro.cc/files/attachment/596429

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Matthew Leverton
Supreme Loser
January 1999

It looks the same as the reference pictures in Foxit Reader for Windows. The scrolling is somewhat laggy.

GullRaDriel
Member #3,861
September 2003

Same as the reference picture too, FoxitReader, but I don't see the lagging scrolling.

"Code is like shit - it only smells if it is not yours"
Allegro Wiki, full of examples and articles !!

tobing
Member #5,213
November 2004

Looks fine with Adobe Reader here.

Thomas Harte
Member #33
April 2000

Thanks to everyone for testing!

Quote:

It won't open in KPDF at all

That's peculiar. Does KPDF dump anything helpful to stderr, or indeed anywhere else helpful? I'll try to get a Linux VM going and check this out.

Quote:

but the new kpdf (Okular) gives me:

Also odd that the first page has a grey background. Though I'm not explicitly setting paper colour (or ink colour), so I'll look into that.

Quote:

It looks the same as the reference pictures in Foxit Reader for Windows. The scrolling is somewhat laggy.

I think I'm just paying the price for using Type 3 fonts. As you may or may not be aware, Adobe defined two interesting types of PostScript font, Type 1 and Type 3. Type 3 is pretty much just providing a PostScript description of each glyph, which is really easy and convenient and is what I'm doing. Type 1 uses a subset of PostScript and an optimised renderer but was originally meant to be proprietary, so the main part is encrypted and everything is much more strict. When TrueType came along Adobe abandoned the idea of keeping Type 1 fonts to themselves, but they just did that by releasing the encryption codes. So there'd be another layer of complexity to implementing Type 1 fonts, and with the very limited verification tools available to me it's not something I felt was likely to work first time.

That and the fact that each one of my dots is a separate PostScript path probably kill the rendering speed - each dot has to be tokenised separately and the pixel form can't be cached, though if it isn't caching Type 3 glyphs then I'm not optimistic that there's any part of the PostScript path with caching enabled.

I think my plan is just to do as much as I can with Type 3 (in speed and filesize terms) and then call it a day, as it achieves all of the main functionality goals and I'll bet nobody ever uses the "print to PDF" feature anyway.
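For reference, the Type 1 "encryption" layer mentioned above is small in isolation. What follows is a sketch of the eexec cipher as I understand it from Adobe's published Type 1 font format (initial key 55665, multiplier 52845, increment 22719, four leading bytes discarded). The function names are mine, and this is nowhere near a full Type 1 writer: the charstrings inside use the same scheme with a different key and have their own strictness rules.

```python
C1, C2 = 52845, 22719  # constants from the published Type 1 format
EEXEC_KEY = 55665      # initial key for the eexec section


def eexec_encrypt(plain: bytes, key: int = EEXEC_KEY,
                  prefix: bytes = b"\0\0\0\0") -> bytes:
    """Encrypt `plain`, preceded by four throwaway bytes."""
    r = key
    out = bytearray()
    for p in prefix + plain:
        c = p ^ (r >> 8)
        out.append(c)
        r = ((c + r) * C1 + C2) & 0xFFFF  # key evolves from the ciphertext
    return bytes(out)


def eexec_decrypt(cipher: bytes, key: int = EEXEC_KEY, skip: int = 4) -> bytes:
    """Decrypt and drop the first `skip` bytes of the result."""
    r = key
    out = bytearray()
    for c in cipher:
        out.append(c ^ (r >> 8))
        r = ((c + r) * C1 + C2) & 0xFFFF
    return bytes(out[skip:])


if __name__ == "__main__":
    secret = eexec_encrypt(b"/Private 8 dict dup begin")
    print(eexec_decrypt(secret))
```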

EDIT:
Well, I'm at work now and Acrobat Reader makes a surprisingly good job of the whole thing. Even copy and paste mostly works; I guess it's assuming something close to ASCII, which is not too far off the characters actually encoded in that file.

EDIT2:
On that line of thought, this PDF file is encoded as pure ASCII if anybody is that interested to see what they can look like inside before various compression filters are applied.

Thomas Fjellstrom
Member #476
June 2000

Quote:

Also odd that the first page has a grey background. Though I'm not explicitly setting paper colour (or ink colour), so I'll look into that.

I think that page has NO background at all, my post is colored grey, so the bg would show through on a transparent png.

Okular prints a lot of stuff when loading, like:

Error: Could not parse ligature component "Font59330" of "Font59330_255" in parseCharName
Error: Could not parse ligature component "Font29665" of "Font29665_0" in parseCharName

KPDF prints nothing at all; I'd probably have to install the debug version to get anything. And I can't install it without Debian downgrading my KDE 4.1 install to 3.5.9.

Thomas Harte
Member #33
April 2000

Quote:

Okular prints a lot of stuff when loading, like:

Oh, good tip! I guess I'll grab a KDE4 distro. OS X's Preview.app dumps a few helpful things onto the console when I've made a large enough error that it can't really figure out how to display much at all, but otherwise it isn't proving to be a particularly helpful diagnostic tool.

Thomas Fjellstrom
Member #476
June 2000

You'll probably need to install the debug versions to see any of that though.

Thomas Harte
Member #33
April 2000

Cool, I'll keep that in mind when I look into it this evening. I might try hopping on the "KDE for OS X" betas before actually grabbing a whole Linux distro.

Re: my problem, it seems that ligatures are defined by the font encoding, i.e. the same place where I should be saying how my internal codes map to Unicode but currently am not. I guess the default encoding on your platform expects some of the common ligatures such as 'fi' and, obviously, doesn't find them.

EDIT: much improved PDF attached. My attempts to download the beta OS X port of KDE 4 and the kdegraphics package are being thwarted because kde.org only provides torrents and there are currently 0 peers offering me kdegraphics. Though I did manage to get most of the prereqs.

EDIT2: my previous edit was the subject of a misupload, a fixed file is now attached.

Thomas Fjellstrom
Member #476
June 2000

kpdf still isn't working, and there are 3 pages now in Okular, with the following (new) error: Error: Illegal entry in bfrange block in ToUnicode CMap

Thomas Harte
Member #33
April 2000

Oh, yeah, I changed the paper size from [the printable area on] US legal to [the printable area on] A4 so that I could print the thing slightly more realistically. Another change is that the emphasised double width font should be visually correct now.

Can't imagine what I've got wrong in the bfrange. It's been pointed out to me on another forum that I've typoed the %%EOF marker (as just %EOF), which is probably what's leading to KPDF and KGhostscript refusing to do anything at all. The first place a PDF reader looks in a PDF is right at the end, so having an incorrect marker there means that my PDF trivially isn't valid. I guess many of the readers are just scanning backwards for the first useful token rather than bothering to explicitly check for the %%EOF.
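A quick way to catch this sort of slip is to inspect the tail of the file the way a strict reader might. The sketch below is an assumption about what such a check looks like, not any particular viewer's logic: it looks for `%%EOF` and for the `startxref` offset in the last kilobyte. The function name and the 1024-byte window are my own choices.

```python
def check_pdf_tail(data: bytes, window: int = 1024) -> dict:
    """Report whether the end of a PDF has the markers a reader starts from."""
    tail = data[-window:]
    has_eof = b"%%EOF" in tail

    # The byte offset of the cross-reference table follows 'startxref'.
    offset = None
    idx = tail.rfind(b"startxref")
    if idx != -1:
        parts = tail[idx + len(b"startxref"):].split()
        if parts and parts[0].isdigit():
            offset = int(parts[0])

    return {"has_eof_marker": has_eof, "startxref_offset": offset}


if __name__ == "__main__":
    good = b"%PDF-1.4\n% objects omitted\nstartxref\n1234\n%%EOF\n"
    bad = b"%PDF-1.4\n% objects omitted\nstartxref\n1234\n%EOF\n"
    print(check_pdf_tail(good))
    print(check_pdf_tail(bad))
```

A lenient reader that only scans backwards for `startxref` would accept both inputs, which would fit the behaviour seen so far.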

SiegeLord
Member #7,827
October 2006

The EPO_-_fixed.pdf still works in Evince now. I tried it in Kpdf, and it worked for me also. KGhostView couldn't load it.

"For in much wisdom is much grief: and he that increases knowledge increases sorrow."-Ecclesiastes 1:18
[SiegeLord's Abode][Codes]:[DAllegro5]:[RustAllegro]

Thomas Harte
Member #33
April 2000

Right. Please try the latest attachment.

It has been tested against:

  • Mac OS X v10.5's Preview.app

  • Acrobat Reader 8 for Windows

  • Foxit Reader 2.3

  • the ABC Amber PDF validator

  • the Multivalent PDF validator

Re: my comments earlier, those validators are literally just validators; they return either valid or invalid and nothing else particularly helpful.

Vanneto
Member #8,643
May 2007

Looks good in Acrobat Reader 8.

In capitalist America bank robs you.

SiegeLord
Member #7,827
October 2006

Works for me in Evince, KPDF and KGhostView. Incidentally, the page rendering speed is very slow... not sure if that's to be expected given custom glyphs.

"For in much wisdom is much grief: and he that increases knowledge increases sorrow."-Ecclesiastes 1:18
[SiegeLord's Abode][Codes]:[DAllegro5]:[RustAllegro]

Thomas Harte
Member #33
April 2000

Quote:

Incidentally, the page rendering speed is very slow... not sure if that's to be expected given custom glyphs.

They're not just custom glyphs, they're painfully poorly encoded glyphs. I have one PostScript function to draw a circle, then I just push the graphics transformation matrix, set it to a new location, call the circle function, pop the old matrix, ad nauseam for every glyph. Worse than that, the various typefaces that are just manipulations of others (e.g. emphasised mode is just the same glyphs, but printed twice, the second time a little to the right) are all encoded separately. Which wastes space and processing time.

Once I'm confident that my low-level PDF code is good, I'm going to attend to that. Having the various base glyphs in their own separate PostScript functions and just calling them as many times and at as many offsets as necessary in my actual font will save a lot of space and should make things a lot faster. Type 3 fonts are basically unused for any commercial purpose, so I guess most PDF viewer authors haven't bothered to do anything more than make sure they work.

Conversely, normal PostScript forms seem to be universally cached. I think that's actually the reason that some of the PDF viewers display my pages as blank or otherwise weird when you zoom too far out; because they're built from a cached one-point diameter circle, sampling errors are multiplied massively.
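To make the glyph encoding described above concrete, here is a rough sketch of the operator text involved, assuming the dot is a form XObject invoked with `Do`. The resource names `Dot` and `GlyphA` and both function names are hypothetical, and a real Type 3 glyph procedure would also need its `d0`/`d1` width operator first, which is omitted here.

```python
from typing import Iterable, Tuple


def dot_matrix_ops(dots: Iterable[Tuple[float, float]],
                   dot_form: str = "Dot") -> str:
    """Current approach: one save/translate/draw/restore group per dot."""
    return "\n".join(
        f"q 1 0 0 1 {x:g} {y:g} cm /{dot_form} Do Q" for x, y in dots
    )


def emphasised_ops(base_form: str, shift: float) -> str:
    """Planned approach: reuse a base glyph form, drawn twice.

    The second copy is shifted `shift` units to the right, which is how
    emphasised mode is described above.
    """
    return f"/{base_form} Do\nq 1 0 0 1 {shift:g} 0 cm /{base_form} Do Q"


if __name__ == "__main__":
    print(dot_matrix_ops([(0, 0), (0, 1), (1, 2)]))
    print(emphasised_ops("GlyphA", 0.5))
```

With the second form, an emphasised glyph costs two short lines regardless of how many dots the base glyph contains.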

Anyway, I'm still concerned about my ToUnicode map. I've tweaked it a little, but it's probably still wrong. I just can't spot why.
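For comparison, this is roughly the shape I believe a minimal ToUnicode CMap for a single-byte font should take. It is a sketch under assumptions: one-byte character codes, each range mapping to consecutive UTF-16 code units, and blocks capped at 100 entries. The function name and the example range are illustrative, not what my file currently contains.

```python
from typing import List, Tuple


def to_unicode_cmap(ranges: List[Tuple[int, int, int]]) -> str:
    """Build a ToUnicode CMap from (first_code, last_code, first_unicode)."""
    for lo, hi, _ in ranges:
        if not 0 <= lo <= hi <= 0xFF:
            raise ValueError(f"bad single-byte range: {lo:#x}..{hi:#x}")

    lines = [
        "/CIDInit /ProcSet findresource begin",
        "12 dict begin",
        "begincmap",
        "/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def",
        "/CMapName /Adobe-Identity-UCS def",
        "/CMapType 2 def",
        "1 begincodespacerange",
        "<00> <FF>",  # source codes are one byte wide
        "endcodespacerange",
    ]
    # Each bfrange block declares its own entry count up front.
    for start in range(0, len(ranges), 100):
        block = ranges[start:start + 100]
        lines.append(f"{len(block)} beginbfrange")
        for lo, hi, uni in block:
            # Source codes must match the codespace width (2 hex digits here).
            lines.append(f"<{lo:02X}> <{hi:02X}> <{uni:04X}>")
        lines.append("endbfrange")
    lines += ["endcmap", "CMapName currentdict /CMap defineresource pop",
              "end", "end"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Printable ASCII mapped straight through.
    print(to_unicode_cmap([(0x20, 0x7E, 0x0020)]))
```

Things worth checking against output like this: the count before `beginbfrange` matches the number of entries, and the source codes have the same byte width as the codespace range.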

Thomas Fjellstrom
Member #476
June 2000

Ok, works in kpdf just fine (I might have loaded the wrong file last time I tested, seems konqueror saved this one, and probably the last one with _s instead of spaces which the first copy used). And okular prints a ton of these still:
Error: Could not parse ligature component "Font118656" of "Font118656_255" in parseCharName

Neil Walker
Member #210
April 2000

Tried it in Adobe 7.1 professional, 8.1 reader. Both work fine, though are painfully slow.

Tried it with Adobe Designer 7 (the pdf design/creation tool) and it failed miserably, but that's probably ok.

Though why are you going to such efforts to support an ancient printer?

Neil.
MAME Cabinet Blog / AXL LIBRARY (a games framework) / AXL Documentation and Tutorial

wii:0356-1384-6687-2022, kart:3308-4806-6002. XBOX:chucklepie

Thomas Harte
Member #33
April 2000

Quote:

okular prints a ton of these still:
Error: Could not parse ligature component "Font118656" of "Font118656_255" in parseCharName

Hmmm. I really can't find anything in the PDF spec to explain this - I guess I must be failing to grasp some concept or another.

Quote:

Though why are you going to such efforts to support an ancient printer?

I always go to ridiculous efforts with my emulator. Based on the best information I have available, there's not a single deliberate compromise in there; bus contention is correct, all display data is collected at exactly the right time (most emulators just sort of do an instantaneous fetch of all the data on one scanline at the end of the scanline, and round all palette changes to that position), floppy disks are emulated internally as a sequence of magnetic polarity changes on the platter (actually because that's the only way to fully support one of the disk image formats I support) and tapes are similarly decomposed to audio waves and then those are interpreted (though that's partly because I used to support direct audio in, and in any case there are many different ways to encode the same tape and I don't think it should be the capture program's duty to guess how an emulator might like it formatted), interrupts are tested and occur for exactly the correct periods, etc, etc.

I have a relatively unique design in that the CPU is subservient to the thing actually managing the emulator and can be paused at any point, even mid-opcode.
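As a toy illustration of that idea only (this is not my emulator's code; the two-cycle CPU and every name here are invented), a generator that yields once per bus cycle lets the managing machine stop the CPU between any two cycles, including halfway through an instruction:

```python
from typing import Iterator, List, Tuple

BusEvent = Tuple[str, int]


def toy_cpu(memory: List[int]) -> Iterator[BusEvent]:
    """Hypothetical CPU whose every instruction takes two bus cycles."""
    pc = 0
    while pc + 1 < len(memory):
        yield ("fetch_opcode", pc)       # cycle 1
        yield ("fetch_operand", pc + 1)  # cycle 2
        pc += 2


def run_machine(memory: List[int], cycles: int) -> List[Tuple[int, BusEvent]]:
    """The machine owns the clock and pulls exactly one CPU cycle at a time."""
    cpu = toy_cpu(memory)
    log = []
    for cycle in range(cycles):
        try:
            event = next(cpu)
        except StopIteration:
            break
        # Video, disk and so on could be advanced here, between CPU cycles.
        log.append((cycle, event))
    return log


if __name__ == "__main__":
    # Stopping after 3 cycles leaves the CPU paused mid-instruction.
    for entry in run_machine([1, 5, 1, 7], 3):
        print(entry)
```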

Counterintuitively, mine is also a very speedy emulator because being rigorous allows it to take a very broad overview of how the machine is being used and apply processing optimisations accordingly.

I have supported a simple FX-80 control code to RTF filter for a while, but that was really only good for mimicking some elements of text formatting (and it's not very accurate at that - things like backspaces and reverse line feeds have to be ignored). Recently someone requested a sufficiently accurate emulator to test their screen dump routines, so I just sort of set lofty aims from there. Anything less than actually emulating the print head is just a hack and won't necessarily work with all software.

Indeterminatus
Member #737
November 2000

Looking just as expected in PDF Complete.

_______________________________
Indeterminatus. [Atomic Butcher]
si tacuisses, philosophus mansisses

gnolam
Member #2,030
March 2002

Looks decent in Sumatra PDF.

--
Move to the Democratic People's Republic of Vivendi Universal (formerly known as Sweden) - officially democracy- and privacy-free since 2008-06-18!

Ron Novy
Member #6,982
March 2006

Hey.. The PDF looks fine and all, but ever since I opened your PDF file I get a program that locks up during shutdown called 'Font Capture'. I'm not sure if it's related to your PDF, but I thought you should know... I guess you are supposed to clean the temp folder to fix it but haven't tried it yet...

----
Oh... Bieber! I thought everyone was chanting Beaver... Now it doesn't make any sense at all. :-/
