PDF check

Thomas Harte

Member #33

April 2000

This is a slightly tenuous entry into the realm of "General discussion for any programming topic", but my question is: is my PDF writing code working correctly? There are lots of little caveats to the PDF file format (at least when you start dumping typefaces into it), any one of which I may have messed up.

Sadly, there seems to be no PDF equivalent to the W3C HTML validator. I'm therefore curious to know how your PDF renderer displays my file.

The attached .zip file is 688 KB. The PDF it contains is 35.2 MB. I'm currently using an extremely verbose method to describe my fonts, and I'm not using any of the various options for compressing PDF data streams. I'm not that bothered about that for now; it's a lower-level problem than I'm currently concerned with. I'm also failing to add a record that explains how the glyphs in my typefaces map onto Unicode, so copying and pasting almost certainly won't work.

So far I've only tested the built-in PDF support of OS X, since it's all I have access to here. If people with access to other renderers (Acrobat Reader, Foxit, KPDF, whatever) would be willing to download my PDF and check that it matches the following two hastily-grabbed reference renderings then I'll be very grateful when next I awake.

Thanks in advance to anyone that helps out!

http://www.allegro.cc/files/attachment/596424
http://www.allegro.cc/files/attachment/596423

[My site] [Tetrominoes]

lambik

Member #899

January 2001

Looks like this for me in Reader (45% zoom to fit one page on screen for the screenshot). I'd say it's pretty much the same.
http://www.allegro.cc/files/attachment/596426
http://www.allegro.cc/files/attachment/596427

SiegeLord

Member #7,827

October 2006

It looks just like the reference rendering in Evince, pixel for pixel.

"For in much wisdom is much grief: and he that increases knowledge increases sorrow."-Ecclesiastes 1:18
[SiegeLord's Abode][Codes]:[DAllegro5]:[RustAllegro]

Thomas Fjellstrom

Member #476

June 2000

It won't open in KPDF at all, but the new kpdf (Okular) gives me:

http://www.allegro.cc/files/attachment/596428

http://www.allegro.cc/files/attachment/596429

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Matthew Leverton

Supreme Loser

January 1999

It looks the same as the reference pictures in Foxit Reader for Windows. The scrolling is somewhat laggy.

--
RTFM | Follow Me on Google+ | I know 10 people

GullRaDriel

Member #3,861

September 2003

Same as the reference picture too, FoxitReader, but I don't see the lagging scrolling.

"Code is like shit - it only smells if it is not yours"
Allegro Wiki, full of examples and articles !!

tobing

Member #5,213

November 2004

Looks fine with Adobe Reader here.

http://www.villages-and-cities.de

Thomas Harte

Member #33

April 2000

Thanks to everyone for testing!

Quote:

It won't open in KPDF at all

That's peculiar. Does KPDF dump anything helpful to stderr or indeed to anywhere else helpful? I'll try to get a Linux VM going and check this out.

Quote:

but the new kpdf (Okular) gives me:

Also odd that the first page has a grey background. Though I'm not explicitly setting paper colour (or ink colour), so I'll look into that.

Quote:

It looks the same as the reference pictures in Foxit Reader for Windows. The scrolling is somewhat laggy.

I think I'm just paying the price for using Type 3 fonts. As you may or may not be aware, Adobe defined two interesting types of PostScript font, Type 1 and Type 3. Type 3 is pretty much just providing a PostScript description of each glyph, which is really easy and convenient and is what I'm doing. Type 1 uses a subset of PostScript and an optimised renderer but was originally meant to be proprietary, so the main part is encrypted and everything is much more strict. When TrueType came along Adobe abandoned the idea of keeping Type 1 fonts to themselves, but they just did that by releasing the encryption codes. So there'd be another layer of complexity to implementing Type 1 fonts, and with the very limited verification tools available to me it's not something I felt was likely to work first time.

That and the fact that each one of my dots is a separate PostScript path probably kill the rendering speed: each dot has to be tokenised separately and the pixel form can't be cached, though if it isn't caching Type 3 glyphs then I'm not optimistic that there's any part of the PostScript path with caching enabled.

I think my plan is just to do as much as I can with Type 3 (in speed and filesize terms) and then call it a day, as it achieves all of the main functionality goals and I'll bet nobody ever uses the "print to PDF" feature anyway.

EDIT:
Well, I'm at work now and Acrobat Reader makes a surprisingly good job of the whole thing. Even copy and paste mostly works; I guess it's assuming something close to ASCII, which is not too far off the characters actually encoded in that file.

EDIT2:
On that line of thought, this PDF file is encoded as pure ASCII if anybody is that interested to see what they can look like inside before various compression filters are applied.

[My site] [Tetrominoes]

Thomas Fjellstrom

Member #476

June 2000

Quote:

Also odd that the first page has a grey background. Though I'm not explicitly setting paper colour (or ink colour), so I'll look into that.

I think that page has NO background at all. My post is colored grey, so the bg would show through on a transparent png.

Okular prints a lot of stuff when loading, like:

Error: Could not parse ligature component "Font59330" of "Font59330_255" in parseCharName
Error: Could not parse ligature component "Font29665" of "Font29665_0" in parseCharName

And KPDF has no printout; I'd most likely have to install the debug version to get any. And I can't install it without Debian downgrading my KDE 4.1 install to 3.5.9.

Thomas Harte

Member #33

April 2000

Quote:

Okular prints a lot of stuff when loading, like:

Oh, good tip! I guess I'll grab a KDE4 distro. OS X's Preview.app dumps a few helpful things onto the console when I've made an error large enough that it can't really figure out how to display much at all, but otherwise isn't proving to be a particularly helpful diagnostic tool.

[My site] [Tetrominoes]

Thomas Fjellstrom

Member #476

June 2000

You'll probably need to install the debug versions to see any of that though.

Thomas Harte

Member #33

April 2000

Cool, I'll keep that in mind when I look into it this evening. I might try hopping on the "KDE for OS X" betas before actually grabbing a whole Linux distro.

Re: my problem, it seems that ligatures are defined by the font encoding, i.e. the same place where I should be saying how my internal codes map to Unicode but currently am not. I guess the default encoding on your platform expects some of the common ligatures such as 'fi' and, obviously, doesn't find them.

EDIT: much improved PDF attached. My attempts to download the beta OS X port of KDE 4 and the kdegraphics package are being thwarted because kde.org only provides torrents and there are currently 0 peers offering me kdegraphics. Though I did manage to get most of the prereqs.

EDIT2: my previous edit was the subject of a misupload, a fixed file is now attached.

[My site] [Tetrominoes]

Thomas Fjellstrom

Member #476

June 2000

kpdf still isn't working and there are 3 pages now in Okular, with the following (new) error: Error: Illegal entry in bfrange block in ToUnicode CMap

Thomas Harte

Member #33

April 2000

Oh, yeah, I changed the paper size from [the printable area on] US legal to [the printable area on] A4 so that I could print the thing slightly more realistically. Another change is that the emphasised double width font should be visually correct now.

Can't imagine what I've got wrong in the bfrange. It's been pointed out to me on another forum that I've typo'd the %%EOF marker (to just %EOF), which is probably what's leading to KPDF & KGhostscript's refusal to do anything at all. The first place a PDF reader looks in a PDF is right at the end, so having an incorrect marker there means that my PDF trivially isn't valid. I guess many of the readers are just scanning backwards for the first useful token rather than bothering to explicitly check for the %%EOF.
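
A minimal sketch of that end-of-file check, assuming a strict reader that insists on the `%%EOF` marker and a `startxref` keyword near the end of the file. The function name `check_trailer` and the 1024-byte window are illustrative choices, not taken from any particular viewer.

```python
import re


def check_trailer(data: bytes, window: int = 1024) -> dict:
    """Inspect the tail of a PDF the way a strict reader might."""
    tail = data[-window:]
    # A conforming file ends with "%%EOF", optionally followed by whitespace.
    has_eof = re.search(rb"%%EOF\s*$", tail) is not None
    # "startxref" is followed by the byte offset of the cross-reference table.
    # Keep the last occurrence, since that is the one a reader would use.
    match = None
    for match in re.finditer(rb"startxref\s+(\d+)", tail):
        pass
    offset = int(match.group(1)) if match else None
    return {"has_eof": has_eof, "startxref": offset}


# Two made-up file tails: one with the correct marker, one with the typo.
good = b"%PDF-1.4\n...\nstartxref\n1234\n%%EOF\n"
bad = b"%PDF-1.4\n...\nstartxref\n1234\n%EOF\n"
print(check_trailer(good))
print(check_trailer(bad))
```

A lenient reader could ignore `has_eof` and carry on whenever `startxref` is found, which would be consistent with some viewers opening the file and others refusing it.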

[My site] [Tetrominoes]

SiegeLord

Member #7,827

October 2006

The EPO_-_fixed.pdf still works in Evince now. I tried it in Kpdf, and it worked for me also. KGhostView couldn't load it.

"For in much wisdom is much grief: and he that increases knowledge increases sorrow."-Ecclesiastes 1:18
[SiegeLord's Abode][Codes]:[DAllegro5]:[RustAllegro]

Thomas Harte

Member #33

April 2000

Right. Please try the latest attachment.

It has been tested against:

Mac OS X v10.5's Preview.app
Acrobat Reader 8 for Windows
Foxit Reader 2.3
the ABC Amber PDF validator
the Multivalent PDF validator

Re: my comments earlier, those validators are literally just validators; they return either valid or invalid and nothing else particularly helpful.

[My site] [Tetrominoes]

Vanneto

Member #8,643

May 2007

Looks good in Acrobat Reader 8.

In capitalist America bank robs you.

SiegeLord

Member #7,827

October 2006

Works for me in Evince, KPDF and KGhostView. Incidentally, the page rendering speed is very slow... not sure if that's to be expected given custom glyphs.

"For in much wisdom is much grief: and he that increases knowledge increases sorrow."-Ecclesiastes 1:18
[SiegeLord's Abode][Codes]:[DAllegro5]:[RustAllegro]

Thomas Harte

Member #33

April 2000

Quote:

Incidentally, the page rendering speed is very slow... not sure if that's to be expected given custom glyphs.

They're not just custom glyphs, they're painfully poorly encoded glyphs. I have one PostScript function to draw a circle, then I just push the graphics transformation matrix, set it to a new location, call the circle function, pop the old matrix, ad nauseam for every glyph. Worse than that, the various typefaces that are just manipulations of others (e.g. emphasised mode is just the same glyphs, but printed twice, the second time a little to the right) are all encoded separately, which wastes space and processing time.
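
The push/translate/draw/pop pattern described above might look like this when generating a glyph's content stream. This is a sketch only: it assumes the circle is a Form XObject registered under the hypothetical resource name `/Dot`, that a glyph is stored as a list of (x, y) dot positions, and that the emphasised offset is 0.5 units. None of those details are confirmed by the post.

```python
def glyph_stream(dots, dot_name="Dot"):
    """Emit PDF content-stream operators that stamp one dot per position."""
    lines = []
    for x, y in dots:
        lines.append("q")                         # save the graphics state
        lines.append(f"1 0 0 1 {x:g} {y:g} cm")   # translate to the dot position
        lines.append(f"/{dot_name} Do")           # paint the shared circle form
        lines.append("Q")                         # restore the graphics state
    return "\n".join(lines)


def emphasised(dots, shift=0.5):
    """Emphasised mode: the same dots, plus a copy shifted slightly right."""
    return list(dots) + [(x + shift, y) for x, y in dots]


# A made-up three-dot column, rendered in emphasised mode.
column = [(0, 0), (0, 1), (0, 2)]
print(glyph_stream(emphasised(column)))
```

Four operators per dot, repeated for every dot of every glyph in every typeface, goes some way towards explaining both the file size and the rendering time.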

Once I'm confident that my low level PDF code is good, I'm going to attend to that. Having the various base glyphs in their own separate PostScript functions and just calling them as many times and at as many offsets as necessary in my actual font will save a lot of space and should make things a lot faster. Type 3 fonts are basically unused for any commercial purpose, so I guess most PDF viewer authors haven't bothered to do anything more than make sure they work.

Conversely, normal PostScript forms seem to be universally cached. I think that's actually the reason that some of the PDF viewers display my pages as blank or otherwise weird when you zoom too far out; because they're built from a cached one-point diameter circle, sampling errors are multiplied massively.

Anyway, I'm still concerned about my ToUnicode map. I've tweaked it a little, but it's probably still wrong. I just can't spot why.
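
For comparison, here is a sketch of generating a small ToUnicode CMap for a single-byte font, following the layout of the example in the PDF Reference. It assumes each range of character codes maps onto consecutive Unicode values in the Basic Multilingual Plane, which may not match the actual font; `to_unicode_cmap` is a hypothetical helper, not code from the emulator.

```python
def to_unicode_cmap(ranges):
    """Build a ToUnicode CMap from (first_code, last_code, first_unicode) tuples."""
    if len(ranges) > 100:
        # The CMap format allows at most 100 entries per bfrange section.
        raise ValueError("split the ranges across several bfrange sections")
    lines = [
        "/CIDInit /ProcSet findresource begin",
        "12 dict begin",
        "begincmap",
        "/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def",
        "/CMapName /Adobe-Identity-UCS def",
        "/CMapType 2 def",
        "1 begincodespacerange",
        "<00> <FF>",
        "endcodespacerange",
        f"{len(ranges)} beginbfrange",
    ]
    for first, last, uni in ranges:
        # Source codes are one byte (two hex digits); the target is UTF-16BE.
        lines.append(f"<{first:02X}> <{last:02X}> <{uni:04X}>")
    lines += [
        "endbfrange",
        "endcmap",
        "CMapName currentdict /CMap defineresource pop",
        "end",
        "end",
    ]
    return "\n".join(lines)


# Assumed mapping: printable ASCII codes map to the same Unicode values.
print(to_unicode_cmap([(0x20, 0x7E, 0x0020)]))
```

Every entry in the bfrange block is three hex strings in angle brackets, with the two source codes the same width as the codespace range. Comparing hand-written output against this shape is one way to hunt for an "illegal entry".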

[My site] [Tetrominoes]

Thomas Fjellstrom

Member #476

June 2000

Ok, works in kpdf just fine (I might have loaded the wrong file last time I tested; it seems konqueror saved this one, and probably the last one, with _s instead of the spaces which the first copy used). And Okular still prints a ton of these:
Error: Could not parse ligature component "Font118656" of "Font118656_255" in parseCharName

Neil Walker

Member #210

April 2000

Tried it in Adobe 7.1 professional, 8.1 reader. Both work fine, though are painfully slow.

Tried it with Adobe Designer 7 (the pdf design/creation tool) and it failed miserably, but that's probably ok.

Though why are you going to such efforts to support an ancient printer?

Neil.
MAME Cabinet Blog / AXL LIBRARY (a games framework) / AXL Documentation and Tutorial

wii:0356-1384-6687-2022, kart:3308-4806-6002. XBOX:chucklepie

Thomas Harte

Member #33

April 2000

Quote:

okular prints a ton of these still:
Error: Could not parse ligature component "Font118656" of "Font118656_255" in parseCharName

Hmmm. I really can't find anything in the PDF spec to explain this - I guess I must be failing to grasp some concept or another.
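
One possible reading of that message, offered as a guess rather than a diagnosis: the Adobe glyph naming convention splits a glyph name on underscores and treats each piece as a ligature component, so a name like `Font118656_255` would be read as two components that match nothing. The sketch below follows that convention in simplified form; `KNOWN_NAMES` is a tiny stand-in for the real Adobe Glyph List, and whether Okular applies exactly these rules is an assumption.

```python
import re

# Tiny stand-in for the Adobe Glyph List, which has thousands of entries.
KNOWN_NAMES = {"f": "f", "i": "i", "space": " "}


def component_to_text(component: str) -> str:
    """Map one underscore-separated component to text, or '' if unknown."""
    if component in KNOWN_NAMES:
        return KNOWN_NAMES[component]
    # "uni" followed by one or more groups of four uppercase hex digits.
    m = re.fullmatch(r"uni((?:[0-9A-F]{4})+)", component)
    if m:
        digits = m.group(1)
        return "".join(
            chr(int(digits[i:i + 4], 16)) for i in range(0, len(digits), 4)
        )
    # "u" followed by four to six uppercase hex digits.
    m = re.fullmatch(r"u([0-9A-F]{4,6})", component)
    if m and int(m.group(1), 16) <= 0x10FFFF:
        return chr(int(m.group(1), 16))
    return ""


def glyph_name_to_text(name: str) -> str:
    """Drop any '.suffix', split on '_', and map each component."""
    base = name.split(".", 1)[0]
    return "".join(component_to_text(part) for part in base.split("_"))


print(repr(glyph_name_to_text("f_i")))             # a genuine ligature name
print(repr(glyph_name_to_text("Font118656_255")))  # nothing recognisable
```

If that reading is right, glyph names without underscores, or names in the `uniXXXX` form, would avoid the message.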

Quote:

Though why are you going to such efforts to support an ancient printer?

I always go to ridiculous efforts with my emulator. Based on the best information I have available, there's not a single deliberate compromise in there; bus contention is correct, all display data is collected at exactly the right time (most emulators just sort of do an instantaneous fetch of all the data on one scanline at the end of the scanline, and round all palette changes to that position), floppy disks are emulated internally as a sequence of magnetic polarity changes on the platter (actually because that's the only way to fully support one of the disk image formats I support) and tapes are similarly decomposed to audio waves and then those are interpreted (though that's partly because I used to support direct audio in, and in any case there are many different ways to encode the same tape and I don't think it should be the capture program's duty to guess how an emulator might like it formatted), interrupts are tested and occur for exactly the correct periods, etc, etc.

I have a relatively unique design in that the CPU is subservient to the thing actually managing the emulator and can be paused at any point, even mid-opcode.

Counterintuitively, mine is also a very speedy emulator because being rigorous allows it to take a very broad overview of how the machine is being used and apply processing optimisations accordingly.

I have supported a simple FX-80 control code to RTF filter for a while, but that was really only good for mimicking some elements of text formatting (and it's not very accurate at that: things like backspaces and reverse line feeds have to be ignored). Recently someone requested a sufficiently accurate emulator to test their screen dump routines, so I just sort of set lofty aims from there. Anything less than actually emulating the print head is just a hack and won't necessarily work with all software.

[My site] [Tetrominoes]

Indeterminatus

Member #737

November 2000

Looking just as expected in PDF Complete.

_______________________________
Indeterminatus. [Atomic Butcher]
si tacuisses, philosophus mansisses

gnolam

Member #2,030

March 2002

Looks decent in Sumatra PDF.

--
Move to the Democratic People's Republic of Vivendi Universal (formerly known as Sweden) - officially democracy- and privacy-free since 2008-06-18!

Ron Novy

Member #6,982

March 2006

Hey.. The PDF looks fine and all, but ever since I opened your PDF file I get a program that locks up during shutdown called 'Font Capture'. I'm not sure if it's related to your PDF, but I thought you should know... I guess you are supposed to clean the temp folder to fix it but haven't tried it yet...

----
Oh... Bieber! I thought everyone was chanting Beaver... Now it doesn't make any sense at all. :-/