"PSD is not my favorite file format" (code.google.com)
131 points by iamelgringo on April 23, 2009 | 33 comments


// No, PSD is an abysmal format. Having worked on this code for several weeks now, my hate for PSD has grown to a raging fire that burns with the fierce passion of a million suns.

// Trying to get data out of a PSD file is like trying to find something in the attic of your eccentric old uncle who died in a freak freshwater shark attack on his 58th birthday... I am spending a lot of time imagining amusing fates for the people responsible for this Rube Goldberg of a file format.

Woah. This guy is a natural poet. If I were the computer, my loop might have skipped an iteration.


I love detailed comments that include prose and give me a view of the mindset of the developer whose code I'm working on. It can really help you relate to the person and their code, and make the work easier.

Although it could just be that... if someone takes the time to write a humorous comment, they probably give a shit enough to not leave slop all over the file.


The key to the jokes is to be very specific in your examples. Dave Barry talked about this.


Sounds to me like the product of many years of evolution with many developers working on it, and all that at a large company. Working at a large company myself, I see how these messed up pieces of code / specifications come into being every day.


Exactly. What version of Photoshop, dear HN reader, was your first? Think about all the features that have been added since then and how hard it must be to maintain even a modicum of backwards compatibility.


Another thing to consider: a file format as popular as PSD probably had a great deal of interest from competitors. Sometimes file formats get obfuscated simply for the purpose of breaking the other guys' import filters.

I've seen this behavior in the file format of an extremely popular application. IIRC, in one section, there were magic bytes that told you the alignment of other blocks that told you how to look up the version of the algorithms that you were supposed to use to unpack the binary data blobs that held resource data. By jiggling around the various forms of indirection, it was possible to make the file formats nigh indecipherable from version to version, while still maintaining backwards compatibility.

Naturally, the parsing code was a total nightmare.


The same thing happens at smaller companies whose codebase has been around for years, too. (Now, imagine if that language had monkeypatching...)

Fixing things like that involves quite a bit of both time and suffering, particularly without good test infrastructures.


This doesn't sound unlike the Office file formats. Apache's POI project named its Office binary readers/writers HSSF and HWPF, acronyms for Horrible SpreadSheet Format and Horrible Word Processing Format respectively. IIRC, they are basically serializations of the in-memory structures used by Office. This choice was made when a powerful Windows machine had less than four megs of RAM and manipulation of document data by a 33 MHz processor could take significant time. Of course, like Windows, Microsoft chose backward compatibility over forward movement -- a wise economic choice, at least in the short term.


I think we all hit points like this at some stage in our coding, where we get so frustrated with something that we write a huge attack in comment form in the middle of the code. I think it's because we don't really have anyone to rant at whilst we're working.

Working in an environment where many people (often including the target of the rant) will see my code, I find myself having to go through it before check-in to source control and remove many of these rants. Oh to be in a position where only I and my closest allies will see my frustrations!


I think Adobe made that on purpose to some extent.

The less other programs are able to handle a full-featured PSD, the more likely consumers are to buy the full version of Photoshop (ok, some might be getting it through other sources, but at least most businesses would choose the real Photoshop over other programs).

Consider Open Office: As it handles only 99% of doc-files correctly, I found myself quickly buying the full MS-version, because you can't just manage that hassle in daily business life.

EDIT: I forgot: Great read, anyhow!!!


I seriously doubt that Adobe made the spec all spaghetti'd on purpose. They probably just kept hacking on new features without cleaning old code, and eventually someone went "Our spec is a nightmare!". At this point, someone with decision making power considered things for a brief second, squinted his eyes, and said "good."


That's why I said "to some extent". I agree that they did not intentionally plan to f* up their code. But I'm sure they could have refactored their spaghetti at some point, and obviously they didn't.


The code may be beautiful, but the documentation is garbage. It's garbage because nobody at Adobe uses it, and very few care about 3rd party development. Furthermore, there is ambivalence there about encouraging understanding of the file format (which is perhaps why the documentation is very partial) and definitely real fear and loathing on the issue of people emulating the Photoshop plugin host interface (if they could make that illegal, they would).

Business as usual in big monopolist land, really. The minnows will eventually bring them down anyway.


Committed by me the other day:

-May god have mercy on all our souls for this code.

+Whereas once I was lost, now I am found.


Was malloc() but now am free()?


Having written code to load PSD files, I have some sympathy with that view.

However, one thing which is good about PSD is that they've put a lot of attention into making it very backwards compatible, to the extent of including the same data twice and useful bitmap renderings. This is very handy, because you don't have to support _everything_ to get an accurate import.
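
To make the "you don't have to support everything" point concrete, here is a rough sketch in C of how an importer could jump straight to the merged bitmap at the end of the file. It assumes the layout as I understand it from the published spec: a 26-byte header starting with "8BPS", then three length-prefixed sections (color mode data, image resources, layer and mask info), then the composite image data. The function name psd_composite_offset is made up for illustration, and only version 1 (plain PSD, 4-byte section lengths) is handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PSD_HEADER_SIZE 26  /* signature, version, reserved, channels, height, width, depth, mode */

/* PSD stores multi-byte integers big-endian. */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical helper: returns the byte offset of the composite image data
 * section (which begins with a 2-byte compression field), or -1 if the
 * buffer does not look like a version 1 PSD or is truncated.
 */
long psd_composite_offset(const uint8_t *buf, size_t len)
{
    if (len < PSD_HEADER_SIZE || memcmp(buf, "8BPS", 4) != 0)
        return -1;
    if (be16(buf + 4) != 1)   /* version 2 (PSB) uses wider lengths; not handled here */
        return -1;

    size_t pos = PSD_HEADER_SIZE;

    /* Skip color mode data, image resources, and layer/mask info unread. */
    for (int section = 0; section < 3; section++) {
        if (len - pos < 4)
            return -1;
        uint32_t section_len = be32(buf + pos);
        pos += 4;
        if (section_len > len - pos)
            return -1;
        pos += section_len;
    }

    if (len - pos < 2)        /* need at least the compression field */
        return -1;
    return (long)pos;
}
```

An importer that only wants an accurate flattened picture could call this on the loaded file, read the compression field at the returned offset, and decode the pixels from there without ever parsing a layer.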


IIRC, Photoshop has defaulted to saving as TIFF since version 7.0 (introduced over seven years ago). I guess they realized long ago that PSD had become an unmaintainable mess.

TIFF was designed to be extensible with custom data chunks and it predates PSD, so it probably would have been the best choice all along. I'd guess Adobe decided against it for "NIH" reasons: TIFF was originally created by Aldus, a competitor in the desktop publishing market... But Adobe ended up acquiring Aldus in the mid-'90s so that reason vanished, leaving PSD and the billions of images saved in that format as a depressing monument of obsolete corporate politics. (Of course the harm done is very limited, since those affected by PSD's problems are just developers -- users don't care as long as their files are saved and opened reliably and quickly.)


I have CS3 here at work, CS2 at home plus I have used every version from 5 up and I don't ever remember it defaulting to TIFF. In fact TIFF is at the very bottom of the list in the save as window.

Is it using TIFF internally and still using the PSD extension?


Yeah, PSD is still at the top. I could swear that a fresh install of Photoshop 7.0 defaulted to TIFF -- I remember it because it was also the first OS X compatible version, so all changes felt intriguing... But I don't know if they changed that later due to feedback, or if I hallucinated the whole thing due to spending so much time staring at the rainbow spin cursor (Photoshop on Mac OS X 10.1 and a G4 was not particularly snappy).

Anyway, Photoshop guru Jeff Schewe recommends TIFF over PSD: http://www.luminous-landscape.com/forum/lofiversion/index.ph...

"Everything that can be saved in a PSD can also be save in a layered Tiff-6.0 file format...everything, paths, channels, layers, transparency, layer effects...there is nothing that PSD has "special" any longer and when that happened, Bruce and several engineers and I realized that the end of the Photoshop native file format had arrived.

"Tiff has better compression (zipped tifs), can save everything that a PSD can save and be as large as 4 gigs in size (I think PSD is still limited to 2 gigs). Tiff is publicly documented where PSD requires a special NDA to access the internals. As a result, tiff is a more "archival" format while losing nothing by being used." [- - ] "So, since Photoshop CS, only PSB is a format unique to Photoshop which does not fall under the heading of having to work with the Suite...and when once looks at all the pluses and minus, I think layered TIFF's offer the best file format for pixels today-unless you are talking raw and then it's DNG-which is essentially, a TIFF-EP file, a variant of Tiff-6.

"Which was one reason that Mark Hamburg, when Lightroom was first released in beta fought really, really hard AGAINST allowing PSD files into Lightroom. He lost that battle.

"So, at best figure that PSD and TIFF are equal but since TIFF is a documented file format and PSD isn't, I lean towards using tiffs whenever possible...because PSD files, suck-and have since Photoshop CS."


As a reluctant Photoshop upgrader who has recently moved to a new machine, I can confirm that a new install of Photoshop 7.0 defaults to PSD (on Windows, at least).


Never did default to TIFF. They did bolt some layer support hacks (binary blobs) to TIFF though.


This somehow reminds me of a joke called

  BOOL TrackPopupMenu(...);
in Windows that, as per the documentation, returns an integer.


Win32 is full of such inconsistencies, for example:

    BOOL GetMessage(LPMSG lpMsg, HWND hWnd, UINT wMsgFilterMin, UINT wMsgFilterMax);
returns:

> If the function retrieves a message other than WM_QUIT, the return value is nonzero.

> If the function retrieves the WM_QUIT message, the return value is zero.

> If there is an error, the return value is -1. For example, the function fails if hWnd is an invalid window handle or lpMsg is an invalid pointer. To get extended error information, call GetLastError.
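
Why the three-valued "BOOL" matters in practice can be shown with a small portable sketch. Nothing below touches the real Win32 API: fake_get_message and fake_queue are invented stand-ins that replay a scripted list of return values, and the only thing borrowed from Windows is that BOOL is an int. The naive loop treats -1 as "got a message"; the careful loop checks for it explicitly.

```c
#include <assert.h>
#include <stddef.h>

typedef int BOOL;  /* same underlying type as the Win32 typedef */

/* Hypothetical stand-in for a message queue: replays scripted return values. */
typedef struct {
    const BOOL *results;
    size_t count;
    size_t next;
} fake_queue;

/* Hypothetical stand-in for GetMessage(): nonzero = message, 0 = quit, -1 = error. */
static BOOL fake_get_message(fake_queue *q)
{
    assert(q->next < q->count);  /* the script must end with a 0 (quit) */
    return q->results[q->next++];
}

/* Naive loop: any nonzero value, including -1, counts as a message. */
int naive_loop(fake_queue *q)
{
    int dispatched = 0;
    while (fake_get_message(q)) {
        dispatched++;  /* an error return gets "dispatched" too */
    }
    return dispatched;
}

/* Careful loop: returns the number of messages dispatched, or -1 on error. */
int careful_loop(fake_queue *q)
{
    int dispatched = 0;
    BOOL ret;
    while ((ret = fake_get_message(q)) != 0) {
        if (ret == -1)
            return -1;  /* stop instead of processing a bogus message */
        dispatched++;
    }
    return dispatched;
}
```

With a script of 1, 1, -1, 0 the naive loop counts three "messages" while the careful loop reports the error. Against the real API the naive form can be worse than a miscount, since a call that keeps failing would keep the loop spinning.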


I think that many of these inane inconsistencies in Win32 stem from the desire to sneak in "just a bit more features" while officially maintaining full compatibility with Win16.

In the book Showstopper!, which documents the creation of Windows NT (a rather fascinating read), there's a revealing passage about how Microsoft got this Windows backwards compatibility religion. Windows had been intended to be a transitional DOS-based system on the way to OS/2, but then Windows 3.0 became an enormous success almost overnight, and Microsoft decided to break up with IBM over OS/2 development.

NT had been intended to have "multiple personalities" in the form of multiple user-level APIs, with OS/2 as the primary one. The breakup had left them without a 32-bit API. With the success of Windows 3 depending to such a large degree on backwards compatibility with DOS, it made sense that the new API should match the existing Windows API as closely as possible. Besides, there was no time to spend on redesigning anything -- they couldn't afford to let IBM's 32-bit OS/2 steal the market before NT.

To ensure that the new "Win32" matched the old "Win16" as closely as possible, it's mentioned in the book that any API changes had to be presented for approval by Ballmer himself...! Perhaps the group in charge of the GetMessage() function figured that they could manage just fine with the existing BOOL return type (which is just a typedef for int), rather than risk Ballmer's wrath over the cosmetic API change.


I enjoyed the commit message most:

  r11  by paracelsus on Sep 11, 2007   Diff
  Photoshop loader is DONE for now, fuck you Adobe


I can understand this guy's frustration, but he could have better spent his effort commenting the rest of the file properly.

Out of 500 lines, about 30 lines are spent on the rant, and there are almost no comments of any significance anywhere else, not even a file header to explain what the class is and how it should be used.


Agree. I've written a PSD parser. My comments are also somewhat bitter:

http://www.telegraphics.com.au/svn/psdparse/trunk


Something on your web server is turning that into http://www.telegraphics.com.au//psdparse/trunk -- but adding a slash on the end stops this happening. Curious.

(I don't think it's Subversion itself; trying the same thing on another svn-over-http server I have handy doesn't produce the same results.)


Yes, I am aware of an issue. The Subversion server is reverse proxied and I think this is interacting with Apache's auto-redirect if trailing slash is missing. (This is a recent setup, no other Subversion server I've administered does this either.) I need to fix this. Thanks for your patience.


Thankfully I still have Fireworks and its spectacular PNG file format.


Ok, three words: Best. Comment. Ever. :)


...wow.

Verbose. Hope I never have to work with PSD anywhere except in my happy photoshop 7 (yeah the upgrade wasn't worth it, and I use WINE so...)


"... PSD is not a good format. PSD is not even a bad format. Calling it such would be an insult to other bad formats...

If there are two different ways of doing something, PSD will do both, in different places. It will then make up three more ways no sane human would think of, and do those. PSD makes inconsistency an art form...

Earlier, I tried to get a hold of the latest specs for the PSD file format. To do this, I had to apply to them for permission to apply to them to have them consider sending me this sacred tome. This would have involved faxing them a copy of some document or other, probably signed in blood. I can only imagine that they make this process so difficult because they are intensely ashamed of having created this abomination..."

Awesome comments!!!

I can feel the passionate hatred for the PSD file format, even though I've never had to work with it.



