Kate Gregory's Blog
Saturday, July 25, 2009
I do love solving a mystery
A client asked me to help recently with a small mystery. They had a database provided by a customer and they'd been asked to import the contents into the tables used by their own product. One of the tables had a BLOB column and from context they were quite sure it was used to hold scans of documents. There was even a "filename" column and a "filetype" column that suggested very strongly the scans were stored as TIFFs.
It had taken a while to find code to read the blobs, and when they ran it, the resulting file was rejected as not being a valid TIFF. They weren't sure if they were handling the blobs wrongly, if the data was encrypted, or if it was some other image format (they had tried PDF and GIF already). In a highly enjoyable two hours, here's what I did:
Found short (ten lines including initialization and cleanup) code to read one blob in VB6 and save it to disk.
Found TIFF format details, looked in the resulting file with notepad and confirmed it didn't start with either II or MM and so wasn't a TIFF.
Looked at a few other file formats but wasn't really gaining any knowledge, just ruling things out that you could rule out by renaming and double clicking, then having the file rejected by the app that tried to open it.
Discovered Marco Pontello's absolutely cool File Identifier, TrID, and downloaded it
Removed the extension from what had been test.tif, pointed TrID at it, and was told 100% it was a zip file. Duh, the file started PK, I might have guessed that one.
Renamed it to test.zip, unzipped it by hand -- ooh, it IS a zip! -- and was rewarded with file.txt for my trouble
Looked at file.txt in notepad and noticed that it was full of binary-looking gibberish, but it DID start with II
Hand renamed file.txt to file.tif and double-clicked it
Presto! A scan of a document!
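The signature checks in the steps above (II or MM for a TIFF, PK for a zip) are easy to automate. Here is a minimal sketch in Python, not the VB6 used on the day. The function name `sniff_format` is made up for illustration, and it only knows the handful of formats that came up in this story; TrID recognizes far more.

```python
def sniff_format(data: bytes) -> str:
    """Guess a file format from its first few bytes (hypothetical helper)."""
    # TIFF starts with a byte-order mark, "II" (little-endian) or "MM"
    # (big-endian), followed by the number 42 in that byte order.
    if data[:4] in (b"II*\x00", b"MM\x00*"):
        return "tiff"
    # A zip archive normally begins with a local file header, "PK\x03\x04".
    if data[:4] == b"PK\x03\x04":
        return "zip"
    # The other two formats that had already been tried and ruled out.
    if data[:4] == b"%PDF":
        return "pdf"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    return "unknown"
```

Running this over the first bytes of the saved blob would have said "zip" straight away, and over the extracted file.txt it would have said "tiff".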
I left my client to write the code that did all the blobs, including unzipping them and renaming (every single blob contained a zip which contained a TIFF renamed to file.txt and no, I don't know why) from within a quickly written importer application. The big mystery was solved. Thanks, Marco!
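The per-blob work described above (unzip, pull out file.txt, treat it as a TIFF) might look something like this in Python. This is only a sketch under assumptions: the client's real importer isn't shown here, `extract_tiff` is a hypothetical name, and the member name file.txt comes from the one blob examined above.

```python
import io
import zipfile

# "II" or "MM" byte-order mark followed by the number 42.
TIFF_MAGICS = (b"II*\x00", b"MM\x00*")


def extract_tiff(blob: bytes, member: str = "file.txt") -> bytes:
    """Return the TIFF bytes hidden inside a zipped blob (hypothetical helper)."""
    if not zipfile.is_zipfile(io.BytesIO(blob)):
        raise ValueError("blob is not a zip archive")
    # Open the archive straight from memory; no temporary file is needed.
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        payload = archive.read(member)  # raises KeyError if the member is missing
    # Confirm the extracted bytes really are a TIFF before handing them back.
    if payload[:4] not in TIFF_MAGICS:
        raise ValueError(f"{member} does not start with a TIFF header")
    return payload
```

Writing the returned bytes to a file with a .tif extension is then the equivalent of the hand rename.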
Seen and Recommended
Saturday, July 25, 2009 12:13:01 PM (Eastern Daylight Time, UTC-04:00)
© Copyright 2019 Kate Gregory