13 Ways The PDF Is Vulnerable

What makes the PDF so enticing to malicious users? There are more reasons than you think.

With the recent headlines about Adobe PDF vulnerabilities being taken advantage of, just about anyone who used a PC was on the alert. PDF files have the potential to do some serious damage to systems and data when infected.

Because the PDF is not without its weaknesses, anticipating ways in which attackers can use the format can be the best way to defend against it.

Below is a brief look at 13 ways, both technical and simple, in which the PDF is vulnerable and can be manipulated by malicious users.

1) JavaScript

PDFs opened online can be passed open parameters in the URL, and those parameters can be injected with malicious JavaScript code. Because JavaScript is so flexible, attackers can do a broad range of things with the PDF file as their hacking tool of choice.
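To make the idea concrete, here is a minimal sketch of a defensive check, not a complete filter. It assumes the injected script shows up as a `javascript:` string in the part of a PDF link after the `#`; the function name `has_script_in_pdf_fragment` and the example URLs are made up for illustration.

```python
from urllib.parse import urlsplit, unquote

def has_script_in_pdf_fragment(url: str) -> bool:
    """Flag links to a .pdf whose fragment (the part after '#') contains 'javascript:'.

    Assumption for this sketch: the injection rides in the URL fragment.
    Real-world filtering would need to cover far more cases.
    """
    parts = urlsplit(url)
    # Only look at links that point at a PDF file.
    if not parts.path.lower().endswith(".pdf"):
        return False
    # Decode percent-escapes and ignore case so simple disguises are still caught.
    fragment = unquote(parts.fragment).lower()
    return "javascript:" in fragment

# Hypothetical example links
print(has_script_in_pdf_fragment("http://example.com/report.pdf#page=3"))
print(has_script_in_pdf_fragment("http://example.com/report.pdf#x=javascript:alert(1)"))
```

A check like this only catches the most obvious case, so it is best treated as one layer among several, not a replacement for keeping the PDF reader up to date.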

2) Spam

The spamming attacks earlier this year exploited the nature of the PDF as a file format. Until recently, PDFs were rarely caught at the anti-spam gates. Thus, although most anti-spam products now check PDFs and other forms of image spam, spam delivered in PDFs made it into millions of inboxes everywhere. Although not as immediately threatening as code execution, spam is still spam and has the ability to deliver Trojans, viruses, and malware.


Converting Scanned AutoCAD PDFs With OCR

As the new year of 2008 rolls on, so does the work and, no doubt, the PDF conversions as well. Don’t worry, we’re at it too. And every now and again, amidst troubleshooting and developing, we get an email from clients having difficulties with AutoCAD PDFs:

“I downloaded and installed your Pro version as a trial.  When I tried to convert a PDF file which was an AutoCAD drawing scanned and saved as such, it seems as if it was working but it opens Excel and nothing is converted in?”

If you’re experiencing or have experienced the same problem without any luck, don’t give up yet. Here’s a conversion tip: try resizing the image-based/scanned PDF.

This is because AutoCAD files are usually created with huge page dimensions that measure up to 30″ by 40″. In addition, it is difficult for the OCR engine to determine the size (in points) of any letter on an OCR page.  So the OCR engine is oftentimes unable to extract legible text from AutoCAD documents due to the small text size (hence the empty Excel output).

The only way it can determine the size of the text is by comparing it relative to the size of a stated PDF page which the OCR engine can read and support. The OCR engine in Able2Extract Professional can only support AutoCAD file dimensions of up to 22″ by 22″.
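The resizing arithmetic can be sketched in a few lines. This is only an illustration, assuming the 22″ by 22″ limit quoted above and the usual 72 points per inch; the function name `fit_scale` is made up for this example and is not part of Able2Extract.

```python
POINTS_PER_INCH = 72  # PDF default: 1 point = 1/72 inch

def fit_scale(width_in, height_in, max_in=22.0):
    """Return the factor that shrinks a page to fit within max_in by max_in.

    Pages that already fit are left alone (factor 1.0).
    """
    return min(1.0, max_in / width_in, max_in / height_in)

# Example: a 30 x 40 inch AutoCAD sheet, the size mentioned in this post.
width_in, height_in = 30.0, 40.0
scale = fit_scale(width_in, height_in)
new_width_in = width_in * scale
new_height_in = height_in * scale

print(f"scale factor: {scale:.2f}")
print(f"new page: {new_width_in:.1f} x {new_height_in:.1f} in "
      f"({new_width_in * POINTS_PER_INCH:.0f} x {new_height_in * POINTS_PER_INCH:.0f} pt)")
```

For a 30″ by 40″ sheet the factor works out to 0.55, which gives a 16.5″ by 22″ page. The longer side sets the scale, so the whole drawing shrinks in proportion.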

To resize the PDF:

1) Open the PDF in either Adobe Reader or Acrobat

2) Select File > Print

3) Change the Printer Name to ‘Adobe PDF’ in the drop box

4) Under the Page Scaling section ensure that ‘Choose Paper Source by PDF page size’ is deselected


5) Click OK to print a new PDF

You can also resize the PDF with our trial version of Sonic PDF Creator 2.0.  After installing Sonic, select ‘Sonic PDF’ as a printer (as opposed to Adobe PDF in step 3).

After you’ve resized the PDF, try the conversion again.

Hope this tip helps!

PDF, A De-Facto Standard No More

While you’re all excited about the upcoming holidays and can’t think of anything else but that gift list to get through, you can add one more thing to get excited about.

The de facto standard of information interchange, aka the PDF, just got one step closer to being adopted as a standardized format. Last week, the PDF 1.7 specification gained the approval votes it needed from ISO committee voting members as it reached the Enquiry “Close of voting” stage in the standardization process.

Before this certification happens, though, the comments included with the votes need to be addressed. Only then does the format get its official ISO standard tag: ISO 32000 (lovely name, no?). Even with those last few hurdles, the PDF’s standardization process is looking good.

Jim King, PDF architect and Senior Principal Scientist at Adobe Systems Inc., will serve as technical editor for the international working group meeting in January, where the 205 submitted comments will be resolved.

On his blog he states, “If the group can address all the comments to the satisfaction of all countries, especially the ones voting negatively, it is possible to finish at that meeting and publish the revised document.”

So Is It Still An Adobe-Microsoft Showdown?

In the face of impending success, you can’t help but wonder about OOXML and where its standardization is headed.

OOXML was also submitted and fast tracked for an official ISO standard, but rejected in September. Alongside that rejection was the controversy over Microsoft’s active influence over committee members and their votes. The OOXML proposal then went back to the drawing board for revisions to take the negative votes and comments into account.

Now, three months later, as its Ballot Resolution Meeting (BRM) draws near in February, OOXML’s standardization is still up in the air. Its interoperability, the OOXML hot topic of the day, will be a major factor in the decision to approve it as a standard.

Making it even harder is that OOXML is constantly held up against ODF, the poster child of open source solutions. It’ll be interesting to see how “open” and how much “interoperability” a Microsoft format can possess in general.

While that issue unfolds, the PDF will more than likely get the ISO standardization without much drama. Has Adobe won this round already without even trying?

These are exciting times for the PDF format indeed.

Able2Extract v.5.0 Is Here!

We are proud to announce that we have officially launched the upgraded versions of our flagship products, Able2Extract and Able2Doc. It’s a whole new version on a whole new level with a whole new look!

New Able2Extract 5.0 Features

This latest 5.0 version is sporting newer, more advanced features that let you convert your PDF into more formats than ever before. We’ve managed to pack this upgrade with a lot more conversion options. Like what, you ask? Read on.

First off the list, Able2Extract v.5.0 now offers PDF to Image conversions. Our new PDF to Image converter can generate popular image file formats, such as JPEG, BMP, GIF, PNG, and TIFF. You can designate the output directory, set image DPI and perform black and white conversions.

Second, with Able2Extract v.5.0, you can now view and convert Microsoft’s new XPS document format. Convert XPS with all the same output features and conversion settings by simply opening and converting the format as you would a regular PDF file.

Third, this latest upgrade can support PDF Forms conversion. You can convert interactive PDF forms to editable Word Documents which you can fill out, save and modify later on. This conversion feature has the ability to retain form elements, such as text fields, radio buttons, and checkboxes.

Our Able2Doc v.4.0 can perform the same PDF Forms to Word conversion, and also supports XPS to Word conversion. It’s ideal for those who are only looking to convert to Word and TXT file formats.

Go ahead and sample these new features for yourself. You can download the free trial, in either the Standard or Professional versions, and take it for a test run. For ordering, product, and pricing details, check out our site. It, too, has undergone a bit of remodeling.

The ABCs of the PDF: D to F

The holidays are over for this year and it’s time to get back to work—and back to learning. Here’s the second posting on the ABCs of the PDF and, as promised, a few tidbits behind the day to day elements you use in your PDF work.

Data

The most important thing about a PDF is the data: its printability, its transmission and its integrity. Of course, this last point is the driving force behind the inability to edit PDF content, a quality with which the PDF world is familiar.

There have been notable rants and raves about this and, consequently, about the “usability” of the PDF format, citing issues such as document size and on-screen behaviour as annoyances. And yet, there are strong arguments defending the PDF and the working needs it fulfills with its “set-in-stone” data.

What do we make of this PDF usage debate? What’s the bottom line for PDF users, makers and shakers?

Fantasy: digital documents that aren’t easily manipulated by malicious users.

Reality: file integrity and data extraction go hand-in-hand with the PDF format. The only way to change or work with the data is to extract (aka “convert”) the content.

Conclusion: when working with a PDF, work with conversion in mind. Consider which conversion software is practical for daily use, which format conversions you need, what kinds of PDFs you’re working with (scanned or native), what security features restrict the data you need, etc.

Encryption

And speaking of security, you’ll more than likely encrypt the PDF documents you create yourself. So, here are a few knick-knacks surrounding the encryption you’ll use:

• You may see the word “bit-encryption” when creating a PDF. Bit-encryption, which secures your documents, is based on the use of binary digits

• The higher the bit-number, the more secure your files are because of the increased number of possible decryption keys. 128-bit encryption, for instance, has a key length of 128 bits, meaning that there are 2^128 possible keys

• Sonic PDF Creator v.1.2 includes 40- and 128-bit encryption

• The DES (Data Encryption Standard) was based on 56-bit encryption and adopted by the US Federal government in the 1970’s. The current AES (Advanced Encryption Standard, selected in 2000) is based on the Rijndael algorithm, which makes use of 128- to 256-bit keys. It was adopted after winning a 3-year competition against other algorithms

• Early computing, in fact, grew in part out of “cracking codes.” During WWII, codebreakers built machines to decode messages encrypted with cipher machines such as the German “Enigma”
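To put the key-length bullets into numbers, here is a small sketch comparing the 40-bit and 128-bit key spaces mentioned above. The guessing rate of one billion keys per second is purely an assumption chosen for illustration, and the function names are made up for this example.

```python
def keyspace(bits):
    """Number of distinct keys for a key that is `bits` bits long."""
    return 2 ** bits

def years_to_try_all(bits, keys_per_second):
    """Rough time, in years, to try every key at a constant guessing rate."""
    seconds = keyspace(bits) / keys_per_second
    return seconds / (365 * 24 * 60 * 60)

# Hypothetical attacker speed, assumed only for this illustration.
ASSUMED_RATE = 1_000_000_000  # keys per second

for bits in (40, 128):
    print(f"{bits}-bit: {keyspace(bits):,} possible keys, "
          f"about {years_to_try_all(bits, ASSUMED_RATE):.3g} years to try them all")
```

Each extra bit doubles the number of keys, so a 128-bit key space is 2^88 times larger than a 40-bit one. That gap is why the higher setting is the safer choice whenever it is available.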

Fonts

As a PDF user, you know that part of maintaining the document’s appearance is retaining the textual font within the PDF. Yet, there is more to fonts than just a pretty face.

• There are about 20 components in the anatomy of a letter that define one typeface from another

• There are 3 different types of hyphens and dashes (the hyphen, the en dash and the em dash) and, of course, they vary in usage, and in look, from typeface to typeface (Three? Yes, three. Who would’ve thought?)

• Which fonts are best used for on-screen (PDF) presentation?

• The fonts used in a document affect the way you read the textual information. Serif fonts help to guide a reader’s eye along the lines in large blocks of text. Thus, Times New Roman, for instance, is generally used for printed text. Sans-serif fonts are ideally used for on-screen text because they present a legible rendition on-screen

• Do you know the history behind the letters and fonts you use in your PDFs?

Hopefully, next time you read or create a PDF, you’ll look and think differently about the extraction, the encryption and the fonts you use on a daily basis. And, who knows, with a little tinkering, you just might create that ultimate PDF!