Excel Output Alternatives with Able2Extract's PDF to Excel Conversion Option
When undertaking a PDF to Excel conversion, you will be prompted with the following dialog box:
The above dialog box gives you two options for your PDF to Excel Conversion.
- Automatic – The default option, recommended for most conversions into Excel. Under this option, the software automatically determines the positioning of the Excel columns. In most cases, this will result in correct column alignment within Excel.
In cases where the Automatic conversion results in column misalignments within Excel, the user may want to choose the Custom conversion option. The Custom conversion option allows the user to designate the columns prior to conversion to Excel.
- Custom – This option allows the user to manually designate where the columns of data will be created once converted into Excel. This designation is made visually by the user prior to conversion.
The "Custom" option may be the most effective option in the following situations:
- The Automatic conversion results in the misalignment of data within Excel.
- The document data being converted into Excel spans multiple pages. This works well for report data that is standardized across many pages.
- The headers and footers of the document are causing misalignments within Excel.
This dialog contains sub-options that allow you to further modify your PDF to Excel conversions. This feature can be accessed by clicking on the gear icon in the main Convert to Excel dialog.
Simple Analysis – This option allows Able2Extract to automatically designate the rows and columns of the PDF to Excel conversion. Use this option when your PDF content contains simple table structures with basic rows and columns. In some cases, the Simple Analysis option can also help fix conversions that turn out poorly in Full Analysis mode.
Full Analysis – This is the default for Automatic PDF to Excel conversions. By using this option, Able2Extract will perform a deeper analysis of your PDF tables to produce more accurate conversion results.
Note that when you save your conversion options using the Save as Default setting, the Automatic conversion options you select here will also affect the PDF to Excel conversions done through the Batch conversion feature.
The options in this category specify how Page Ranges for Custom Excel conversions are created. Page Ranges are groups of pages that have similarly structured content and can, thus, have the same table structures applied to them for conversion. They help Able2Extract in performing complex PDF to Excel conversions. Hence, the following options will give Able2Extract some direction as to how the source document is structured.
You will be able to choose from the following options:
- Each Page Has Its Own Table Set
When checked, Able2Extract will create table structures for each individual page.
- All Consecutive Pages Have The Same Table Set
Selecting this option will allow Able2Extract to create table structures that are appropriate for all consecutively selected pages.
When this option is turned on, Able2Extract will attempt to analyze the selected pages and automatically determine which pages can share the same table structures needed for the conversion process.
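The idea of pages sharing a table structure can be pictured with a small sketch. This is only an illustration under stated assumptions, not Able2Extract's actual algorithm: each page is assumed to be summarized by a list of column x-positions, and the names `group_page_ranges`, `_same_layout` and `tolerance` are hypothetical.

```python
def _same_layout(a, b, tolerance):
    """Two pages match if they have the same number of columns at roughly the same x-positions."""
    return len(a) == len(b) and all(abs(x - y) <= tolerance for x, y in zip(a, b))


def group_page_ranges(page_columns, tolerance=2.0):
    """Group consecutive pages whose column positions match the first page of the current range.

    page_columns: one list of column x-positions per page (hypothetical input format).
    Returns a list of dicts with the 1-based page numbers and the shared column layout.
    """
    ranges = []
    for page_number, columns in enumerate(page_columns, start=1):
        if ranges and _same_layout(ranges[-1]["columns"], columns, tolerance):
            # Same structure as the current range: extend it.
            ranges[-1]["pages"].append(page_number)
        else:
            # Structure changed: start a new page range.
            ranges.append({"pages": [page_number], "columns": columns})
    return ranges


pages = [[72.0, 200.0, 350.0], [73.0, 201.0, 349.0], [72.0, 300.0], [72.0, 300.0]]
print([r["pages"] for r in group_page_ranges(pages)])  # [[1, 2], [3, 4]]
```

Here pages 1 and 2 differ by only a point or so and share one range, while pages 3 and 4 have a different column count and form a second range.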
Single Table Per Range
With this option checked, Able2Extract will generate tables around the selection for which you can specify the Custom options noted above.
If the Single Table Per Range checkbox is left unchecked, then Able2Extract's conversion algorithm will perform a detailed analysis on the content of the pages and determine if several tables have to be created to more accurately convert the document.
Save As Default
This check box allows you to save your conversion settings as a default. Select this option and click on OK.
Custom Excel Panel
The Custom Excel conversion Panel provides the end-user with enhanced controls for setting up columns, rows and tables. One major functionality change is the switch from using a single "table" covering the full page, to discrete table selections – allowing more than one conversion "table" on a page. Another change is to offer an Excel preview, so the user can see how the table and selection will appear in a spreadsheet post conversion.
The Page Range feature enables users to select the range of pages to which a table structure will be applied. The Expand icon enables users to add additional pages to which they would like to apply the table structure. The Exclude icon enables users to exclude a page from a previously selected Page Range.
By default, when the user enters the Custom Excel conversion after selecting the whole document, or selecting several pages of a document, then the entire document will be selected as the Current page range. When the user enters the Custom Excel conversion after selecting only a portion of one page on the document, then only the current page will be selected as the Current page range.
Navigation for Page Range selections can be viewed on the left-side Thumbnail preview sidebar. The initial selection should appear with a red border around the entire page.
Tables can be added to a page by hitting the Add icon under the Tables area on the conversion Panel. To delete a table, you can click on the Delete icon and then left click on the table you wish to delete. If you would like the conversion algorithm to recalculate the table structure for tables within a given page range, you can hit the Replot icon and it will automatically recalculate the column structure for all tables within the active page range. Tables can only be stacked vertically; they cannot be placed side by side.
Once a table has been created, you can edit the table structure using the tools in the conversion Panel. To add a column, click on the Add Columns icon, and then left click where you would like to add a column within the table. To remove a column, click the Erase Column Line icon and then left click on a column line to erase it.
The drop down menu below the Add/Erase column icons controls how content (such as text or numbers) is treated when it touches or comes close to a column line. The two most common settings are "Never Split" and "Always Split". In some cases, for example where a page is slightly tilted so that a column line crosses some words or numbers, you may want to use one of the intermediate options, such as "split of 2 spaces between words".
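To make the difference between the two common settings concrete, here is a minimal sketch. It is an assumption-laden model rather than the product's real behavior: a text run is represented as words with left and right x-coordinates, "Always Split" is approximated at word level using each word's midpoint, and the names `assign_cells`, `never_split` and `always_split` are hypothetical.

```python
def assign_cells(words, column_x, policy):
    """Distribute one run of text across a single column line.

    words: list of (text, left_x, right_x) tuples in reading order (hypothetical format).
    column_x: x-position of the column line.
    policy: "never_split" or "always_split" (hypothetical names for the two settings).
    Returns (left_cell_text, right_cell_text).
    """
    if not words:
        return ("", "")
    if policy == "never_split":
        # Keep the whole run together, on the side where it starts.
        text = " ".join(w[0] for w in words)
        return (text, "") if words[0][1] < column_x else ("", text)
    if policy == "always_split":
        # Send each word to the side on which its midpoint falls.
        left = [w[0] for w in words if (w[1] + w[2]) / 2 < column_x]
        right = [w[0] for w in words if (w[1] + w[2]) / 2 >= column_x]
        return (" ".join(left), " ".join(right))
    raise ValueError(f"unknown policy: {policy}")


run = [("Total", 10, 40), ("due", 45, 60), ("1,250", 70, 100)]
print(assign_cells(run, 65, "never_split"))   # ('Total due 1,250', '')
print(assign_cells(run, 65, "always_split"))  # ('Total due', '1,250')
```

With "Never Split" the label and amount stay in one cell; with "Always Split" the column line separates them.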
The button below the drop down menu is labeled "Column Types". It allows you to designate how the content within each column is treated in Excel: as numbers (the default) or as text. The tables are listed within the dialogue from top to bottom, as they appear on the active page.
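Why the column type matters can be shown with a short sketch. It assumes a simplified model in which every cell arrives as a string; `coerce_cell` and the type names `"number"` and `"text"` are hypothetical and do not describe Able2Extract's internals.

```python
def coerce_cell(value, column_type):
    """Interpret a raw cell string according to its column type.

    column_type: "number" or "text" (hypothetical names).
    Number columns are parsed where possible; unparseable values are kept as text.
    """
    if column_type == "text":
        # Text columns keep the content exactly as it appears, e.g. leading zeros.
        return value
    cleaned = value.replace(",", "").strip()  # drop thousands separators
    try:
        return float(cleaned)
    except ValueError:
        return value


print(coerce_cell("1,250.50", "number"))  # 1250.5
print(coerce_cell("00123", "number"))     # 123.0
print(coerce_cell("00123", "text"))       # 00123
```

A column of codes such as postal codes or part numbers is a typical case where the text type is preferable, since treating it as numeric drops the leading zeros.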
By default, the conversion algorithm will create a separate row for each line of text that is recognized within a table. In certain cases, however, this may not appropriately capture how the data should be converted (for instance, a "cell" within a table that has multiple lines, whereas other columns just have a single line of data).
The first step is to check the "Show Rows" item under the Rows area of the conversion Panel. This will show the rows on the page, and should correspond with the row placement in the Excel Preview pane. The second step is to check the "Manual Row Editing" item in the conversion Panel, which will activate the "Add Rows" and "Erase Row Line" icons in the conversion Panel. Once these are active you can now add and remove rows in the same way you would add and remove columns.
In some cases, it may make sense to demarcate the rows based on a specific column within a table, or by using the existing row lines on the page. To do so, click on the "Row Settings" button to open the Row Settings dialogue box. Once this dialogue box opens, first select the table you'd like to work with (listed vertically from top to bottom, as on the active page), then select the scheme you'd like to use to demarcate the rows for that table, and then click "OK".
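Demarcating rows by a specific column can be sketched as follows. This is only a model of the idea, assuming each recognized line of text is already split into cells; the function name `merge_rows_by_anchor` and the parameter `anchor_col` are hypothetical.

```python
def merge_rows_by_anchor(lines, anchor_col):
    """Merge text lines into rows, starting a new row whenever the anchor column has content.

    lines: list of lists of cell strings, one inner list per recognized line of text.
    anchor_col: index of the column that marks the start of each row.
    """
    rows = []
    for line in lines:
        if line[anchor_col].strip() or not rows:
            # Anchor cell is filled (or this is the first line): begin a new row.
            rows.append(list(line))
        else:
            # Anchor cell is empty: treat the line as a continuation of the previous row.
            for i, cell in enumerate(line):
                if cell.strip():
                    rows[-1][i] = (rows[-1][i] + " " + cell).strip()
    return rows


lines = [
    ["1", "Widget, large", "10.00"],
    ["", "blue finish", ""],
    ["2", "Gadget", "5.50"],
]
print(merge_rows_by_anchor(lines, anchor_col=0))
# [['1', 'Widget, large blue finish', '10.00'], ['2', 'Gadget', '5.50']]
```

The second line has no value in the anchor column, so its text is folded into the multi-line description cell of the first row instead of becoming a row of its own.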
Header and Footer
For multi-page reports, users may want to exclude the headers or footers from the table. This is controlled by the horizontal lines at the top and bottom of the page. The headers and footers appear as black lines that go completely horizontally across the top and bottom of the page. To adjust the header, click on the "Edit header" icon on the conversion panel. Then use your mouse and left click the header, hold, and move the mouse up and down to move where the header falls. To adjust the footer, click on the "Edit footer" icon on the conversion panel, left click the footer line, and adjust accordingly.
The Header/Footer Options button in the conversion Panel lets the user enable or disable the headers and footers. The user can also opt to keep the contents of the first header and exclude the rest of the headers. This is useful for tabular data, such as a report that spans multiple pages, where the header data is useful at the top of a spreadsheet but not necessary through the rest of the data set. Similarly, the user can opt to keep the data in the last footer, which in some cases contains a table summary or totals that the user may want to retain in the spreadsheet.
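The effect of keeping only the first header and the last footer can be illustrated with a short sketch. The page structure (dicts with `header`, `body` and `footer` row lists) and the function `assemble_pages` are hypothetical and serve only to show the resulting row order.

```python
def assemble_pages(pages, keep_first_header=True, keep_last_footer=True):
    """Concatenate page bodies into one list of rows.

    pages: list of dicts with "header", "body" and "footer" keys, each a list of rows.
    Only the first page's header and the last page's footer are kept (when enabled);
    all other headers and footers are dropped.
    """
    rows = []
    last_index = len(pages) - 1
    for index, page in enumerate(pages):
        if keep_first_header and index == 0:
            rows.extend(page["header"])
        rows.extend(page["body"])
        if keep_last_footer and index == last_index:
            rows.extend(page["footer"])
    return rows


report = [
    {"header": [["Item", "Amount"]], "body": [["A", "10"]], "footer": [["Page 1"]]},
    {"header": [["Item", "Amount"]], "body": [["B", "20"]], "footer": [["Total", "30"]]},
]
print(assemble_pages(report))
# [['Item', 'Amount'], ['A', '10'], ['B', '20'], ['Total', '30']]
```

The column titles appear once at the top, the repeated header and the page-number footer are dropped, and the totals line from the final footer is retained.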
You can activate or deactivate the conversion Preview Table by selecting the checkbox next to "Show Preview", located towards the bottom of the conversion Panel. If you select the checkbox next to "Only current page", only the currently selected page will appear in the Preview Table. This may be useful for larger documents, where a very large Preview Table can result in slow performance.