{"id":6356,"date":"2016-09-01T17:35:38","date_gmt":"2016-09-01T17:35:38","guid":{"rendered":"https:\/\/www.investintech.com\/resources\/blog\/?p=6356"},"modified":"2019-08-23T12:55:11","modified_gmt":"2019-08-23T12:55:11","slug":"analyze-open-data-able2extract-powerbi-datahero","status":"publish","type":"post","link":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html","title":{"rendered":"How To Analyze Open Data With Able2Extract, Power BI And DataHero"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">There is a general sense of helplessness when it comes to analyzing public data, especially as people think it involves insane amounts of statistical mastery and in-depth knowledge of complicated statistical software. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is especially nerve-wracking for data journalists, who are keen on using data to write stories that can actually influence a certain aspect of our society, such as healthcare or education. Truth be told, analyzing data and storytelling actually go hand in hand. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Since the Open Data initiative started, more and more data sets have seen the light of day on various data-related portals. The most interesting data sets for journalists are the ones that are publicly available, simply because they are free to use and analyze. 
Those data sets are available on a variety of online sources, such as: <\/span><a href=\"http:\/\/www.data.gov\" target=\"_blank\" rel=\"nofollow\"><span style=\"font-weight: 400;\">www.data.gov<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"http:\/\/open.canada.ca\/\" target=\"_blank\" rel=\"nofollow\"><span style=\"font-weight: 400;\">open.canada.ca<\/span><\/a><span style=\"font-weight: 400;\">, <\/span><a href=\"https:\/\/data.gov.uk\/\" target=\"_blank\" rel=\"nofollow\"><span style=\"font-weight: 400;\">data.gov.uk<\/span><\/a><span style=\"font-weight: 400;\"> and many more. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Open data portals contain thousands and thousands of data sets, related to various branches of government: education, business, economy, crime, justice, healthcare and more. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Once you start exploring the online data, you will see that it usually comes in 3 main formats: HTML, XML and PDF.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6373 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats.jpg\" alt=\"Common Open Dataset Formats \" width=\"541\" height=\"332\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats.jpg 541w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats-300x184.jpg 300w\" sizes=\"auto, (max-width: 541px) 100vw, 541px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">However, if you start investigating the data sets in more depth, you will quickly notice that there is only one format that\u2019s present in almost every data set \u2014 the PDF. So, the logic goes that if you know how to analyze data that\u2019s locked inside a PDF, you\u2019ll know how to analyze any. 
<\/span><\/p>\n<p><span style=\"font-weight: 400;\">But what makes people want to store data in a non-editable format?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">First of all, when you save a data set as a PDF you are often reducing its size, so it\u2019s easier to store and upload to online databases. Secondly, since the PDF is not editable by default, you are making sure that no one tampers with your data and changes any of the ever-so-important numerical values. Remember, people spend countless hours gathering data and they are keen on protecting their hard work as much as possible.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">So, once you find a PDF data set, where do you go next? <\/span><\/p>\n<p><span style=\"font-weight: 400;\">You now basically have only one option: you need to get that data into an Excel or CSV file format, while preserving source document accuracy as much as possible. After you do that, the next step is to import the converted file into a data visualization tool of your choice, which we will cover later in this tutorial.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">When it comes to exporting PDF data, the only tool on the market that has advanced enough PDF exporting capabilities is <\/span><a href=\"https:\/\/www.investintech.com\/prod_downloadsa2e.htm\"><span style=\"font-weight: 400;\">Able2Extract<\/span><\/a><span style=\"font-weight: 400;\">. That is because Able2Extract is not just a regular PDF converter. See, most (if not all) PDF converters on the market only convert PDF to Excel automatically, leaving you with a messy data set. Automatic conversion works well for one-page invoices, but converting a 1,000-page data set takes a lot more than that. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Able2Extract is the only converter that lets you fully customize your conversion by manually setting up row and column structure, prior to conversion. 
In addition it lets you preview the conversion results from within the software, which lets you export your data set as accurately as possible. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">First, find your PDF data set. For this tutorial, we are going to use a practice data set containing all funded projects from Canadian Environmental Damages Fund. You can download it <\/span><a href=\"http:\/\/donnees.ec.gc.ca\/data\/partnerships\/grantscontributions\/environmental-damages-fund-funded-projects\/Detailed_Project_Report_-_EDF_Projects_Funded_04-2009_to_03-2014_ENG.pdf\" target=\"_blank\" rel=\"nofollow\"><span style=\"font-weight: 400;\">here<\/span><\/a><span style=\"font-weight: 400;\">. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Open the data set in Able2Extract and use custom PDF to Excel conversion to convert it to an Excel file. Set up row and column structure using the right side panel and make sure to check the \u201cPreview conversion\u201d box. Once satisfied, hit the <\/span><b>convert<\/b><span style=\"font-weight: 400;\"> button to send the data to Excel.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6363 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Converting-Open-Data-Able2Extract.jpg\" alt=\"Able2Extract Custom PDF to Excel\" width=\"624\" height=\"337\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Converting-Open-Data-Able2Extract.jpg 624w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Converting-Open-Data-Able2Extract-300x162.jpg 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">So, we got our data from PDF and into Excel. Great job! \u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The next step is to go to Excel and clean the data. 
This will take 15 minutes to 2 hours, depending on the data set, but the goal is to end up with data in <\/span><b>tabular<\/b><span style=\"font-weight: 400;\"> format, which means there is a separate row for each record. It should look something like this:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6362 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Converted-Excel-Data.jpg\" alt=\"PDF to Excel Conversion Results\" width=\"624\" height=\"416\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Converted-Excel-Data.jpg 624w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Converted-Excel-Data-300x200.jpg 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Make sure you don\u2019t have any empty rows or blank cells and that all text is formatted in the same way. If a row has 3 cells missing, it\u2019s best to delete the whole row because it can skew your results. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Now that we have a clean and tidy data set, it\u2019s time to give life to these numbers and visualize them. Enter data visualization. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Data visualization simply means creating charts from plain data, which makes the data easier to understand and present to your readers. When it comes to visualizing data, you can choose between a desktop dataviz tool and a cloud dataviz tool. We will explore one example of each. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Our recommended desktop software for visualizing complex data is Power BI. We are recommending it because of its compatibility with Excel and the fact that it\u2019s free to use for datasets up to 1 GB. 
You can download it <\/span><a href=\"https:\/\/powerbi.microsoft.com\/en-us\/\" target=\"_blank\" rel=\"nofollow\"><span style=\"font-weight: 400;\">here<\/span><\/a><span style=\"font-weight: 400;\">. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Before we start with Power BI, you will need to know that analyzing data starts by asking questions and then using data to answer them. For example, you can ask questions regarding our practice data set before we even upload it to the dataviz tool:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400; text-align: left;\"><span style=\"font-weight: 400;\">What was the EDF funding per region?<\/span><\/li>\n<li style=\"font-weight: 400; text-align: left;\"><span style=\"font-weight: 400;\">Which group received the biggest funding?<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Depending on the data set, you can ask a thousand questions and, make no mistake, you will get a thousand answers. OK, let\u2019s move on to more serious stuff. Power BI.<\/span><\/p>\n<h2><strong>Power BI<\/strong><\/h2>\n<p><span style=\"font-weight: 400;\">Power BI is a Business Intelligence tool created for monitoring business performance and discovering market opportunities. Today we will use it as a data journalism tool in order to answer the two questions above. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Once you open Power BI you first click on <strong>Get Data &gt; Excel &gt; Connect &gt; Your file<\/strong>. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Choose the sheet where your data is located and press Load. Alternatively, you can press Edit if you\u2019d like to check your data set for mistakes once again. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Once you do so, you will find a blank canvas and your data values in the right sidebar panel. 
<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6360 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/PowerBI-Side-Panel.jpg\" alt=\"Accessing PowerBI Side Panel\" width=\"624\" height=\"292\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/PowerBI-Side-Panel.jpg 624w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/PowerBI-Side-Panel-300x140.jpg 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">These are the values we are going to slice and dice. Let\u2019s try to answer our first question. If you remember, we wanted to know what was the EDF funding per region. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">The basic data field there is <\/span><b>EDF Funding<\/b><span style=\"font-weight: 400;\"> so we\u2019ll drag it into the \u201cValues\u201d box. The canvas immediately changes and it is now showing us the total EDF funding: <\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6372 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Values-PowerBI.jpg\" alt=\"PowerBI EDF Funding Values\" width=\"352\" height=\"377\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Values-PowerBI.jpg 352w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Values-PowerBI-280x300.jpg 280w\" sizes=\"auto, (max-width: 352px) 100vw, 352px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Let\u2019s now introduce another data field. Select the \u201cPie chart\u201d. 
<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6361 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Selecting-Data-Visualization-Graphic.jpg\" alt=\"PowerBI Data Visualization Selection\" width=\"367\" height=\"284\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Selecting-Data-Visualization-Graphic.jpg 367w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Selecting-Data-Visualization-Graphic-300x232.jpg 300w\" sizes=\"auto, (max-width: 367px) 100vw, 367px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Drag the \u201cRegion\u201d field into the \u201cLegend\u201d box. Congrats, you made your first data visualization! We now have an overview of the funding per region and we can already start answering some questions. <\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6375 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Region-Visualization.jpg\" alt=\"EDF Funding Visualization By Region\" width=\"538\" height=\"476\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Region-Visualization.jpg 538w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Region-Visualization-300x265.jpg 300w\" sizes=\"auto, (max-width: 538px) 100vw, 538px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">However, if you pay close attention you can see that we still don\u2019t know the exact funding for each region. 
To show the exact values of data fields, go to the \u201cFormat\u201d panel:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6359 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/PowerBI-Format-Panel.jpg\" alt=\"Accessing PowerBI Format Panel\" width=\"366\" height=\"406\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/PowerBI-Format-Panel.jpg 366w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/PowerBI-Format-Panel-270x300.jpg 270w\" sizes=\"auto, (max-width: 366px) 100vw, 366px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Expand the \u201cDetail Labels\u201d category, find the Label Style and select \u201cBoth\u201d from the drop-down menu. <\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6358 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/PowerBI-Detail-Labels.jpg\" alt=\"Selecting PowerBI Detail Labels\" width=\"354\" height=\"286\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/PowerBI-Detail-Labels.jpg 354w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/PowerBI-Detail-Labels-300x242.jpg 300w\" sizes=\"auto, (max-width: 354px) 100vw, 354px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Our pie chart is now showing us the specific monetary values for each segment. Great, first question answered. 
<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6370 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Region-Values.jpg\" alt=\"EDF Funding Pie Chart\" width=\"624\" height=\"461\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Region-Values.jpg 624w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Region-Values-300x222.jpg 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">OK, next up is to see which Group received the biggest funding. We\u2019ll repeat the process but we\u2019ll use a different chart, just to demonstrate different features of Power BI.<\/span><\/p>\n<p>First, find and click on the Clustered Bar Chart.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6374 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/PowerBI-Clustered-Bar-Chart.jpg\" alt=\"Selecting Clustered Bar Chart\" width=\"360\" height=\"434\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/PowerBI-Clustered-Bar-Chart.jpg 360w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/PowerBI-Clustered-Bar-Chart-249x300.jpg 249w\" sizes=\"auto, (max-width: 360px) 100vw, 360px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Drag the <\/span><b>EDF Funding<\/b><span style=\"font-weight: 400;\"> into the <\/span><b>Values<\/b><span style=\"font-weight: 400;\"> box and drag the <\/span><b>Group<\/b><span style=\"font-weight: 400;\"> into the <\/span><b>Axis<\/b><span style=\"font-weight: 400;\"> box. 
Turn on the data labels and you\u2019ll quickly see that the University of Waterloo received the biggest funding: almost $320,000.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6369 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Group-Values.jpg\" alt=\"EDF Group Values Chart\" width=\"556\" height=\"604\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Group-Values.jpg 556w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Group-Values-276x300.jpg 276w\" sizes=\"auto, (max-width: 556px) 100vw, 556px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Now that you know how to ask questions and visualize public data, we will quickly go over another tool that can help you visualize your data in the Cloud. Keep in mind that the Cloud tools only support smaller files, which means you\u2019re best off using them for 10-20 page data sets. Luckily, the data set from our example is actually pretty small. <\/span><\/p>\n<h2><b>DataHero <\/b><\/h2>\n<p><span style=\"font-weight: 400;\">DataHero is a cloud solution for Business Intelligence and data visualization. It allows you to connect files from numerous online and offline sources and it even has an integrated data cleaning tool, which is nice, but I do not recommend relying solely on it. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">You can use DataHero for free, for files up to 2 MB in size. Anything larger than that, and you\u2019ll probably have to pay a monthly subscription which is between $60 and $90. For this purpose, we are going to use a free plan. 
<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Create an account, click on the Data tab and click on <\/span><b>Import Data<\/b><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6365 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Data-Hero-Import-Data.jpg\" alt=\"Importing Data With DataHero\" width=\"624\" height=\"187\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Data-Hero-Import-Data.jpg 624w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Data-Hero-Import-Data-300x90.jpg 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Find your Excel file, select the sheet and upload it: <\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6367 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Data-Hero-Uploading-Data.jpg\" alt=\"Uploading Data with Datahero\" width=\"624\" height=\"425\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Data-Hero-Uploading-Data.jpg 624w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Data-Hero-Uploading-Data-300x204.jpg 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">On the next screen, check formatting and proceed. 
<\/span><\/p>\n<p><span style=\"font-weight: 400;\">What\u2019s cool about DataHero is that it automatically suggests data visualizations: <\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6368\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Data-Hero-Visualization-Suggestions.jpg\" alt=\"Suggested Visualizations From DataHero\" width=\"673\" height=\"198\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Data-Hero-Visualization-Suggestions.jpg 618w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Data-Hero-Visualization-Suggestions-300x88.jpg 300w\" sizes=\"auto, (max-width: 673px) 100vw, 673px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">I was originally interested in EDF Funding by project category so I\u2019ll just create a brand new chart. DataHero uses the same drag &amp; drop interface so it\u2019s really easy to start using it. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">First, drag the EDF Funding field onto the canvas. 
<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6372 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Values-PowerBI.jpg\" alt=\"PowerBI EDF Funding Values\" width=\"352\" height=\"377\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Values-PowerBI.jpg 352w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/EDF-Funding-Values-PowerBI-280x300.jpg 280w\" sizes=\"auto, (max-width: 352px) 100vw, 352px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Next, drag &amp; drop the Project Category field.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-6366 size-full\" src=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/DataHero-Project-Funding-Category.jpg\" alt=\"DataHero Pie Chart Visualization\" width=\"624\" height=\"363\" srcset=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/DataHero-Project-Funding-Category.jpg 624w, https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/DataHero-Project-Funding-Category-300x175.jpg 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">As you can see, we received our answer. The largest share of the funding (35%) went to Restoration projects, and the rest was split equally among the other three categories.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">There are other, more complex, data visualization tools but we will stick with DataHero and Power BI for the time being as they offer the most features in their free plans. 
<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Let\u2019s recap the entire process of analyzing public data that\u2019s archived in PDF:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Find a relevant data set<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Use Able2Extract\u2019s Custom PDF to Excel feature \u00a0to convert it to Excel or CSV<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Clean the data in Excel and remove blank rows and cells<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Visualize the data using a tool like Power BI or DataHero<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">By now you should have a clear understanding of the entire process of analyzing public data and should be well on your way to using it to shape the future of journalism. The strategy is simple \u2014 just upload clean, high quality data and play around with it until you get what you are looking for. <\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>There is a general sense of helplessness when it comes to analyzing public data, especially as people think it involves insane amounts of statistical mastery and in-depth knowledge of complicated statistical software. 
This is especially nerve wracking for data journalists, who are keen on using data to write stories that can actually influence a certain &#8230; <a title=\"How To Analyze Open Data With Able2Extract, Power BI And DataHero\" class=\"read-more\" href=\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html\" aria-label=\"More on How To Analyze Open Data With Able2Extract, Power BI And DataHero\">Continue reading \u2192<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[332],"tags":[50,335,232,321,336,316,85,48,205,122],"class_list":["post-6356","post","type-post","status-publish","format-standard","hentry","category-tech-tips-tutorials","tag-able2extract","tag-big-data","tag-data-analysis","tag-data-extraction","tag-data-science","tag-data-visualization","tag-pdf-to-excel","tag-research-tools","tag-resources","tag-tutorial"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Analyze Open Data With Able2Extract, Power BI And DataHero<\/title>\n<meta name=\"description\" content=\"Learn how to work with open data, Able2Extract, Power BI and DataHero. 
This step by step tutorial shows you how to analyze open data archived in PDF.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Analyze Open Data With Able2Extract, Power BI And DataHero\" \/>\n<meta property=\"og:description\" content=\"Learn how to work with open data, Able2Extract, Power BI and DataHero. This step by step tutorial shows you how to analyze open data archived in PDF.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html\" \/>\n<meta property=\"og:site_name\" content=\"PDF Blog | Investintech PDF Solutions\" \/>\n<meta property=\"article:published_time\" content=\"2016-09-01T17:35:38+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2019-08-23T12:55:11+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats.jpg\" \/>\n<meta name=\"author\" content=\"Reena\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@able2extract\" \/>\n<meta name=\"twitter:site\" content=\"@able2extract\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html\"},\"author\":{\"name\":\"Reena\",\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/#\/schema\/person\/9d21ba7980d32dbd36069a4878f8e409\"},\"headline\":\"How To Analyze Open Data With Able2Extract, Power BI And DataHero\",\"datePublished\":\"2016-09-01T17:35:38+00:00\",\"dateModified\":\"2019-08-23T12:55:11+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html\"},\"wordCount\":1730,\"publisher\":{\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats.jpg\",\"keywords\":[\"Able2Extract\",\"big data\",\"data analysis\",\"data extraction\",\"data science\",\"data visualization\",\"PDF to Excel\",\"research tools\",\"resources\",\"tutorial\"],\"articleSection\":[\"Tech Tips and Tutorials\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html\",\"url\":\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html\",\"name\":\"Analyze Open Data With Able2Extract, Power BI And 
DataHero\",\"isPartOf\":{\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats.jpg\",\"datePublished\":\"2016-09-01T17:35:38+00:00\",\"dateModified\":\"2019-08-23T12:55:11+00:00\",\"description\":\"Learn how to work with open data, Able2Extract, Power BI and DataHero. This step by step tutorial shows you how to analyze open data archived in PDF.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#primaryimage\",\"url\":\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats.jpg\",\"contentUrl\":\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats.jpg\",\"width\":541,\"height\":332,\"caption\":\"Common Open Dataset 
Formats\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.investintech.com\/resources\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How To Analyze Open Data With Able2Extract, Power BI And DataHero\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/#website\",\"url\":\"https:\/\/www.investintech.com\/resources\/blog\/\",\"name\":\"PDF Blog | Investintech PDF Solutions\",\"description\":\"Everything PDF\",\"publisher\":{\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.investintech.com\/resources\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/#organization\",\"name\":\"PDF Blog | Investintech PDF Solutions\",\"url\":\"https:\/\/www.investintech.com\/resources\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2024\/12\/Investintech-apryse-logo-w270.webp\",\"contentUrl\":\"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2024\/12\/Investintech-apryse-logo-w270.webp\",\"width\":270,\"height\":40,\"caption\":\"PDF Blog | Investintech PDF 
Solutions\"},\"image\":{\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/able2extract\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/#\/schema\/person\/9d21ba7980d32dbd36069a4878f8e409\",\"name\":\"Reena\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.investintech.com\/resources\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/aceff76f1b124f7ffb271de50b78f12a7599655c7087ea3a656b61cf9a89c376?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/aceff76f1b124f7ffb271de50b78f12a7599655c7087ea3a656b61cf9a89c376?s=96&d=mm&r=g\",\"caption\":\"Reena\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Analyze Open Data With Able2Extract, Power BI And DataHero","description":"Learn how to work with open data, Able2Extract, Power BI and DataHero. This step by step tutorial shows you how to analyze open data archived in PDF.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html","og_locale":"en_US","og_type":"article","og_title":"Analyze Open Data With Able2Extract, Power BI And DataHero","og_description":"Learn how to work with open data, Able2Extract, Power BI and DataHero. 
This step by step tutorial shows you how to analyze open data archived in PDF.","og_url":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html","og_site_name":"PDF Blog | Investintech PDF Solutions","article_published_time":"2016-09-01T17:35:38+00:00","article_modified_time":"2019-08-23T12:55:11+00:00","og_image":[{"url":"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats.jpg","type":"","width":"","height":""}],"author":"Reena","twitter_card":"summary_large_image","twitter_creator":"@able2extract","twitter_site":"@able2extract","schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#article","isPartOf":{"@id":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html"},"author":{"name":"Reena","@id":"https:\/\/www.investintech.com\/resources\/blog\/#\/schema\/person\/9d21ba7980d32dbd36069a4878f8e409"},"headline":"How To Analyze Open Data With Able2Extract, Power BI And DataHero","datePublished":"2016-09-01T17:35:38+00:00","dateModified":"2019-08-23T12:55:11+00:00","mainEntityOfPage":{"@id":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html"},"wordCount":1730,"publisher":{"@id":"https:\/\/www.investintech.com\/resources\/blog\/#organization"},"image":{"@id":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#primaryimage"},"thumbnailUrl":"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats.jpg","keywords":["Able2Extract","big data","data analysis","data extraction","data science","data visualization","PDF to Excel","research 
tools","resources","tutorial"],"articleSection":["Tech Tips and Tutorials"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html","url":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html","name":"Analyze Open Data With Able2Extract, Power BI And DataHero","isPartOf":{"@id":"https:\/\/www.investintech.com\/resources\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#primaryimage"},"image":{"@id":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#primaryimage"},"thumbnailUrl":"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats.jpg","datePublished":"2016-09-01T17:35:38+00:00","dateModified":"2019-08-23T12:55:11+00:00","description":"Learn how to work with open data, Able2Extract, Power BI and DataHero. 
This step by step tutorial shows you how to analyze open data archived in PDF.","breadcrumb":{"@id":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#primaryimage","url":"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats.jpg","contentUrl":"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2016\/09\/Open-Datasets-Common-Formats.jpg","width":541,"height":332,"caption":"Common Open Dataset Formats"},{"@type":"BreadcrumbList","@id":"https:\/\/www.investintech.com\/resources\/blog\/archives\/6356-analyze-open-data-able2extract-powerbi-datahero.html#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.investintech.com\/resources\/blog\/"},{"@type":"ListItem","position":2,"name":"How To Analyze Open Data With Able2Extract, Power BI And DataHero"}]},{"@type":"WebSite","@id":"https:\/\/www.investintech.com\/resources\/blog\/#website","url":"https:\/\/www.investintech.com\/resources\/blog\/","name":"PDF Blog | Investintech PDF Solutions","description":"Everything 
PDF","publisher":{"@id":"https:\/\/www.investintech.com\/resources\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.investintech.com\/resources\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.investintech.com\/resources\/blog\/#organization","name":"PDF Blog | Investintech PDF Solutions","url":"https:\/\/www.investintech.com\/resources\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.investintech.com\/resources\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2024\/12\/Investintech-apryse-logo-w270.webp","contentUrl":"https:\/\/www.investintech.com\/resources\/blog\/wp-content\/uploads\/2024\/12\/Investintech-apryse-logo-w270.webp","width":270,"height":40,"caption":"PDF Blog | Investintech PDF 
Solutions"},"image":{"@id":"https:\/\/www.investintech.com\/resources\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/able2extract"]},{"@type":"Person","@id":"https:\/\/www.investintech.com\/resources\/blog\/#\/schema\/person\/9d21ba7980d32dbd36069a4878f8e409","name":"Reena","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.investintech.com\/resources\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/aceff76f1b124f7ffb271de50b78f12a7599655c7087ea3a656b61cf9a89c376?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/aceff76f1b124f7ffb271de50b78f12a7599655c7087ea3a656b61cf9a89c376?s=96&d=mm&r=g","caption":"Reena"}}]}},"_links":{"self":[{"href":"https:\/\/www.investintech.com\/resources\/blog\/wp-json\/wp\/v2\/posts\/6356","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.investintech.com\/resources\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.investintech.com\/resources\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.investintech.com\/resources\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.investintech.com\/resources\/blog\/wp-json\/wp\/v2\/comments?post=6356"}],"version-history":[{"count":13,"href":"https:\/\/www.investintech.com\/resources\/blog\/wp-json\/wp\/v2\/posts\/6356\/revisions"}],"predecessor-version":[{"id":9438,"href":"https:\/\/www.investintech.com\/resources\/blog\/wp-json\/wp\/v2\/posts\/6356\/revisions\/9438"}],"wp:attachment":[{"href":"https:\/\/www.investintech.com\/resources\/blog\/wp-json\/wp\/v2\/media?parent=6356"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.investintech.com\/resources\/blog\/wp-json\/wp\/v2\/categories?post=6356"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.investintech.com\/resources\/blog\/wp-json\/wp\/v2\/tags?post=6356"}],"curies":[{"name":"wp","href":"https:\/\/api.w.o
rg\/{rel}","templated":true}]}}