WordPress blog analysis in Power BI

I have created an example Power BI application that lets you analyse and visualize your WordPress blog, including text analysis. You can see the results below. If you would like to get statistics for your own blog, please see the “Try it at your own!” section. The “What’s cool?” and “What’s inside?” sections describe the data model and the included reports in more detail. Enjoy.

The direct link is: here

What’s cool?

The most interesting thing is that you can use almost all of the data stored in the WordPress database to build your own reports exactly as you wish, using all of the Power BI functionality. Additionally, a few interesting problems have been solved with Power Query. The current version contains the following tables:

  • X Calculations
  • Authors
  • Category
  • Comments
  • Content
  • Post date
  • Tag
  • Text analysis
The “X Calculations” table contains all of the measures that you can use to visualize your blog. I have added a dozen or so measures that can be calculated from the collected data, including categories count, comments count, content attachments count, content published post count, tag count, tags distinct count, words count and many more. The two most interesting parts of the model are the “Text analysis” table and the comment country attribute, which is calculated from the IP address. For the text analysis, the text of every piece of content was split into single words, and each word was checked to see whether it is a stop word. Because the content table is related to the text analysis table, you can view the text analysis for a single post as well as for the whole blog.
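
As a rough illustration of the idea behind the “Text analysis” table (this is a Python sketch, not the Power Query code used in the project), the snippet below splits a post's text into words, flags stop words and counts the most frequent remaining words. The function names, the sample sentence and the tiny stop-word set are all made up for the example; in the project the stop words come from the “stop_words” config parameter.

```python
import re
from collections import Counter

# Hypothetical stop-word set, used only for this example.
STOP_WORDS = {"the", "a", "is", "and", "of", "in"}


def analyse_text(post_id, text, stop_words=STOP_WORDS):
    """Split text into lowercase words and flag each one as stop word or not."""
    # [^\W\d_]+ matches runs of letters, including non-ASCII ones such as Polish.
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [
        {"post_id": post_id, "word": w, "is_stopword": w in stop_words}
        for w in words
    ]


def top_words(rows, n=10, include_stopwords=False):
    """Return the n most frequent words, optionally keeping stop words."""
    counts = Counter(
        r["word"] for r in rows if include_stopwords or not r["is_stopword"]
    )
    return counts.most_common(n)


rows = analyse_text(1, "Power BI is the tool. The tool reads the blog.")
print(top_words(rows, n=3))
```

Keeping one row per word together with the post id mirrors the relationship between the content table and the text analysis table: the same rows can be filtered to a single post or aggregated over the whole blog.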

What’s inside?

Based on the data described above, I have created a few sample reports that show blog summaries and the capabilities of Power BI. Here is a short description of the included reports.

  • Posts by category and author, filtered by year.

1

  • Tags count as a word cloud filtered by category

2

  • Content text word count as a word cloud by category. The additional filter “Text word is stopword” lets you show only the “interesting” words

3

  • Top 10 most used words (based on word counts across all content texts) by category. The chart shows the number of occurrences of each word. The additional filter “Text word is stopword” lets you show only the “interesting” words

4

  • Authors summary report that shows how many authors write for the blog and how much content of each type they publish

5

  • Comment summary that shows how many comments are marked as spam and which country each comment author is from, based on the author's IP address

6

  • Content of each content type by date

7

  • Content posting details, showing when content is published on the blog, split by author

8

  • Single content tracking, which can be used to view all of the content elements related to a single post or page. It also provides some statistics about content type usage

9

  • An empty report template that you can use to create your own reports. Please see the details below

10
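
The comment summary above depends on turning a commenter's IP address into a country. The sketch below shows one possible way to do such a lookup in Python; it is not the Power Query logic from the project. It assumes the “GeoIPCountryWhois” file uses the legacy CSV layout of start IP, end IP, start as integer, end as integer, country code, country name. The two sample rows use reserved documentation address ranges with made-up countries, and the function names are hypothetical.

```python
import bisect
import csv
import io
import ipaddress

# Made-up sample rows in the assumed layout:
# start IP, end IP, start as integer, end as integer, country code, country name.
SAMPLE_CSV = '''"192.0.2.0","192.0.2.255","3221225984","3221226239","PL","Poland"
"198.51.100.0","198.51.100.255","3325256704","3325256959","GB","United Kingdom"
'''


def load_ranges(csv_text):
    """Parse the CSV into a sorted list of (start, end, code) integer ranges."""
    ranges = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        start_ip, end_ip, _start_int, _end_int, code, _name = row
        start = int(ipaddress.IPv4Address(start_ip))
        end = int(ipaddress.IPv4Address(end_ip))
        ranges.append((start, end, code))
    ranges.sort()
    starts = [r[0] for r in ranges]
    return starts, ranges


def country_for_ip(ip, starts, ranges):
    """Return the country code whose range contains ip, or None if unknown."""
    value = int(ipaddress.IPv4Address(ip))
    # Find the last range that starts at or before the address.
    i = bisect.bisect_right(starts, value) - 1
    if i >= 0 and ranges[i][0] <= value <= ranges[i][1]:
        return ranges[i][2]
    return None


starts, ranges = load_ranges(SAMPLE_CSV)
print(country_for_ip("192.0.2.17", starts, ranges))
```

Sorting the ranges once and using a binary search keeps each lookup fast even when the file holds many ranges, which matters when every comment has to be matched.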

 

Try it at your own!

If you would like to try this application with your own blog, you can get the Power BI Desktop project and all required files from: here. To run and configure the application, please follow the steps below. Follow the general description if you are familiar with Power BI and Power BI Desktop, or the detailed description if you are new to Power BI.

Please note that you will need the MySQL Connector. You can find the drivers here: http://dev.mysql.com/downloads/file/?id=412152

Also please note that refreshing the data can take several minutes (because of the complexity of the text analysis). In my case it takes a little over 3 minutes.

General description

You only need to modify the XML config file included in the files, modify one line in the Power BI scripts, provide the database user name and password, and import the custom visuals into Power BI Desktop. In the config file you have to modify the following elements (the parameter_value of each parameter):

  • server_address
  • server_port
  • database_name
  • GeoIPCountryWhois
In these parameters please provide the connection details for your WordPress database. You will be asked for your login and password during the first connection. You can also modify the “stop_words” parameter. In text analysis, stop words are the unnecessary words that we would like to remove or mark during the analysis. For now the list contains English and Polish stop words, but you can add as many more as you like. The queries and tables parameters should not be changed. In the “GeoIPCountryWhois” parameter, paste the URL of the file with this name that is included in the project directory.
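
To make the parameter list more concrete, here is a small Python sketch that reads a config of this kind and reports which required values are still empty. The XML layout shown (a “config” root with “parameter” elements holding “parameter_name” and “parameter_value”) is only a guess for illustration, and all values are placeholders; please check the config file shipped with the project for its real structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical config layout and placeholder values, for illustration only.
SAMPLE_CONFIG = """<config>
  <parameter>
    <parameter_name>server_address</parameter_name>
    <parameter_value>localhost</parameter_value>
  </parameter>
  <parameter>
    <parameter_name>server_port</parameter_name>
    <parameter_value>3306</parameter_value>
  </parameter>
  <parameter>
    <parameter_name>database_name</parameter_name>
    <parameter_value>wordpress</parameter_value>
  </parameter>
  <parameter>
    <parameter_name>GeoIPCountryWhois</parameter_name>
    <parameter_value>C:/wordpress-powerbi/GeoIPCountryWhois.csv</parameter_value>
  </parameter>
</config>"""

# The four parameters the post asks you to fill in.
REQUIRED = ("server_address", "server_port", "database_name", "GeoIPCountryWhois")


def read_config(xml_text):
    """Return a dict mapping each parameter name to its stripped value."""
    root = ET.fromstring(xml_text)
    return {
        p.findtext("parameter_name"): (p.findtext("parameter_value") or "").strip()
        for p in root.iter("parameter")
    }


def missing_parameters(config):
    """List the required parameters that are absent or empty."""
    return [name for name in REQUIRED if not config.get(name)]


config = read_config(SAMPLE_CONFIG)
print(missing_parameters(config))
```

A check like this can be run before opening the project, so that a forgotten value shows up before the first data refresh.
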
After opening the project, please go to the queries, select “functionReturnConfigPath” and open the Advanced Editor, where you will find the following:

In this line please provide the URL of your config file. Then, in the Query Editor, go to Data Source Settings and provide your login and password for the MySQL database.

The last thing to do before you start is to import all of the custom visuals stored in the “CustomVisuals” folder.

After these steps you should be able to refresh and get your own statistics.

Detailed configurations description

If you have never used Power BI, you can still try this on your own at no cost. First of all, you will need to get Power BI Desktop from http://powerbi.microsoft.com/ and, if you want to share and embed the report on the web, you will also need a free account at www.powerbi.com. Please note that in the free version of powerbi.com you will not be able to refresh the data automatically. After downloading Power BI Desktop, please install it. Your environment will then be ready and you can set up the project.

As described above, you now need to modify the XML config file included in the files, modify one line in the Power BI scripts and import the custom visuals into Power BI Desktop. In the config file you have to modify the following elements (the parameter_value of each parameter):

  • server_address
  • server_port
  • database_name
  • GeoIPCountryWhois
In these parameters please provide the connection details for your WordPress database. You will be asked for your login and password during the first connection. You can also modify the “stop_words” parameter. In text analysis, stop words are the unnecessary words that we would like to remove or mark during the analysis. For now the list contains English and Polish stop words, but you can add as many more as you like. The queries and tables parameters should not be changed. In the “GeoIPCountryWhois” parameter, paste the URL of the file with this name that is included in the project directory.
Now you can open the Power BI Desktop project from the downloaded files. After opening the file, please go to “Home” and “Edit Queries”, select “functionReturnConfigPath” from the query list on the left, open the “Advanced Editor” and modify the following line:

In this line please provide the URL of your config file. You will also need to provide the login and password for your WordPress database. To enter your credentials in the Query Editor (where you should be now), please go to “File”, then “Options and settings”, select the MySQL database and click “Edit”. In the window that opens you can provide the login and password for your database. By clicking “File” and then “Apply and close” you can refresh the source data and download the data from your WordPress database. Please note that some of the visualizations may not be available yet, because the last thing you need to do is import the custom visuals that are not available out of the box. To do this, on the “Report” tab select the “…” icon in the visualizations pane on the right and import each of the files included in the project directory. Now you can refresh and use your reports.

Enjoy!

Please use it and share your results on the web and on your blogs, but please remember to credit seequality.net as the source and add a comment with a link to this post. If you find any errors or bugs, you can email me directly at slawomirdrzymala@seequality.net or just write a comment. Enjoy!
