It’s been a few months since I last posted on this blog. As I am doing my PhD, I have been quite busy learning two things. First, since my background is chemistry, specifically crystal engineering, I have been busy transitioning towards the social sciences. There’s quite a lot of material I had to cover to be able to keep up with the latest areas in Business and Innovation studies. Second, having no programming background before, I had to spend some time learning the basics. I am happy with my progress in data science with languages such as Python, R and SQL, and other tools like Tableau. I will cover the pros and cons of learning programming as a social scientist, and how to actually learn it efficiently, in another post.
For now, I just want to share a Python script I wrote to convert Web of Knowledge text files to a DataFrame / CSV. This is useful if you want to check each publication manually in Excel before analysis in other bibliometric software such as VOSviewer and CitNetExplorer. I also provided a script to convert these back to the original Web of Knowledge format.
Update: This post is outdated. When I was starting with bibliometrics, I did not realize that you can download a CSV file directly from the Web of Science and that this file can be fed directly to VOSviewer and CitNetExplorer. Nonetheless, looking back, my lack of knowledge about this feature turned out to be a good thing, as it pushed me to start seriously learning to program in Python.
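For readers curious what such a conversion might look like, here is a minimal sketch. It is not the original script. It assumes the usual tagged plain-text export, where each field starts with a two-letter tag, continuation lines are indented, and every record ends with ER. The function names and the sample record are made up for illustration.

```python
import csv
import io

# Fields assumed to hold one value per line (authors, cited references).
MULTI_VALUE_TAGS = {"AU", "AF", "CR"}


def parse_wos_records(text):
    """Parse tagged plain text into a list of dicts, one per record."""
    records, current, tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        prefix, value = line[:2], line[3:].strip()
        if prefix == "ER":  # end of one record
            records.append(current)
            current, tag = {}, None
        elif prefix in ("FN", "VR", "EF"):  # file header and footer
            continue
        elif prefix.strip():  # a new two-letter field tag
            tag = prefix
            current[tag] = value
        elif tag is not None:  # indented continuation of the previous field
            separator = "; " if tag in MULTI_VALUE_TAGS else " "
            current[tag] += separator + value
    return records


def records_to_csv(records):
    """Write the records as CSV text, one column per field tag."""
    fields = sorted({key for record in records for key in record})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, restval="")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Invented sample in the assumed export layout.
sample = """FN Example export
VR 1.0
PT J
AU Smith, J
   Jones, A
TI An example title that wraps
   over two lines
PY 2015
ER

PT J
AU Lee, K
TI Another example
PY 2016
ER

EF
"""

records = parse_wos_records(sample)
print(records_to_csv(records))
```

If pandas is installed, pandas.DataFrame(records) should turn the same list of dicts into a DataFrame. A real export may start with a byte order mark, so opening the file with encoding="utf-8-sig" is probably the safer choice.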
Literature review can be a tedious process. With so many articles to read, new researchers in a field can find themselves stuck, trying to stay on top of all the readings required. In an effort to streamline the process, bibliometrics can be a powerful tool to make the article selection more efficient, adding a visual component to it.
Last November 10-11, I gave a talk on bibliometric methods at the 8th joint PhD workshop of VU Amsterdam and FH Munster. I got a really great response to my talk, with people asking me to make a manual on the topic. Though I only started using bibliometrics three months ago, I found learning the basics to be a very useful investment. In this post, I will try to create a simple manual on the basics of the method.
Benefits of Bibliometrics
Especially for researchers, here are some things you would be able to do after reading this post:
Get an overview of the important publications in your field of study
Generate a database of important researchers and institutes in your field
Visualize how your field is connected
Though there are many ways to do this, I found that using the Web of Science as the database, together with the bibliometric software VOSviewer and CitNetExplorer, has the gentlest learning curve. The process is generally composed of the following steps:
Downloading the articles from the database
Generating the maps using the software
Formulating the Keywords
The first part is just the regular literature search on the Web of Science. Most scientists will already be familiar with this, having done literature searches in the past. Though the basic search usually suffices, it is more efficient to learn how to use the advanced search with Boolean operators.
For example, suppose you are researching entrepreneurship in the Netherlands. You want to search for the terms entrepreneurship and Netherlands together. At the same time, you might want to include related words like business or industry, and even the words Holland and Dutch. With these in mind, your keyword search could be:
TS = ((entrepreneurship OR business OR industry) AND (Netherlands OR Holland OR Dutch))
This yielded 3,381 results as of Nov 2016. You can then take a preliminary look at the results. At this point, you can decide to reformulate the keywords or stick with the results. The good thing is that you can easily change your keywords if the list of articles fails to reflect your intended outcome.
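If you find yourself reformulating the keywords often, the query string can also be assembled with a few lines of Python. This is only a small sketch. The function name build_topic_query is made up for illustration, and the result still has to be pasted into the advanced search box by hand.

```python
def build_topic_query(*term_groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    groups = ["(" + " OR ".join(group) + ")" for group in term_groups]
    return "TS = (" + " AND ".join(groups) + ")"


query = build_topic_query(
    ["entrepreneurship", "business", "industry"],
    ["Netherlands", "Holland", "Dutch"],
)
print(query)
```

Adding or removing a synonym then means editing one list instead of rebalancing the parentheses by hand.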
Downloading the articles
This part is the easiest yet most tedious. The problem with the Web of Science is that you can only download the data for 500 articles at a time. Thus, if you have 3,000 articles, you have to repeat the saving process 6 times.
On the results page, click the down arrow beside ‘Save to EndNote online’ and then click ‘Save to Other File Formats’.
Afterwards, save the first 500 records by typing 1 to 500 in the records field. For the record content, choose the option that includes the cited references. Finally, click send.
You will then have a text file containing information about the first 500 records.
To save the next 500, click again the down arrow and save records 501-1000, 1001-1500 and so on.
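To keep track of which ranges to type in, the batches can be listed with a short sketch like the one below. It assumes the limit of 500 records per export described above, and the function name export_batches is made up for illustration.

```python
def export_batches(total_records, batch_size=500):
    """Return (first, last) record numbers for each export, counting from 1."""
    return [
        (start, min(start + batch_size - 1, total_records))
        for start in range(1, total_records + 1, batch_size)
    ]


# Example with the 3,381 results from the keyword search.
for first, last in export_batches(3381):
    print(f"{first} to {last}")
```

For 3,381 results this gives seven ranges, the last one being 3001 to 3381.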
Using the Software
With the articles downloaded, it is now possible to analyze them with the software. Download CitNetExplorer. It’s just a matter of loading the text files into the program, which automatically generates a map of the most cited papers in your set. The software is smart enough that even if an article does not contain the keywords you used, it can still be included if it is cited a lot by the papers in your database.
More importantly, it also shows the connections among these papers. Through this, one can infer how the field developed and how ideas have evolved over time. By being able to visualize how these papers are related to one another, doing a literature review becomes a little bit easier.