From time to time, I get messages from students asking for career advice. In the academia community on Reddit, there are also daily posts asking how to search for jobs or find mentors. If I had one piece of advice for highly motivated students who might not have the best network or guidance to help them on their path, it is this: Don’t be afraid to send cold emails to successful people.
Ever since I came across this idea, I have been applying it in my own career. Whenever I read articles or listen to podcasts that are fascinating, I reach out to the authors just to say thank you. Sometimes, if I feel it’s relevant, I might even ask for career advice. Believe it or not, most of them actually respond. So far, I have around an 85% success rate. People are generally gracious and willing to help others if what you ask is not too inconvenient for them. The worst thing that can happen is that they do not reply or your email goes to spam. The best case, however, is that they see your interest and motivation, and you get on their radar. If they sense some opportunity, they might remember you and reach out to you.
If you want to go further and really impress them, you can even create a so-called pre-interview project. It’s a small project that you can do in a short amount of time to impress a possible collaborator, future employer, etc. Especially in this noise-filled world, if you really want to be recognized, you have to find a way to differentiate yourself from the rest.
In the first month of my PhD, I came across the article by Hambrick and Chen on how academic fields develop. Briefly, they described three processes that fields have to go through to gain acceptance: differentiation, legitimation and mobilization. The first, differentiation, means that a field should try to set itself apart from other existing fields. At the same time, however, it should not stray too far, as it still has to build legitimacy and gain recognition from the wider scientific community. Mobilization refers to the field’s ability to mobilize resources. As I see it, this last process basically serves as the fuel that advances the other two. At the time, I found this study interesting but set it aside for the next two years, not knowing how to incorporate it in my work.
In the past few months, I’ve been thinking of a way to unify the different studies that I’ve been conducting for my PhD. One day, I randomly came across the article again. I then realized that the dynamic of conforming and differentiating occurs everywhere. It occurs not only at the field level but also at other levels; not only in scientific development but also in many other facets of life. At the individual level, all of us, in one way or another, conform to the communities we belong to while at the same time trying to make ourselves stand out. To contribute, we try to bring something new or unique to the groups that we are in. Upon looking further, I found that this is known in psychology circles as the theory of optimal distinctiveness, which was described as early as 1991.
Recently, this concept has also been gaining more interest at the organizational level, as in this review by Zhao et al. Firms cannot compete only by conforming with other firms in their market; they also have to differentiate themselves from other players in their area. As there can be different dimensions of comparison across firms, the balancing act between conforming and differentiating can be complex. Research is therefore of value in exploring how to orchestrate such dynamics effectively.
Having studied for the past two years how new scientific fields develop, I have noticed the recurring theme of conformity vs. differentiation. Fields have to manage these forces if they want to be established. Big firms have to be aware of these forces if they want to stay relevant. New entrepreneurial firms also have to balance the two to gain resources and find customers. Teams within firms, to stay innovative, have to be in touch with what their colleagues are doing while at the same time bringing new things to the table. Researchers and managers also have to practice optimal distinctiveness.
Last week, I was having lunch with a colleague and she asked me, “What is your research niche?” The answer did not come easily because it was something I had been struggling with. Ever since I started my PhD, I have been exploring various perspectives without trying to settle on a specific scientific field. Working at the interface of many fields, including pharmaceutical sciences, innovation studies, scientometrics, sociology of science and management, I did not want to settle on one field, afraid that it would lock me in. After all, it is an important decision, as it affects the future career opportunities I could pursue in academia. The answer should be something that I am greatly interested in, something that I can stand exploring for the rest of my academic career. At the same time, it should be something that has an exciting future ahead of it.
Finding the field I identify with was a journey, picking up nuggets along the way. One was collaborating with a supervisor whose expertise is bibliometrics. This exposed me to journals like Scientometrics, whose new issues always interested me because the studies get creative in analyzing various texts. I was also very interested in data science, curious about the new techniques people apply to analyze and present data. At the same time, I was fascinated with how new academic fields start. This article by Hambrick and Chen, “New academic fields as admittance-seeking social movements: The case of strategic management”, was one of the first that I read on the topic. With my interest forming in this direction, I had to read Kuhn’s The Structure of Scientific Revolutions, which confirmed my interest in this field, whatever it is called.
All of these nuggets seemed unrelated at first, but they were pointing towards something. The problem was that I did not know what to call the field I was interested in. Fortunately, a review published recently in Science by Fortunato et al. helped me. In their review, they were able to put into words what I was really excited about studying in my future academic career. It is called the “Science of Science.” They viewed the science of science as “a transdisciplinary approach that uses large data sets to study the mechanisms underlying the doing of science.”
Given what I am doing now, studying how fragment-based drug discovery emerged as a scientific field, this review deeply resonated with me. I encourage everyone else, even those not from my field, to read it, as it is very fascinating.
There has been a growing push for scientists to interact with the greater public. As a young scientist hoping to break into the field, it is important to take every opportunity to get exposure. Last year, I had a great opportunity to be featured in Nature Biotechnology. Although it ended up being just one line, the experience of being interviewed was a nice opportunity to understand how science journalism works and, more importantly, to share the research I am doing with my target audience of drug discovery practitioners. In this brief blog post, I will share how it happened.
Last year, a drug from the firm Astex got approved. Although the drug was not derived from the approach I am studying for my PhD, Astex happens to be one of the pioneers of that approach, called fragment-based drug discovery (FBDD). With this news, the person writing the article, Mark Peplow, was looking around for more information about Astex and FBDD. With some luck, he came across our consortium website (Fragnet.EU) and found the research that I was carrying out, which was about just that: the development of the approach. He first reached out to my supervisor Peter. However, realizing how great an opportunity it would be for me, my supervisor, who was very supportive, decided to direct him to me.
With that, a time was set for our interview. Before the actual interview, I prepared a little by reviewing the numbers I had with regard to collaboration in FBDD. At first I was a little nervous, since I had never been interviewed before. However, with time, I eased up and just talked about all the things I knew about FBDD. It probably ended up being a 30-minute call as I talked about various facets of collaboration. It was a really pleasant experience overall. At the end, he told me when the article would be published and said that he would let me know once it was out.
When starting a PhD program, or any research project for that matter, one of the first things that you have to do is the literature review. When I first started carrying out the review, I found searching the literature and organizing the readings to be excruciating. Where do you begin? In what order should you read your articles? Where do you stop reading? After delving into bibliometrics, I found that using its tools is really helpful for making the literature review less painstaking and more efficient. In this post, I will list my ideas on how various bibliometric techniques can aid in this task.
One of the first things one has to do is to download the literature. Many researchers carry this out by using Google Scholar and searching the keywords that they are familiar with. The problem with this process, however, is that beginning researchers especially would not know all the relevant keywords in the first place and would thus exclude a lot of important papers. More advanced researchers can resort to the Web of Science or Scopus and apply various Boolean operators to narrow or widen their search. But still, the problem persists: how can you ensure that you have not excluded valuable articles that do not use the keywords in their title, abstract or author-identified keywords?
Bibliometrics has an approach that can be helpful. To ensure that your collection of articles is comprehensive, you can grow that collection from a seed of articles. To do so, you first download a set of articles through keywords that you are sure are related to your topic of interest. After downloading the data for this set of articles, you can grow the set by downloading the articles it frequently cites. One can set a minimum number of citations an article should have before it is downloaded. This can easily be done through software like CitNetExplorer, which exports the DOIs.
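To make the threshold idea concrete, here is a minimal sketch in Python. It is not how CitNetExplorer works internally; it only assumes you already have, for each seed paper, a list of the works it cites. The function name, the paper IDs and the threshold value are all made up for illustration.

```python
from collections import Counter

def frequently_cited(seed_references, min_citations=2):
    """Return the references cited by at least `min_citations` seed papers."""
    counts = Counter()
    for refs in seed_references.values():
        # Count each reference once per citing paper, even if listed twice.
        counts.update(set(refs))
    return {ref: n for ref, n in counts.items() if n >= min_citations}

# Hypothetical seed set: paper ID -> IDs of the works it cites
seed = {
    "seed_A": ["ref_1", "ref_2", "ref_3"],
    "seed_B": ["ref_2", "ref_3"],
    "seed_C": ["ref_3", "ref_4"],
}

to_add = frequently_cited(seed, min_citations=2)
print(to_add)  # ref_2 (cited twice) and ref_3 (cited three times) pass
```

Raising `min_citations` keeps the grown set tighter, while lowering it pulls in more peripheral work.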
Extending this further, another step one can take is to download the citing articles. This is especially helpful for fields where advances are constantly occurring, making it difficult to track the keywords being used. It also allows one to identify the adjacent fields that the original field is extending to. This step can easily be done through the citation report feature of the Web of Science. As a caveat though, one should set a threshold on how many papers from the original dataset a citing paper must reference before it is added, to ensure that all the added papers are still relevant. The threshold can be absolute or relative. For instance, one could require that a paper cite at least 5 papers from the original dataset, or that at least 30% of its references point to that dataset. One should also consider the journal and category the article belongs to.
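The absolute and relative thresholds can be combined in a few lines. This is only a sketch under my own assumptions: the function name is invented, the defaults simply mirror the 5-paper and 30% figures mentioned above, and the reference lists are made up.

```python
def is_relevant(references, original_ids, min_count=5, min_share=0.30):
    """Keep a citing paper if it cites enough of the original dataset,
    either in absolute terms or as a share of its own reference list."""
    refs = set(references)
    if not refs:
        return False
    overlap = len(refs & original_ids)
    return overlap >= min_count or overlap / len(refs) >= min_share

# Hypothetical original dataset with ten papers, p1 to p10
original = {f"p{i}" for i in range(1, 11)}

# A long review: 5 of its 40 references are in the original dataset
review = [f"p{i}" for i in range(1, 6)] + [f"x{i}" for i in range(35)]
# A short note: 2 of its 4 references are in the original dataset
note = ["p1", "p2", "y1", "y2"]
# An unrelated paper: 1 of its 20 references is in the original dataset
unrelated = ["p1"] + [f"z{i}" for i in range(19)]

for name, refs in [("review", review), ("note", note), ("unrelated", unrelated)]:
    print(name, is_relevant(refs, original))
```

The review passes on the absolute count and the note passes on the relative share, while the unrelated paper fails both tests.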
Organizing your Papers
Having downloaded the papers, it is now important to organize them by topic. To help identify the subtopics within your main topic, you can create a rough co-occurrence map of the keywords. This can be carried out through software like VosViewer. The map shows you the different keywords used in your literature and how related they are to each other.
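Underneath a co-occurrence map is a simple count of how often two keywords appear on the same paper. The sketch below shows only that counting step, with made-up keyword lists; a mapping tool does much more on top of it (normalization, layout, clustering), so treat this as an illustration of the idea and not a description of VosViewer itself.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(keyword_lists):
    """Count how many papers each pair of keywords appears on together."""
    pairs = Counter()
    for keywords in keyword_lists:
        # Lowercase and deduplicate so "NMR" and "nmr" count as one keyword.
        unique = sorted({k.strip().lower() for k in keywords})
        pairs.update(combinations(unique, 2))
    return pairs

# Hypothetical keyword lists, one per paper
papers = [
    ["fragment-based drug discovery", "NMR", "screening"],
    ["Fragment-based drug discovery", "screening"],
    ["NMR", "crystallography"],
]

counts = cooccurrence_counts(papers)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs with high counts tend to end up close together on the map, which is what lets you read subtopics off it.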
A more direct way of organizing the papers is by plotting the bibliographic coupling network of the publications. This plot positions papers according to how related they are to each other, based on the references they share.
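Bibliographic coupling boils down to counting shared references between each pair of papers. Here is a minimal sketch of that calculation, assuming you have each paper's reference list; the IDs are invented and the function is my own, not taken from any particular tool.

```python
from itertools import combinations

def bibliographic_coupling(references):
    """Return {(paper_a, paper_b): number of references they share}."""
    edges = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(set(references[a]) & set(references[b]))
        if shared:  # leave out pairs with nothing in common
            edges[(a, b)] = shared
    return edges

# Hypothetical reference lists
refs = {
    "paper_A": ["r1", "r2", "r3"],
    "paper_B": ["r2", "r3", "r4"],
    "paper_C": ["r5"],
}

edges = bibliographic_coupling(refs)
print(edges)  # paper_A and paper_B share two references
```

The resulting weighted pairs are the edges of the coupling network: the more references two papers share, the closer they are drawn.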
Now that you have organized your papers, there are many ways to read them according to your preference. I propose subdividing them into core papers and current papers. You can then read the core papers first to understand the foundations of the field. These core papers are identified by a high citation count within your set of papers. The current papers, on the other hand, show the current trends in the field. These are identified by looking at the latest publications in the top journals in your field. These journals can be identified by combining measures of citations, number of relevant articles and relatedness of keywords.
To carry out the actual literature review, everyone has their own system. I fortunately have found something that works for me. It involves combining Microsoft Access with a qualitative data analysis software like Atlas.Ti. I plan to share my system in the coming weeks.
NOTE: This is draft#1 and is still under revision.
At the Science, Business and Innovation department at VU Amsterdam, students frequently need to assess the strategies of various high tech firms. In this post, I will outline a basic toolkit that academic researchers can use to draw and analyze two basic networks of a company: the knowledge network and the collaboration network. The collaboration network refers to the explicit partnerships that members of a firm have with other institutions. It is usually obtained by looking at the co-authorships in a firm’s works. The knowledge network, meanwhile, relates to the sources of knowledge that a firm uses in its own innovation. It can be traced by looking at the citations in a company’s output. The main difference between the two networks is that a company does not have to formally partner with another organization in order to learn from it; it can also do so by tracking the other company’s activities or through informal social networks. This form of learning is manifested not through co-authorships but through citations. By analyzing the citation network, we can see whether this knowledge relationship is one-sided or whether both companies cite each other’s works.
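As a small illustration of the one-sided versus mutual distinction, the sketch below labels the relationship between two firms from citation counts. The firm names, the counts and the function itself are made up; in practice the counts would come from the cited references in each firm's publications or patents.

```python
def knowledge_relationship(citation_counts, firm_a, firm_b):
    """Label the knowledge relationship between two firms.

    `citation_counts[(x, y)]` is how often firm x's output cites firm y's.
    """
    a_cites_b = citation_counts.get((firm_a, firm_b), 0)
    b_cites_a = citation_counts.get((firm_b, firm_a), 0)
    if a_cites_b > 0 and b_cites_a > 0:
        return "mutual"
    if a_cites_b > 0 or b_cites_a > 0:
        return "one-sided"
    return "none"

# Hypothetical counts: (citing firm, cited firm) -> number of citations
citation_counts = {
    ("Firm X", "Firm Y"): 12,
    ("Firm Y", "Firm X"): 3,
    ("Firm X", "Firm Z"): 7,
}

print(knowledge_relationship(citation_counts, "Firm X", "Firm Y"))
print(knowledge_relationship(citation_counts, "Firm X", "Firm Z"))
```

Here Firm X and Firm Y cite each other, while Firm X draws on Firm Z without being cited back.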
In order to draw the various networks of a high tech firm, the first step is simply to look at the company website. It usually has tons of information about a company already. It shows its founders, its services and perhaps even its collaborations. With basic company information known, it is now possible to draw various network maps either by looking at the firm’s patents or publications.
One of the things I would check first, especially for a high tech startup, is the publication set of the company. High tech startups publish for a variety of reasons, such as marketing, sometimes using the publication as a signal to investors that the company is innovative. Moreover, if a company is a pioneer in a field, publishing can help it gain legitimacy for the emerging field that it is part of. Using the Web of Science or Scopus, one could do a basic search of the firm name. In Web of Science’s advanced search, you could use the tags OO for organization, OG for organization-enhanced and AD for address. I prefer to use the address tag, as the database’s preprocessing algorithm can sometimes modify the names of companies. However, you might not be able to find any publications because the company has just started and has not yet carried out any activity under its own name. In such cases, especially for academic spinoffs, you can resort to searching the founders’ names. For many startups based in academia, the founder might still be affiliated with the university, so most of the company’s publications remain tied to the originating university’s name.
The other logical thing to search would be patents. I have found the PatentsView platform, which covers the USPTO, to be a very reliable source for patents. Its API allows automated downloading of patents from the website (you just need to read the documentation found on the website). As with publications, if the patents cannot be found through a company search, they might be registered under the university or under the founder’s name alone.
Through these two methods, various interesting analyses can be carried out. To draw the knowledge network, I would look at the cited works of the publications/patents of the company. For publications, this can easily be done through the cited works/authors feature in VosViewer. For patents, however, preprocessing should be done to format the cited works, which can be fed to programs such as VosViewer / Gephi / Pajek.
To draw the collaboration network, we have to look at the co-authorships of publications or patents. Once again, this can easily be carried out with VosViewer for publications but preprocessing should be done for patents.
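For a sense of what the co-authorship step computes, here is a minimal sketch that counts how often a focal firm appears on the same publication as each other organization. It assumes the author affiliations have already been extracted and cleaned into one list of organizations per publication, which in practice is most of the work. All names are invented.

```python
from collections import Counter

def collaboration_partners(publications, focal_firm):
    """Count co-authored publications between the focal firm and each partner."""
    partners = Counter()
    for organizations in publications:
        unique = set(organizations)  # one count per publication
        if focal_firm in unique:
            partners.update(unique - {focal_firm})
    return partners

# Hypothetical affiliation lists, one per publication
publications = [
    ["Firm X", "University A"],
    ["Firm X", "University A", "Firm Y"],
    ["University B", "Firm Y"],
    ["Firm X"],
]

partners = collaboration_partners(publications, "Firm X")
for organization, n in partners.most_common():
    print(organization, n)
```

The counts can serve as edge weights when the network is drawn in a tool such as Gephi or Pajek.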
I attended the European Scientometrics Summer School last Sept. 16-23 in Berlin. For those not familiar with the field, scientometrics refers to the analysis of scientific publications through various statistical methods. As the amount of scientific output increases, scientometricians are needed to organize and make sense of all the data being generated. I found the talks very engaging, as they gave a tour of the methods in the field and their various applications. The organizers did a good job of providing a theoretical background for the various concepts used in bibliometric analysis while balancing it with computer laboratory sessions where we applied the concepts learned. I greatly appreciate how they made sure that we take measures like citations, impact factor, keyword usage, etc. with a grain of salt.
The discussion that caught my attention the most was on the merit of citations. I think, generally, people tend to take citations for granted. Many academics consider citations the currency of science. They are almost the measure of a scientist’s worth. The thing, however, is that citations are affected by so many factors that great care should be taken in their analysis. They vary per field and per subfield and, as noted many times before, have a bias towards English publications. I particularly enjoyed this list of 15 reasons to cite another person’s work, as presented by Sybille Hinze from DZHW Germany:
Paying homage to pioneers
Giving credit for related work (homage to peer)
Identifying methodology, equipment, etc.
Providing background reading
Correcting one’s own work
Correcting the work of others
Criticising previous work
Substantiating claims
Alerting to forthcoming work
Providing leads to poorly disseminated, poorly indexed, or uncited work
Authenticating data and classes of facts – physical constants, etc.
Identifying original publications in which an idea or concept was discussed
Identifying original publications or other work describing an eponymic concept or term
Disclaiming work or ideas of others (negative claim)
Disputing priority claims of others (negative homage)
Source: Weinstock, M. (1971). Citation Indexes. In: Encyclopedia of Library and Information Science. Vol. 5, p. 16-40, Marcel Dekker Inc., New York.
It’s been a few months since I last posted on this blog. As I am doing my PhD, I have been quite busy learning two things. First, since my background is in chemistry, specifically crystal engineering, I have been busy transitioning towards the social sciences. There’s quite a lot of material I had to cover to be able to keep up with the latest areas in Business and Innovation studies. Second, having no programming background, I had to spend some time learning the basics. I am happy with my progress in data science with languages such as Python, R and SQL and other tools like Tableau. I will cover the pros and cons of learning programming as a social scientist, and how to actually learn it efficiently, in another post.
For now, I just want to share a Python script I made to convert Web of Knowledge text files to a DataFrame / CSV. This is useful if you want to check each publication manually in Excel before analysis in other bibliometric software such as VosViewer and CitNetExplorer. I also provided a script to convert these back to the original Web of Knowledge format.
Update: This post is outdated. When I was starting with bibliometrics, I did not realize that you can download a CSV file directly from the Web of Science and this file can be fed directly to VosViewer and CitNetExplorer. Nonetheless, looking back, my lack of knowledge about this feature turned out to be a good thing as it pushed me to start learning seriously to program in Python.
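For readers curious what such a conversion involves, here is a rough sketch. It is not the script mentioned above. It assumes the tagged plain-text layout that these exports commonly use: a two-letter field tag at the start of a line, continuation lines indented with spaces, ER closing each record and EF closing the file. The sample text is made up, and I use the standard csv module so that the example has no dependencies; the same rows could just as well be passed to a pandas DataFrame.

```python
import csv
import io

# Assumption: these fields hold one entry per line (authors, addresses, references).
MULTI_VALUE_TAGS = {"AU", "AF", "C1", "CR"}

def parse_wos(text):
    """Parse a tagged plain-text export into a list of dicts, one per record."""
    records, current, tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[:2].strip() == "":  # indented line: continues the previous field
            if tag is not None:
                current[tag].append(line.strip())
            continue
        tag, value = line[:2], line[2:].strip()
        if tag == "ER":  # end of record
            records.append(current)
            current, tag = {}, None
        elif tag in {"FN", "VR", "EF"}:  # file-level lines, not part of a record
            tag = None
        else:
            current.setdefault(tag, []).append(value)
    return [
        {t: ("; " if t in MULTI_VALUE_TAGS else " ").join(v) for t, v in rec.items()}
        for rec in records
    ]

def to_csv(rows):
    """Write the parsed records to CSV text, one column per field tag."""
    fields = sorted({t for row in rows for t in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up sample in the assumed layout
sample = """FN Example Export
VR 1.0
PT J
AU Smith, J
   Doe, A
TI A made-up title that wraps
   onto a second line
PY 2016
ER

PT J
AU Lee, K
TI Another made-up title
PY 2015
ER

EF
"""

rows = parse_wos(sample)
csv_text = to_csv(rows)
print(csv_text)
```

Joining authors with a semicolon keeps them separable later, since author names themselves contain commas.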
Literature review can be a tedious process. With so many articles to read, new researchers in a field can find themselves stuck, trying to stay on top of all the readings required. In an effort to streamline the process, bibliometrics can be a powerful tool to make the article selection more efficient, adding a visual component to it.
Last November 10-11, I gave a talk on bibliometric methods at the 8th joint PhD workshop of VU Amsterdam and FH Munster. I got a really great response to my talk, with people asking me to make a manual on the topic. Though I only started using bibliometrics three months ago, I found learning the basics to be a very useful investment. In this post, I will try to create a simple manual on the basics of the method.
Benefits of Bibliometrics
Especially for researchers, here are some things you would be able to do after reading this post:
Get an overview of the important publications in your field of study
Generate a database of important researchers and institutes in your field
Visualize how your field is connected
Though there are many ways to do this, I found that using the Web of Science as the database, together with the bibliometric software VosViewer and CitNetExplorer, has the gentlest learning curve. The process is generally composed of the following steps:
Downloading the articles from the database
Generating the maps using the software
Formulating the Keywords
The first part is just the regular literature search on the Web of Science. Most scientists will already be knowledgeable in this area, having done literature searches in the past. Though the basic search would usually suffice, it is more efficient to learn how to use the advanced search with the Boolean operators.
For example, suppose you are researching entrepreneurship in the Netherlands. You want to search the terms entrepreneurship and Netherlands together. At the same time, you might want to include related words like business or industry and even the words Holland and Dutch. With these in mind, your keyword search could be:
TS = ((entrepreneurship OR business OR industry) AND (Netherlands OR Holland OR Dutch))
This yielded 3,381 results as of Nov 2016. A preliminary look at the results can then be done. At this point, you can decide to reformulate the keywords or stick with the results. The good thing is that you can easily change your keywords if the list of articles fails to reflect your intended outcome.
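If you find yourself reformulating the search often, the structure of the query (OR within a group of related words, AND between groups) is easy to generate. The helper below is a hypothetical convenience of my own and not part of the Web of Science; it only builds the string that you would paste into the advanced search.

```python
def build_topic_query(*groups):
    """Combine groups of related terms: OR within a group, AND between groups."""
    clauses = ["(" + " OR ".join(terms) + ")" for terms in groups]
    return "TS = (" + " AND ".join(clauses) + ")"

query = build_topic_query(
    ["entrepreneurship", "business", "industry"],
    ["Netherlands", "Holland", "Dutch"],
)
print(query)
# TS = ((entrepreneurship OR business OR industry) AND (Netherlands OR Holland OR Dutch))
```

Terms made of several words would need to be wrapped in quotation marks before being passed in.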
Downloading the articles
This part is the easiest yet most tedious. The problem with the Web of Science is that you can only download the data for 500 articles at a time. Thus, if you have 3,000 articles, you have to repeat the saving process 6 times.
On the results page, click the down arrow beside ‘Save to EndNote online’ and then click ‘Save to Other File Formats’.
Afterwards, save the first 500 records by typing 1 to 500 in the records field. For the record content, choose the option that includes the cited references. Finally, click Send.
You will then have a text file containing information about the first 500 records.
To save the next 500, click again the down arrow and save records 501-1000, 1001-1500 and so on.
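To avoid losing track of which ranges to type in, the batches can be listed ahead of time. A small sketch, where the function name is my own and the batch size of 500 is the limit described above:

```python
def export_batches(total_records, batch_size=500):
    """List the (first, last) record numbers to enter for each download."""
    return [
        (start, min(start + batch_size - 1, total_records))
        for start in range(1, total_records + 1, batch_size)
    ]

for first, last in export_batches(3381):
    print(f"{first} to {last}")
```

For the 3,381 results in the example search this gives seven batches, the last one running from 3001 to 3381.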
Using the Software
With the articles downloaded, it is now possible to analyze them with the software. Download CitNetExplorer. It’s just a matter of loading the text files into the program. It automatically generates a map of the most cited papers in your set of papers. The software is smart enough that even if an article does not contain the keywords you used, it can still be included if it is cited often by the papers in your set.
More importantly, it also shows the connection among these papers. Through this, one can infer how the field developed and how ideas have evolved over time. By being able to visualize how these papers are related to one another, doing literature review then becomes a little bit easier.