Getting data from Fxempire + working with RapidAPI to get SimilarWeb data. Step-by-step tutorial.

Hi!
My name is Eugene, and I am a data analyst at BDCenter Digital.

In this article, I will tell you how I obtained data for one of our studies.
In short, I needed the web traffic figures for the last 6 months for the sites of financial brokers and cryptocurrency exchanges, so that I could compare the two groups.

You can see the result in the graph below.

Let me show you how to do this.

To follow along, we need Python 3.
We start by importing the libraries: requests, json, and pandas.

Getting Fxempire data

First, we need to find the data we want to get. For this, we go to the Fxempire website and find the section with all brokers.
We set the filters we need and watch how the URL changes. Then we open the developer panel in the browser and find the request that returns this data.

All we need is to copy the filter parameters from the URL in the browser into the URL of the request to the server. This gives us access to the data in JSON format.
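As a rough sketch of that substitution: the two addresses below are made up for illustration and are not the real Fxempire URLs, so copy the real ones from the address bar and from the developer panel. The helper name build_api_url is also my own.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical addresses for illustration only; copy the real ones
# from the address bar and from the browser's developer panel.
BROWSER_URL = "https://www.example.com/brokers/all?country=gb&page=1"
API_ENDPOINT = "https://www.example.com/api/brokers"


def build_api_url(browser_url: str, api_endpoint: str) -> str:
    """Copy the filter parameters from the page URL onto the API endpoint."""
    filters = parse_qsl(urlsplit(browser_url).query)
    return f"{api_endpoint}?{urlencode(filters)}"


api_url = build_api_url(BROWSER_URL, API_ENDPOINT)
print(api_url)
```

If the server uses different parameter names than the page does, the pairs would need to be renamed before encoding.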

Now, using the requests library, we can call this address and get the data, then convert it with the json library into a pandas DataFrame.
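A minimal sketch of this step could look like the code below. It uses response.json() and pandas.json_normalize, which may differ from the exact calls in my notebook, and the sample records are made up so the conversion can be tried without touching the site.

```python
import pandas as pd
import requests


def fetch_json(url: str, timeout: float = 30.0):
    """Download a URL and decode its JSON body."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx answers
    return response.json()


def records_to_frame(records: list) -> pd.DataFrame:
    """Flatten a list of JSON objects into one row per object."""
    return pd.json_normalize(records)


# Made-up sample standing in for the server response.
sample = [
    {"name": "Broker A", "websiteAddress": "https://a.example.com", "rating": {"total": 4.5}},
    {"name": "Broker B", "websiteAddress": "https://b.example.com", "rating": {"total": 3.9}},
]
brokers_raw = records_to_frame(sample)
print(brokers_raw.columns.tolist())
```

With the real data you would call fetch_json on the address found in the developer panel and pass the list of broker records to records_to_frame. Nested fields are flattened into dotted column names, which is one reason the real table ends up so wide.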

Result:

156 columns. Wow! But we only need three of them: ‘name’; ‘websiteAddress’, which I will rename to ‘link’; and a third column, ‘group’, which holds the name of the site group, ‘brokers’.
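Trimming the table down can be sketched like this. The rows and the extra minDeposit column are made up; only ‘name’, ‘websiteAddress’, ‘link’ and ‘group’ come from the real data.

```python
import pandas as pd

# Made-up rows; the real table has 156 columns.
brokers_raw = pd.DataFrame(
    {
        "name": ["Broker A", "Broker B"],
        "websiteAddress": ["https://a.example.com", "https://b.example.com"],
        "minDeposit": [100, 250],  # hypothetical extra column
    }
)

brokers = (
    brokers_raw[["name", "websiteAddress"]]
    .rename(columns={"websiteAddress": "link"})
    .assign(group="brokers")  # same label on every row
)
print(brokers)
```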

Getting Cryptoexchanges data

For this task, I will use the RapidAPI service, a marketplace for APIs.

It contains a variety of APIs, including popular ones, such as CoinGecko or CoinMarketCap. I used Coinpaprika.
It is easy:
1. Go to the API you need and choose what data you want to get
2. Choose a programming language. The code is generated automatically
3. See the response example

Result:

In total, there are more than 600 cryptocurrency exchanges in the table, but some of them are already inactive. Let’s filter them out.
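The filter itself is a one-liner. The sketch below assumes that each exchange record carries a boolean flag called active; check the response example on RapidAPI for the real field name.

```python
import pandas as pd

# Made-up records; the `active` field name is an assumption.
exchanges_raw = pd.DataFrame(
    {
        "name": ["Exchange A", "Exchange B", "Exchange C"],
        "active": [True, False, True],
    }
)

# Keep only the rows whose flag is True and renumber the index.
exchanges_active = exchanges_raw[exchanges_raw["active"]].reset_index(drop=True)
print(len(exchanges_raw), "->", len(exchanges_active))
```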

As with Fxempire, the final data frame in this section should have the same columns, so that we can combine the two data frames without problems.
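Here is a small sketch of why the matching columns matter. The source column website and the group label cryptoexchanges are hypothetical names chosen for the example.

```python
import pandas as pd

brokers = pd.DataFrame(
    {"name": ["Broker A"], "link": ["https://a.example.com"], "group": ["brokers"]}
)
# Made-up exchange rows; `website` is a hypothetical source column name.
exchanges_raw = pd.DataFrame(
    {"name": ["Exchange A"], "website": ["https://x.example.com"], "active": [True]}
)

exchanges = (
    exchanges_raw[["name", "website"]]
    .rename(columns={"website": "link"})
    .assign(group="cryptoexchanges")  # hypothetical group label
)

# Identical column names let the two tables stack cleanly.
sites = pd.concat([brokers, exchanges], ignore_index=True)
print(sites)
```

If the column names differed, pd.concat would still run, but it would create extra columns filled with NaN instead of stacking the rows.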

Getting SimilarWeb data

After appending the two data frames, the total number of sites became almost 600. Now I need to get web traffic data from SimilarWeb for each site.

For this, I look for SimilarWeb on RapidAPI. To get data for all the sites, I need to make roughly 600 requests, so the Basic plan, with its quota of only 100 requests, does not suit me. I get the Pro plan for $20.

The procedure is similar to the previous one.
1. Choose what we want to get
2. Choose a programming language. The code is generated automatically

Please note that I have highlighted the place in the code where we will change the site address.
I also want to note that the SimilarWeb API expects this format of links: http://www.example.com or https://www.example.com
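Links collected from different sources rarely arrive in one format, so a small helper can bring them to this shape. This is only a sketch: the name normalize_link is mine, and forcing the www. prefix onto every host is a simplifying assumption.

```python
from urllib.parse import urlsplit


def normalize_link(link: str, scheme: str = "https") -> str:
    """Return the link as scheme://www.domain, without path or query."""
    link = link.strip()
    if "://" not in link:
        link = f"{scheme}://{link}"  # bare domains get a default scheme
    parts = urlsplit(link)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = f"www.{host}"
    return f"{parts.scheme}://{host}"


print(normalize_link("example.com/markets?ref=1"))
print(normalize_link("http://www.example.com/"))
```

Sites that live on a subdomain would need a different rule, because this helper would put www. in front of the subdomain.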

For convenience, I wrote a function that accepts a link as input and returns a data frame with traffic data.
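Such a function might be sketched as follows. The URL, the host, the domain parameter and the shape of the response are all placeholders that I made up for the example; take the real values from the snippet that RapidAPI generates. Only the X-RapidAPI-Key and X-RapidAPI-Host header names are the standard RapidAPI ones.

```python
import pandas as pd
import requests

# Every constant below is a placeholder: copy the real URL, host and
# parameter name from the snippet that RapidAPI generates for you.
API_URL = "https://similarweb-api.example.com/traffic"  # hypothetical
API_HOST = "similarweb-api.example.com"  # hypothetical
API_KEY = "YOUR_RAPIDAPI_KEY"


def traffic_to_frame(link: str, payload: dict) -> pd.DataFrame:
    """Turn one response into rows of (link, month, visits).

    Assumes the payload looks like {"visits": {"2021-01": 1200, ...}};
    the real response may be shaped differently.
    """
    visits = payload.get("visits", {})
    frame = pd.DataFrame({"month": list(visits), "visits": list(visits.values())})
    frame.insert(0, "link", link)  # repeat the link on every row
    return frame


def get_traffic(link: str) -> pd.DataFrame:
    """Request traffic data for one site and return it as a data frame."""
    headers = {"X-RapidAPI-Key": API_KEY, "X-RapidAPI-Host": API_HOST}
    response = requests.get(
        API_URL, headers=headers, params={"domain": link}, timeout=30
    )
    response.raise_for_status()  # fail loudly on 4xx/5xx answers
    return traffic_to_frame(link, response.json())
```

Keeping the parsing in its own function makes it possible to check it against a saved response without spending requests from the quota.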

Then, using a simple loop, we pass each link to the function, get the data, and store it in a data frame.
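The loop can be sketched like this. The names collect_traffic and fake_fetch are mine, and fake_fetch only stands in for the real API function so that the loop can be tried offline. Skipping a failed site instead of stopping is a design choice I added for the sketch.

```python
import time

import pandas as pd


def collect_traffic(links, fetch, pause: float = 0.0) -> pd.DataFrame:
    """Call fetch(link) for every link and stack the results."""
    frames = []
    for link in links:
        try:
            frames.append(fetch(link))
        except Exception as error:  # keep going if one site fails
            print(f"skipped {link}: {error}")
        time.sleep(pause)  # optional delay between requests
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# Stand-in for the real API function, so the loop can be tried offline.
def fake_fetch(link: str) -> pd.DataFrame:
    if "broken" in link:
        raise ValueError("no data")
    return pd.DataFrame({"link": [link], "visits": [1000]})


traffic = collect_traffic(
    ["https://www.a.example.com", "https://www.broken.example.com"], fake_fetch
)
print(traffic)
```

With hundreds of paid requests at stake, it is worth saving the collected frame to disk right after the loop finishes.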

Result:

Visualization

Import the libraries and merge the data frames.

There may be empty rows in the data. I also found that some exchanges appear more than once in the list: the same exchange is listed under different names, while the link to the site is the same.
For example:

To fix this, I applied .dropna() to delete the rows with NaN values and .drop_duplicates() on the last column, which holds the traffic.
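On a toy table the cleanup looks like this. The rows are made up, and visits is a hypothetical name for the traffic column.

```python
import pandas as pd

# Made-up rows: one exchange listed under two names, one site without data.
data = pd.DataFrame(
    {
        "name": ["Exchange A", "Exchange A Pro", "Broker B", "Broker C"],
        "link": [
            "https://www.a.example.com",
            "https://www.a.example.com",
            "https://www.b.example.com",
            "https://www.c.example.com",
        ],
        "visits": [5000.0, 5000.0, None, 700.0],  # hypothetical traffic column
    }
)

clean = (
    data.dropna()  # drop rows that contain NaN
    .drop_duplicates(subset="visits")  # keep the first row per traffic value
    .reset_index(drop=True)
)
print(clean)
```

Deduplicating on the traffic value alone would also drop two different sites that happen to have identical numbers; passing subset=["link", "visits"] is a stricter alternative.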

Now we can build the graph.
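One possible way to draw such a comparison is sketched below. It is not necessarily the chart type I used: the column names, the numbers and the group labels are made up, and the plot is a plain line chart of monthly visits per group drawn through pandas, which needs matplotlib installed at the moment plot_traffic is called.

```python
import pandas as pd

# Made-up monthly traffic; column names are assumptions.
data = pd.DataFrame(
    {
        "group": ["brokers", "brokers", "cryptoexchanges", "cryptoexchanges"],
        "month": ["2021-01", "2021-02", "2021-01", "2021-02"],
        "visits": [1200, 1500, 3000, 2800],
    }
)


def traffic_by_group(frame: pd.DataFrame) -> pd.DataFrame:
    """Sum visits per month, one column per group."""
    return frame.pivot_table(
        index="month", columns="group", values="visits", aggfunc="sum"
    )


def plot_traffic(frame: pd.DataFrame):
    """Draw one line per group; needs matplotlib installed."""
    axes = traffic_by_group(frame).plot(kind="line", marker="o")
    axes.set_xlabel("Month")
    axes.set_ylabel("Visits")
    return axes


summary = traffic_by_group(data)
print(summary)
```

Calling plot_traffic(data) returns the matplotlib axes, so the title and the legend can be adjusted before the figure is saved.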

You saw the result of this code at the very beginning of this article.

Thanks for your attention.
If you have any questions, please contact me.

Data Analyst