[Coding Question] “Most Lucrative Products” – Meta
Businesses face new questions every day, and many of them can be answered with a good understanding of the data. One needs to master the art of coding to be able to torture the data long enough so that it confesses. In this blog post, I will solve a coding question that deals with data analysis using the powerful Python library Pandas.
You can find the coding question here.
I solved this question using two approaches that are essentially the same but use different functions. This is the first solution:
# work on a copy so the original dataframe is not modified
df = online_orders.copy()
# compute the total revenue for each transaction
df['total_revenue'] = df['cost_in_dollars'] * df['units_sold']
# extract the year and the month number from the dates
df['year'] = df.date.dt.year
df['month'] = df.date.dt.month
# limit the dataframe to the rows with dates between 2022-01-01 and 2022-06-30
df = df[(df.year == 2022) & (df.month <= 6)]
# extract the result
df.groupby('product_id').total_revenue.sum().reset_index().sort_values(by='total_revenue', ascending=False)[:5]
And this is the second solution:
# use a shorter name for the dataframe
df = online_orders
# query the results to extract the first half of 2022 (both end dates included)
df = df.query("'2022-01-01' <= date <= '2022-06-30'")
# create a new column for the revenue of each transaction
df = df.assign(revenue=lambda x: x['cost_in_dollars'] * x['units_sold'])
# extract the final result
df.groupby(by="product_id", as_index=False).revenue.sum().nlargest(5, 'revenue')
You can find a complete explanation and walkthrough of the problem here: