[Coding Question] “Most Lucrative Products” – Meta
Businesses face new questions every day, and many of them can be answered with a good understanding of the data. One needs to master the art of coding to be able to torture the data long enough so that it confesses. In this blog post, I will solve a coding question that deals with data analysis using the powerful Python library pandas.
You can find the coding question here.
I solved this question using two approaches that follow the same logic but use different pandas functions. This is the first solution:
# work on a copy under a shorter name, so the new columns are not added to online_orders
df = online_orders.copy()
# compute the total revenue for each transaction
df['total_revenue'] = df['cost_in_dollars'] * df['units_sold']
# extract the number of month from the dates
df['month'] = df.date.dt.month
# limit the dataframe to the rows with dates between 2022-01-01 and 2022-06-30 (check the year as well as the month)
df = df[(df.date.dt.year == 2022) & (df.month <= 6)]
# extract the result
df.groupby('product_id').total_revenue.sum().reset_index().sort_values(by='total_revenue', ascending=False).head(5)
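One detail worth checking in this approach is the date filter. A condition on the month number alone keeps January to June rows from every year, which matters as soon as the table spans more than one year. Here is a minimal sketch of the difference. The sample rows are made up for illustration, and `Series.between` is my substitute for the month comparison, not part of the original solution:

```python
import pandas as pd

# Hypothetical sample data: the column names follow the solution above,
# the values are invented for illustration.
online_orders = pd.DataFrame({
    "product_id": [1, 1, 2, 3],
    "date": pd.to_datetime(["2022-01-15", "2021-03-10", "2022-06-30", "2022-07-01"]),
    "cost_in_dollars": [10, 10, 5, 8],
    "units_sold": [2, 100, 4, 1],
})

df = online_orders.copy()
df["total_revenue"] = df["cost_in_dollars"] * df["units_sold"]

# Month-only filter: the 2021-03-10 row slips through.
month_only = df[df["date"].dt.month <= 6]

# Explicit date range (both ends inclusive): only first-half-2022 rows remain.
in_range = df[df["date"].between("2022-01-01", "2022-06-30")]

revenue_month_only = month_only.groupby("product_id")["total_revenue"].sum()
revenue_in_range = in_range.groupby("product_id")["total_revenue"].sum()

print(revenue_month_only.to_dict())  # {1: 1020, 2: 20}
print(revenue_in_range.to_dict())    # {1: 20, 2: 20}
```

In this toy data, the single 2021 order inflates the revenue of product 1 from 20 to 1020 when only the month is checked.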
And this is the second solution:
# refer to the dataframe by a shorter name
df = online_orders
# query the rows that fall in the first half of 2022 (boundary dates included)
df = df.query("'2022-01-01' <= date <= '2022-06-30'")
# create a new column for the revenue of each transaction
df = df.assign(revenue = lambda x: x['cost_in_dollars'] * x['units_sold'])
# extract the final result
df.groupby(by = "product_id", as_index = False).revenue.sum().nlargest(5, 'revenue')
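The two solutions differ mainly in the last step: sorting and slicing versus `nlargest`. To convince yourself that both give the same top five, here is a small sketch on made-up data. The sample values and the variable names `top_sorted` and `top_nlargest` are my own, chosen for illustration:

```python
import pandas as pd

# Hypothetical sample data (values invented for illustration).
online_orders = pd.DataFrame({
    "product_id": [1, 2, 2, 3, 4, 5, 6],
    "date": pd.to_datetime([
        "2022-01-01", "2022-02-10", "2022-03-05", "2022-04-20",
        "2022-05-15", "2022-06-30", "2022-06-01",
    ]),
    "cost_in_dollars": [10, 20, 20, 5, 7, 3, 1],
    "units_sold": [3, 1, 2, 4, 5, 6, 2],
})

# Revenue per product, computed once and shared by both approaches.
revenue = (
    online_orders
    .assign(revenue=lambda x: x["cost_in_dollars"] * x["units_sold"])
    .groupby("product_id", as_index=False)["revenue"]
    .sum()
)

# Approach 1: sort in descending order, then keep the first five rows.
top_sorted = revenue.sort_values(by="revenue", ascending=False).head(5)

# Approach 2: let nlargest handle the ordering and the cut in one call.
top_nlargest = revenue.nlargest(5, "revenue")

print(top_sorted["product_id"].tolist())    # [2, 4, 1, 3, 5]
print(top_nlargest["product_id"].tolist())  # [2, 4, 1, 3, 5]
```

When several products tie on revenue at the cut-off, the two approaches may not pick the same rows. In that case `nlargest` accepts a `keep` argument (`'first'`, `'last'` or `'all'`) to control which tied rows are returned.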
You can find a complete explanation and walkthrough of the problem here: