[Coding Question] “Most Lucrative Products” – Meta


Photo from cottonbro studio on Pexels


Businesses face new questions every day, and many of them can be answered with a good understanding of the data. As the saying goes, if you torture the data long enough, it will confess, and coding is how you get it to talk. In this blog post, I solve a data analysis coding question using the powerful Python library pandas.

You can find the coding question here.

I solved this question using two approaches that follow the same logic but use different pandas functions. This is the first solution:

# work on a copy so the original dataframe is not modified
df = online_orders.copy()

# compute the total revenue for each transaction
df['total_revenue'] = df['cost_in_dollars'] * df['units_sold']

# extract the year and the month number from the dates
df['year'] = df.date.dt.year
df['month'] = df.date.dt.month

# limit the dataframe to the rows with dates between 2022-01-01 and 2022-06-30
df = df[(df.year == 2022) & (df.month <= 6)]

# extract the result
df.groupby('product_id').total_revenue.sum().reset_index().sort_values(by = 'total_revenue', ascending = False)[:5]
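
To try this outside the platform, a small hand-made `online_orders` table is enough. The sketch below assumes the column names used above (`product_id`, `date`, `cost_in_dollars`, `units_sold`); the rows are invented purely for illustration, and the helper name `top_products_by_month` is hypothetical. It filters on the year as well as the month, since a month filter alone would also match January to June of any other year.

```python
import pandas as pd

# Invented sample data; only the column names come from the solution above.
online_orders = pd.DataFrame({
    "product_id": [1, 1, 2, 2, 3, 3],
    "date": pd.to_datetime([
        "2022-01-01", "2022-06-30", "2022-03-15",
        "2022-07-01", "2021-02-10", "2022-05-05",
    ]),
    "cost_in_dollars": [10, 10, 20, 20, 5, 5],
    "units_sold": [2, 3, 1, 10, 100, 3],
})


def top_products_by_month(orders, year=2022, last_month=6, n=5):
    """Hypothetical helper: top-n products by revenue for months 1..last_month of `year`."""
    df = orders.copy()  # leave the caller's dataframe untouched
    df["total_revenue"] = df["cost_in_dollars"] * df["units_sold"]
    # keep rows from the requested year whose month is within the first `last_month` months
    in_period = (df["date"].dt.year == year) & (df["date"].dt.month <= last_month)
    return (
        df[in_period]
        .groupby("product_id")["total_revenue"]
        .sum()
        .reset_index()
        .sort_values(by="total_revenue", ascending=False)
        .head(n)
    )


result = top_products_by_month(online_orders)
print(result)
```

In this toy data, the July 2022 order for product 2 and the 2021 order for product 3 fall outside the period, so they do not contribute to the totals.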


And this is the second solution:

# use a shorter name for the dataframe
df = online_orders

# query the results to extract the first half of 2022 (boundary dates included)
df = df.query("'2022-01-01' <= date <= '2022-06-30'")

# create a new column for the revenue of each transaction
df = df.assign(revenue = lambda x: x['cost_in_dollars'] * x['units_sold'])

# extract the final result
df.groupby(by = "product_id", as_index = False).revenue.sum().nlargest(5, 'revenue')
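
Since the two approaches are meant to be equivalent, a quick sanity check is to run both on the same toy data and compare the outputs. The sketch below is only an illustration: the sample rows are invented, the variable names (`first`, `second`, `same`) are my own, and the second approach uses `Series.between` with an inclusive range in place of `query`.

```python
import pandas as pd

# Invented sample data; only the column names come from the solutions above.
online_orders = pd.DataFrame({
    "product_id": [1, 1, 2, 2, 3, 3],
    "date": pd.to_datetime([
        "2022-01-01", "2022-06-30", "2022-03-15",
        "2022-07-01", "2021-02-10", "2022-05-05",
    ]),
    "cost_in_dollars": [10, 10, 20, 20, 5, 5],
    "units_sold": [2, 3, 1, 10, 100, 3],
})

start, end = pd.Timestamp("2022-01-01"), pd.Timestamp("2022-06-30")

# Approach 1: boolean mask on year and month, then sort_values and slice
a = online_orders.copy()
a["revenue"] = a["cost_in_dollars"] * a["units_sold"]
a = a[(a["date"].dt.year == 2022) & (a["date"].dt.month <= 6)]
first = (
    a.groupby("product_id")["revenue"]
    .sum()
    .reset_index()
    .sort_values(by="revenue", ascending=False)[:5]
)

# Approach 2: inclusive date range, assign, then nlargest
b = online_orders[online_orders["date"].between(start, end)]
b = b.assign(revenue=lambda x: x["cost_in_dollars"] * x["units_sold"])
second = (
    b.groupby(by="product_id", as_index=False)["revenue"]
    .sum()
    .nlargest(5, "revenue")
)

# Compare the two results row by row, ignoring the original index labels
same = first.reset_index(drop=True).equals(second.reset_index(drop=True))
print(same)
```

One caveat: `nlargest` and `sort_values(...)[:5]` can order tied rows differently, so this comparison assumes no two products have the same revenue.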


You can find a complete explanation and walkthrough of the problem here: