Plot top 10 values in python

I have a set of data and I want to plot the top 10 values versus everything else. To elaborate, I want to sum all the values that are not in the top 10 and add that sum to, say, a pie chart as a single slice labeled "others", alongside the top 10. Currently I have the following code, where X is my dataframe:

temp = X.SOME_IDENTIFIER.value_counts()
temp.head(10).plot(kind='pie')

This gets me a pie chart of just the top ten, but I do not want to discard all the other values from the dataframe. I want to add them as an eleventh slice on the chart but am not sure how to do this. Any help or advice is appreciated.

asked Mar 23, 2015 at 19:47

Brant Mullinix


Assign the top ten counts to a new Series (temp2), then insert a new entry that sums the remaining items. The label of that entry also states how many unique items remain.

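temp = X.SOME_IDENTIFIER.value_counts()
# Copy the top ten so the new entry is not written into a slice of temp
temp2 = temp.head(10).copy()
if len(temp) > 10:
    # Label the extra slice with the number of remaining unique items
    temp2['remaining {0} items'.format(len(temp) - 10)] = temp.iloc[10:].sum()
temp2.plot(kind='pie')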
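
The same idea can be wrapped in a small helper and tried on made-up data. This is only a sketch: the function name top_n_with_others, the label "others", and the sample identifiers are assumptions for illustration and are not part of the original answer. It uses the top 3 instead of the top 10 to keep the output short.

```python
import pandas as pd


def top_n_with_others(counts, n=10, label="others"):
    """Keep the n largest entries of `counts` and sum the rest into one entry."""
    counts = counts.sort_values(ascending=False)
    top = counts.iloc[:n]
    rest = counts.iloc[n:]
    if rest.empty:
        # Nothing left over, so no extra slice is needed
        return top
    # pd.concat builds a new Series, so `counts` itself is never modified
    return pd.concat([top, pd.Series({label: rest.sum()})])


# Hypothetical data standing in for X.SOME_IDENTIFIER
X = pd.DataFrame({"SOME_IDENTIFIER": list("aaaaabbbbcccdde")})
result = top_n_with_others(X["SOME_IDENTIFIER"].value_counts(), n=3)
print(result)

# To draw the chart (requires matplotlib): result.plot(kind="pie")
```

Because the leftover counts are summed rather than dropped, the slices still add up to the total number of rows.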

answered Mar 23, 2015 at 20:15


Alexander


Using pandas:

import matplotlib.pyplot as plt

# Count occurrences of each value; value_counts() already returns
# a Series sorted in descending order
s_temp = X.SOME_IDENTIFIER.value_counts()

# Get the top ten values (copy so the new entry is not written into a slice)
s_top = s_temp.head(10).copy()

# Sum the values not in the top ten and append them to the Series
if len(s_temp) > 10:
    s_top['others'] = s_temp.iloc[10:].sum()

# Plot pie chart
_ = s_top.plot.pie()

# Show plot
plt.show()

answered Jul 15, 2020 at 22:25

Here's how I approached the problem:

import pandas as pd

counts = X.SOME_IDENTIFIER.value_counts()  # already sorted in descending order
temp = counts.head(10)
df = pd.DataFrame({'XX': temp.index, 'Y': temp.values})
other = pd.DataFrame({'XX': ['Other'], 'Y': [counts.iloc[10:].sum()]})
df = pd.concat([df, other], ignore_index=True)  # DataFrame.append was removed in pandas 2.0
df.set_index('XX').plot(kind='pie', y='Y')

Explanation: I stored the top 10 values in a dataframe, calculated the sum of the remaining values from the series, added that sum to the dataframe under the name Other, and plotted a pie chart of the resulting dataframe. This should give you the desired result.

answered Jul 8 at 12:07

Plot top 10 values in python

I need help plotting some categorical and numerical values in Python. The code is given below:

%%time
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

%%time
df = pd.read_csv('train_feature_store.csv')
df.info()
df.head()
df.columns

plt.figure(figsize=(20, 6))
sns.countplot(x='Store', data=df)
plt.show()

Size = df[['Size', 'Store']].groupby(['Store'], as_index=False).sum()
Size.sort_values(by=['Size'], ascending=False).head(10)

However, the dataset is so large that I cannot produce a meaningful plot from it. Basically, I just want to take the top 5 or top 10 values and plot them, as in the image below:

https://i.stack.imgur.com/pHcAI.png

To get there, I am trying to put the result of the code below into a dataframe and plot it, but I have not been able to. Can anyone help me with this?

Size = df[['Size', 'Store']].groupby(['Store'], as_index=False).sum()
Size.sort_values(by=['Size'], ascending=False).head(10)

Below is a link to a sample dataset. It is only a representative sample; the original dataset on which I am doing the EDA has around 3 thousand unique stores and 60 thousand rows of data. Please help! Thanks!

https://drive.google.com/file/d/1j77Xvl1mzUAPNZ53b89LzODSu1ZsbvEJ/view?usp=sharing
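
One possible way to get such a plot, sketched on made-up data: total Size per Store, keep only the largest totals with nlargest, and plot just those. The Store and Size column names come from the code in the question; the sample values and the helper name top_stores are assumptions for illustration.

```python
import pandas as pd


def top_stores(df, n=10):
    """Total Size per Store, limited to the n stores with the largest totals."""
    totals = df.groupby("Store")["Size"].sum()
    return totals.nlargest(n)


# Hypothetical sample; the real data would come from pd.read_csv(...)
df = pd.DataFrame({
    "Store": [1, 1, 2, 3, 3, 4],
    "Size": [100, 50, 300, 20, 30, 200],
})
top = top_stores(df, n=2)
print(top)

# To draw the chart (requires matplotlib): top.plot(kind="bar")
```

Since only n bars are drawn, the chart should stay readable no matter how many unique stores the full dataset contains.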


How do you find the top 10 values in Python?

How to Get Top 10 Highest or Lowest Values in Pandas:
Step 1: Create a sample DataFrame.
Step 2: Get the top 10 biggest/lowest values for a single column.
Step 3: Get the top 10 biggest/lowest values (duplicates).
Step 4: Get the top N values in multiple columns.
Step 5: How nsmallest and nlargest work.
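
A minimal sketch of the single-column and duplicates steps, assuming a made-up score column and using the top 3 instead of the top 10 to keep the data small. DataFrame.nlargest and DataFrame.nsmallest take the number of rows and the column to rank by; keep="all" retains every row tied at the cutoff.

```python
import pandas as pd

# Hypothetical scores; rows 1 and 3 tie at 40, rows 2 and 5 tie at 30
df = pd.DataFrame({"score": [10, 40, 30, 40, 20, 30]})

top3 = df.nlargest(3, "score")                  # exactly 3 rows, ties cut off
top3_all = df.nlargest(3, "score", keep="all")  # rows tied at the cutoff are kept
low2 = df.nsmallest(2, "score")                 # the 2 smallest scores

print(top3)
print(top3_all)
print(low2)
```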

How do you get top 5 values in Python?

Python's pandas module provides easy ways to aggregate data and calculate metrics. The top 5 maximum values for each group can also be found as part of a group-by operation. The function that helps here is nlargest().
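
As a rough illustration, nlargest() can be called on a grouped column. The group and value column names and the numbers below are made up, and the example keeps the top 2 per group rather than the top 5 so the output stays short.

```python
import pandas as pd

# Hypothetical data: two groups with three values each
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "value": [1, 5, 3, 9, 2, 7],
})

# For each group, keep the 2 largest values (use 5 for a top 5)
top_per_group = df.groupby("group")["value"].nlargest(2)
print(top_per_group)
```

The result is a Series indexed by group and original row label, with the values of each group in descending order.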

How do you plot data values in Python?

Data can also be plotted by calling the matplotlib plot function directly.
The command is plt.plot(x, y).
The color and format of markers can also be specified as an additional optional argument, e.g., b- is a blue line and g-- is a green dashed line.
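
A short sketch of those calls, with made-up x and y values; the variable names are assumptions for illustration.

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]

# plt.plot returns a list of Line2D objects, one per plotted line
solid, = plt.plot(x, [0, 1, 4, 9], "b-")    # blue solid line
dashed, = plt.plot(x, [0, 1, 2, 3], "g--")  # green dashed line

# plt.show() displays the figure in an interactive session
```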

How do you plot a counter in Python?

The seaborn countplot() method shows the counts of observations in each categorical bin using bars. It accepts the following parameters, among others: x, y (optional): names of variables in data, or vector data; these are the inputs for plotting long-form data.
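
When a column has many categories, one option is to pass countplot() an order list containing only the most frequent ones, so that only those bars are drawn. The sketch below assumes a made-up Store column; the helper names top_categories and plot_top_counts are hypothetical.

```python
import pandas as pd


def top_categories(df, column, n=10):
    """Return the n most frequent values of `column`, most frequent first."""
    return df[column].value_counts().index[:n].tolist()


def plot_top_counts(df, column, n=10):
    # seaborn is imported here so top_categories() works without it installed
    import seaborn as sns
    # Categories missing from `order` are left out of the plot
    return sns.countplot(x=column, data=df, order=top_categories(df, column, n))


# Hypothetical data: store 3 appears three times, store 1 twice, store 2 once
df = pd.DataFrame({"Store": [3, 1, 3, 2, 3, 1]})
print(top_categories(df, "Store", n=2))
```

Calling plot_top_counts(df, "Store", n=2) would then draw bars for the two most frequent stores only.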