Visualizing data: overlaying charts in Python
Published: 2019-05-11

Visualizing data is vital to analyzing data.  If you can’t see your data – and see it in multiple ways – you’ll have a hard time analyzing that data.  There are quite a few ways to visualize data and, thankfully, with pandas, matplotlib and/or seaborn, you can make some pretty powerful visualizations during analysis.

One of the things I like to do when I get a new dataset is plot data points against each other to see if anything jumps out at me. To do this, I like to overlay charts on top of each other to find any patterns in the data. With matplotlib, this is pretty easy to do, but working with dual axes can be a bit confusing at first.
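
As a rough illustration of the dual-axis idea, here is a minimal sketch using matplotlib's `twinx()`. The dates and numbers are invented purely for the example (they are not the sales data used later in this post), and the variable names are placeholders.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Made-up monthly data, for illustration only.
dates = pd.date_range("2014-10-01", periods=12, freq="MS")
quantity = np.array([80, 95, 120, 70, 65, 90, 110, 130, 100, 85, 75, 105])
revenue = quantity * 25.0  # pretend every item sells for 25

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()  # a second y-axis that shares ax1's x-axis

ax1.plot(dates, revenue, color="tab:blue")  # line, read against the left axis
ax2.bar(dates, quantity, width=20, alpha=0.2, color="gray")  # bars, read against the right axis

ax1.set_ylabel("Revenue")
ax2.set_ylabel("Items sold")
```

The part that tends to confuse at first is that `ax1` and `ax2` are two separate axes objects drawn in the same spot: each has its own y-scale, label and grid, so anything that belongs to the bars has to be set on `ax2`, not `ax1`.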

One chart I like to look at, for data that I know is related (such as sales revenue and number of widgets sold), is a dual overlay of revenue vs. quantity. Figure 1 below shows an example of one of my go-to approaches for visualizing this kind of data.

Figure 1: Visualizing data — Revenue vs Quantity chart overlay

In this chart, the Monthly Sales Revenue chart (blue line) is overlaid on the Number of Items Sold chart (multi-colored bar chart). This type of chart lets me quickly see whether there are any obvious patterns in revenue vs. number of items sold.

I’ve not found a quick and easy way to build the multi-colored bar chart without hacking the data and building each colored section manually, so if you know a better way than what I share below, let me know.
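
One candidate worth trying, offered as a sketch rather than a tested replacement: `bar` accepts a list of colors, one per bar, so the color can be derived from each row's fiscal year and everything drawn in a single call. The data below is synthetic, the names `fiscal_year` and `bar_colors` are made up for the example, and the fiscal year is assumed to run October through September.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in for the sales data: three fiscal years of monthly quantities.
dates = pd.date_range("2012-10-01", "2015-09-01", freq="MS")
sales = pd.DataFrame({"Quantity": np.arange(len(dates)) + 100}, index=dates)

# Assumed fiscal year: October - September, so Oct-Dec belong to the next year.
years = sales.index.year.to_numpy()
months = sales.index.month.to_numpy()
fiscal_year = years + (months >= 10)

# Alternate the color by fiscal year: even years orange, odd years gray.
bar_colors = np.where(fiscal_year % 2 == 0, "orange", "gray").tolist()

fig, ax = plt.subplots()
# A single bar() call; matplotlib applies one color from the list to each bar.
ax.bar(sales.index, sales.Quantity, width=20, alpha=0.2, color=bar_colors)
```

The trade-off is that the per-year dataframes disappear, so anything else that relied on them (per-year totals, for instance) would need a `groupby` on `fiscal_year` instead.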

An example

Here’s my code for building this chart.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # needed for Jupyter notebooks
    %matplotlib inline

    plt.rcParams['figure.figsize'] = (20, 10)  # set the figure size
    plt.style.use('fivethirtyeight')  # use the fivethirtyeight matplotlib theme

    sales = pd.read_csv('examples/sales.csv')  # read the data in
    sales.Date = pd.to_datetime(sales.Date)  # convert the date column to datetime
    sales.set_index('Date', inplace=True)  # set the index to the date column

    # now the hack for the multi-colored bar chart:
    # create fiscal year dataframes covering the timeframes you are looking for. In this case,
    # the fiscal year covered October - September.
    # --------------------------------------------------------------------------------
    # Note: This should be set up as a function, but for this small amount of data,
    # I just manually built each fiscal year. This is not very pythonic and would
    # suck to do if you have many years of data, but it isn't bad for a few years of data.
    # --------------------------------------------------------------------------------
    fy10_all = sales[(sales.index >= '2009-10-01') & (sales.index < '2010-10-01')]
    fy11_all = sales[(sales.index >= '2010-10-01') & (sales.index < '2011-10-01')]
    fy12_all = sales[(sales.index >= '2011-10-01') & (sales.index < '2012-10-01')]
    fy13_all = sales[(sales.index >= '2012-10-01') & (sales.index < '2013-10-01')]
    fy14_all = sales[(sales.index >= '2013-10-01') & (sales.index < '2014-10-01')]
    fy15_all = sales[(sales.index >= '2014-10-01') & (sales.index < '2015-10-01')]

    # Let's build our plot
    fig, ax1 = plt.subplots()
    ax2 = ax1.twinx()  # set up the 2nd axis
    ax1.plot(sales.Sales_Dollars)  # plot the revenue on axis #1

    # the next few lines plot the fiscal year data as bar plots and change the color for each.
    ax2.bar(fy10_all.index, fy10_all.Quantity, width=20, alpha=0.2, color='orange')
    ax2.bar(fy11_all.index, fy11_all.Quantity, width=20, alpha=0.2, color='gray')
    ax2.bar(fy12_all.index, fy12_all.Quantity, width=20, alpha=0.2, color='orange')
    ax2.bar(fy13_all.index, fy13_all.Quantity, width=20, alpha=0.2, color='gray')
    ax2.bar(fy14_all.index, fy14_all.Quantity, width=20, alpha=0.2, color='orange')
    ax2.bar(fy15_all.index, fy15_all.Quantity, width=20, alpha=0.2, color='gray')

    # turn off grid #2 (passed positionally; the old `b=` keyword no longer exists in recent matplotlib)
    ax2.grid(False)

    ax1.set_title('Monthly Sales Revenue vs Number of Items Sold Per Month')
    ax1.set_ylabel('Monthly Sales Revenue')
    ax2.set_ylabel('Number of Items Sold')

    # Set the x-axis labels to be more meaningful than just some random dates.
    labels = ['FY 2010', 'FY 2011', 'FY 2012', 'FY 2013', 'FY 2014', 'FY 2015']
    ax1.set_xticklabels(labels)
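
One caveat with the last line: `set_xticklabels` relabels whatever ticks matplotlib happened to choose, so the labels only line up if there are exactly six ticks in the right places. A more defensive variant, sketched below under my own assumptions, pins the tick positions first. Placing each tick at the start of its fiscal year (October 1) is an assumption made for the example, as are the names `tick_positions` and `tick_labels`, and the flat line is placeholder data.

```python
import pandas as pd
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

fiscal_years = range(2010, 2016)  # FY 2010 through FY 2015

# Assumption: FY N starts on October 1 of calendar year N - 1.
tick_positions = [pd.Timestamp(year=fy - 1, month=10, day=1) for fy in fiscal_years]
tick_labels = [f"FY {fy}" for fy in fiscal_years]

fig, ax1 = plt.subplots()

# Placeholder series covering the same date range, so the x-axis is a date axis.
months = pd.date_range("2009-10-01", "2015-09-01", freq="MS")
ax1.plot(months, [0] * len(months))

ax1.set_xticks(mdates.date2num(tick_positions))  # fix the tick locations first
ax1.set_xticklabels(tick_labels)  # then label them one-to-one
```

With the positions fixed, the number of labels always matches the number of ticks, whatever the figure size or date range.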
