Power BI is an extremely flexible visualization tool. Last week I wrote about how to transform your data with Python, and this week I’ll go a little further with Python visuals.
Transforming with Python is excellent. You can create more effective and intricate calculations and load them directly from a Pandas data frame into your Power BI tables. But it does impose some limitations; the main one, for me, is when your code is executed.
When you create a transformation, you’re making changes to the whole dataset, which is great for many cases such as cleaning, arranging, adding pre-calculations, and so on.
[Figure: Basics of Python transformations in Power BI]
The problem arises when you have too many possibilities or outcomes. For example, if you want the user to see a number coming from different combinations of categories, you would need a result for each possibility, which can easily grow to absurd proportions.
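To put a rough number on that, here is a back-of-the-envelope sketch. The five parameters, the ten options for each, and the 160-day curve length are made-up figures for illustration, not values taken from this report.

```python
from math import prod

# Hypothetical setup: five parameters, ten candidate values each.
options_per_slicer = {"Population": 10, "I0": 10, "Beta": 10, "Gamma": 10, "t": 10}

# Every combination of selections would need its own precomputed result.
combinations = prod(options_per_slicer.values())

# Assuming each result is a 160-day curve, count the rows to store.
days_per_curve = 160
rows_needed = combinations * days_per_curve

print(f"{combinations:,} combinations, {rows_needed:,} precomputed rows")
```

Even with these modest assumptions, precomputing every outcome means millions of rows, which is why computing only the selected case is attractive.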
That’s where Python visuals come in handy; they let us perform the calculations on the fly, each time the report is filtered, so we can work with filtered data without having to replicate the same measure for every possible outcome.
In this article, I’ll go through the basics of creating a Python visual in Power BI using Pandas, Scipy, and Matplotlib.
[Figure: SIR Model Dashboard]
In this example, we’ll explore more about infectious diseases while implementing a relatively simple algorithm for modelling how they spread.
The model is called SIR, which stands for Susceptible, Infected, and Recovered.
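For reference, the model can be written as three coupled differential equations. The form below is read directly off the deriv function in the code that follows, with N the population, beta the contact rate, and gamma the recovery rate; treat it as a summary of that code rather than a statement about every SIR variant.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta S I}{N} \\
\frac{dI}{dt} &= \frac{\beta S I}{N} - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
```

The three right-hand sides sum to zero, so S + I + R stays equal to N for the whole simulation.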
I’ve based my implementation on the work from scipython.com; let’s have a look at the code.
    import numpy as np
    from scipy.integrate import odeint
    import matplotlib.pyplot as plt

    # N - Population
    # I0 - Initial number of infected people
    # R0 - Initial number of recovered people
    # S0 - Initial number of susceptible people
    # beta - Contact rate
    # gamma - Recovery rate (1 / days to recover)
    # t - Time in days (array with the days to be calculated)
    N = 1000
    I0, R0 = 1, 0
    S0 = N - I0 - R0
    beta, gamma = 0.2, 1./10
    t = np.linspace(0, 160, 160)

    # SIR model
    def deriv(y, t, N, beta, gamma):
        S, I, R = y
        dSdt = -beta * S * I / N
        dIdt = beta * S * I / N - gamma * I
        dRdt = gamma * I
        return dSdt, dIdt, dRdt

    # vector with starting conditions
    y0 = S0, I0, R0

    # integrate over time (t)
    ret = odeint(deriv, y0, t, args=(N, beta, gamma))
    S, I, R = ret.T

We have seven different variables (N, I0, R0, S0, beta, gamma, and t), the SIR equations, a vector with the starting conditions for the model, and a call to the odeint function from SciPy’s integrate module.
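Before wiring this into a report, it can be worth checking that the output behaves the way the model says it should. The sketch below reruns the same code with the same parameters and adds two checks of my own: the three groups should always add up to N, and infections should peak roughly where S drops to N * gamma / beta (the point at which dIdt changes sign). The variable names population_conserved, peak_index, and threshold are introduced here for illustration.

```python
import numpy as np
from scipy.integrate import odeint

# Same parameters as the example above.
N = 1000
I0, R0 = 1, 0
S0 = N - I0 - R0
beta, gamma = 0.2, 1.0 / 10
t = np.linspace(0, 160, 160)

def deriv(y, t, N, beta, gamma):
    S, I, R = y
    dSdt = -beta * S * I / N
    dIdt = beta * S * I / N - gamma * I
    dRdt = gamma * I
    return dSdt, dIdt, dRdt

S, I, R = odeint(deriv, (S0, I0, R0), t, args=(N, beta, gamma)).T

# Check 1: the three groups should always add up to the population.
population_conserved = np.allclose(S + I + R, N)

# Check 2: infections should peak roughly where S falls to N * gamma / beta.
peak_index = int(np.argmax(I))
threshold = N * gamma / beta

print("Population conserved:", population_conserved)
print(f"Peak of {I[peak_index]:.0f} infected around day {t[peak_index]:.0f}")
print(f"S at the peak: {S[peak_index]:.0f} (threshold: {threshold:.0f})")
```

Because the curve is sampled about once per day, S at the sampled peak will only be close to the threshold, not exactly equal to it.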
What matters most for integrating this into Power BI are the variables. Our objective is to replace those variables with references to Power BI tables.
We can now go to PBI and create a table for each variable we want to replace; each table should contain a column with the possible values for the variable. For illustration, I’m using a simple list, but you can generate these values systematically or use values from a dataset you already have.
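One way to generate the values systematically is a short script like the sketch below. The ranges and step sizes are arbitrary picks for illustration, and the table names (beta_table, gamma_table, population_table) are hypothetical. The assumption is that the script is run through Power BI's Python script data source, which should offer each pandas data frame defined in the script as a table to load.

```python
import numpy as np
import pandas as pd

# Contact rate: ten evenly spaced candidates from 0.05 to 0.50.
beta_table = pd.DataFrame({"Beta": np.round(np.linspace(0.05, 0.50, 10), 2)})

# Recovery rate: derived from a few candidate recovery times in days.
recovery_days = np.arange(5, 21, 5)  # 5, 10, 15, 20
gamma_table = pd.DataFrame({"Gamma": np.round(1.0 / recovery_days, 3)})

# Population: a handful of orders of magnitude.
population_table = pd.DataFrame({"Population": [100, 1_000, 10_000, 100_000]})

print(beta_table["Beta"].tolist())
print(gamma_table["Gamma"].tolist())
```

Keeping one column per table mirrors the setup described above, where each slicer is bound to a single variable.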
[Figure: New table for a variable]
Now we can create some slicers to select from those values.
[Figure: Slicers. Format > Selection controls > Single select]
That’s it. All the elements are in place. When we add a Python visual to our report and add the variables we created to it, we’ll receive a data frame with the values.
[Figure: Fields: Variables]
Let’s try our visualization in Jupyter before we bring it to PBI.
First, let’s build a data frame to simulate how we’ll receive this data. We can call it ‘dataset’, just like PBI does.
    import pandas

    dataset = pandas.DataFrame([{'Population': 10, 'I0': 1, 'Gamma': 0.06, 'Beta': 0.25, 't': 150}])
    dataset

[Figure: Mock data frame]
When we replace the variables with references to the data frame, we get this:
    import numpy as np
    from scipy.integrate import odeint
    import matplotlib.pyplot as plt

    N = dataset['Population'].values[0]
    I0, R0 = dataset['I0'].values[0], 0
    S0 = N - I0 - R0
    beta, gamma = dataset['Beta'].values[0], dataset['Gamma'].values[0]
    t = np.linspace(0, dataset['t'].values[0], dataset['t'].values[0])

    # SIR model
    def deriv(y, t, N, beta, gamma):
        S, I, R = y
        dSdt = -beta * S * I / N
        dIdt = beta * S * I / N - gamma * I
        dRdt = gamma * I
        return dSdt, dIdt, dRdt

    # vector with starting conditions
    y0 = S0, I0, R0

    # integrate over time (t)
    ret = odeint(deriv, y0, t, args=(N, beta, gamma))
    S, I, R = ret.T

Pretty cool. Now we can plot each simulated variable (S, I, and R) over time.
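The .values[0] lookups assume every column arrives with exactly one usable value of the right type. As a precaution, a small helper can make that assumption explicit. This is a sketch, not something Power BI requires: read_param is a hypothetical name, and the float-typed 't' column in the mock frame is there to imitate a field that arrives as a decimal, which np.linspace would not accept as its number of points.

```python
import pandas as pd

def read_param(dataset, column, cast=float):
    """Return the first non-null value of `column` as a plain Python type."""
    values = dataset[column].dropna().unique()
    if len(values) == 0:
        raise ValueError(f"No value available for '{column}'")
    return cast(values[0])

# Mock of the data frame the visual receives; 't' is a float on purpose.
dataset = pd.DataFrame([{'Population': 10, 'I0': 1, 'Gamma': 0.06, 'Beta': 0.25, 't': 150.0}])

N = read_param(dataset, 'Population', int)
I0 = read_param(dataset, 'I0', int)
beta = read_param(dataset, 'Beta')
gamma = read_param(dataset, 'Gamma')
days = read_param(dataset, 't', int)  # safe to use as the number of points

print(N, I0, beta, gamma, days)
```

With this in place, the time axis could be built as np.linspace(0, days, days), and a missing selection fails with a readable message instead of an index error.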
    # labels, figure, and axis
    labels = ['Susceptible', 'Infected', 'Recovered with immunity']
    fig, ax = plt.subplots(1, figsize=(16,8), facecolor="#5E5E5E")
    ax.set_facecolor("#5E5E5E")

    # plot
    plt.plot(t, S/1000, '#26D4F9', alpha=0.5, lw=2)
    plt.plot(t, I/1000, '#F0F926', alpha=0.8, lw=2)
    plt.plot(t, R/1000, '#26F988', alpha=0.8, lw=2)

    # legend
    legend = ax.legend(labels, ncol=3, fontsize = 12, bbox_to_anchor=(1, 1.05))
    legend.get_frame().set_alpha(0)
    for text in legend.get_texts():
        plt.setp(text, color = 'w')

    # axis labels, limits, and ticks
    ax.set_xlabel('Days', c='w', fontsize=14)
    ax.set_ylabel('1000s of People', c='w', fontsize=14)
    plt.xticks(c='w', fontsize=12)
    plt.yticks(c='w', fontsize=12)
    ax.set_ylim(0,)

    # grid and spines
    ax.grid(axis='y', c='grey', lw=1, ls='--', alpha=0.8)
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)

    plt.title("S.I.R. Model", c='w', fontsize=16, loc='left')
    plt.show()

[Figure: Python visualization]
Fantastic. Now let’s bring this to Power BI.
We already added the visual, so we just need to copy and paste this code and make sure our variable names are the same as the ones we used in the dataset references. That’s all it takes.
[Figure: Dashboard assembly]
Now we can choose the values for each variable with the slicers, and the chart will update dynamically, calculating and displaying the simulation on the fly.
[Figure: Final visualization, SIR model dashboard]
Great, we implemented an equation and designed a visualization for it with Python. Then we brought all that to Power BI and used our tables as the source of the variables.
But that’s just one way of taking advantage of Python visuals; besides the many helpful calculations and libraries, we can add a lot of freedom to our visualization when we design in Matplotlib.
From small, precise adjustments like drawing only one of the spines to completely different visualizations or combinations of charts, Matplotlib is a great tool to have in your belt.
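As a small sketch of the first kind of adjustment, this is one way to keep only the bottom spine. The plotted numbers are placeholders, and the Agg backend line is only there so the snippet runs without a display; inside a Power BI visual you would drop that line and end with plt.show() instead.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])  # placeholder data

# Hide every spine except the bottom one.
for side in ("top", "right", "left"):
    ax.spines[side].set_visible(False)

# List which spines are still drawn.
visible_spines = [side for side in ("top", "right", "left", "bottom")
                  if ax.spines[side].get_visible()]
print(visible_spines)
```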
    import numpy as np
    import matplotlib.pyplot as plt

    # sort values
    dataset = dataset.sort_values('cases_by_1000')

    # number of bars
    N = len(dataset)

    # theta and radii (x and y)
    theta = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
    radii = dataset['cases_by_1000'].values

    # width and colors
    width = np.pi / N
    colors = plt.cm.viridis(radii / 100.)

    fig = plt.figure(figsize=(16,16))

    # polar axis
    ax = plt.subplot(111, projection='polar')

    # bar chart
    ax.bar(theta, radii, width=width, color=colors)

    # remove spines
    ax.spines['polar'].set_visible(False)

    # ticks
    plt.yticks(fontsize=18)
    plt.xticks(theta, labels = dataset['State'].values, fontsize=20)

    plt.show()

[Figure: Brazil, 2020/09/05: Cases per 1,000 People by State. Data sources: COVID-19 Brazil API and Wikipedia]
As a last note, I would like to add that by using Python visuals, you may lose a lot of Power BI’s interactivity, such as tooltips and clicking values to highlight them in other charts. There were also a couple of Matplotlib features that I couldn’t get working in PBI, mostly around interactivity and animation.
Thanks for reading my article. I hope you enjoyed it.
Further Reading: SciPy odeint; Matplotlib polar bar; MAA: SIR Model.
Translated from: https://medium.com/swlh/going-further-with-python-visuals-in-power-bi-a46280a1dfd9