Python SSH Libraries
In Suzieq, we needed to select a Python library to fetch data from network devices (and servers) via SSH. This led us to a search for and evaluation of the five SSH libraries which I thought were the most suitable for the task at hand.
We had a fairly simple list of requirements for the SSH libraries:
- Good performance
- Scalable
- Python 3 support
- Support for the common authentication and connection models used in the majority of enterprises for connecting to network devices

Let's now examine the effects of these requirements.
Asyncio
Network operation lends itself well to asynchronous operation. In Python 3, especially starting with Python 3.5, asynchronous programming became easy. In subsequent releases, all the way up to Python 3.7, the readability and ease of use of the async mode only got better, and many new libraries sprang up that provided asynchronous versions of common use cases such as REST and SSH. There's even a Python async library for SNMP.
To those who may not know the difference, asynchronous mode enables concurrency, which is not the same as parallel execution. A simple model for understanding this difference is that concurrency is useful even with a single core, while parallel execution inherently relies on multiple cores. To explain this with more words, any task has two pieces in its execution: CPU and I/O. When a task is performing I/O, it typically waits a long time for the I/O to complete and fetch the results. Think of disk reads or network communication as examples of waiting for I/O. While the task is waiting, other tasks can be scheduled to execute. Any application that can take advantage of this waiting to work on other things is said to support concurrency. In the case of parallel execution, the task is broken up into multiple individual pieces and each piece executes simultaneously on multiple cores. An example of a task that lends itself to parallelizing is performing a common operation on multiple elements of a list.
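To make this distinction concrete, here is a minimal sketch using Python's built-in asyncio (the host names and delay are invented for illustration): ten simulated I/O waits of 0.1 seconds each finish in roughly 0.1 seconds total, because each task yields the single thread while it is "waiting on I/O".

```python
import asyncio
import time


async def fetch(host: str) -> str:
    # Simulate a network round-trip; await yields control so other
    # tasks can run while this one waits on "I/O".
    await asyncio.sleep(0.1)
    return f"{host}: ok"


async def main() -> float:
    start = time.monotonic()
    # Ten concurrent "fetches" on a single thread and a single core.
    results = await asyncio.gather(*(fetch(f"host{i}") for i in range(10)))
    elapsed = time.monotonic() - start
    print(f"{len(results)} hosts in {elapsed:.2f}s")  # ~0.1s, not ~1.0s
    return elapsed


elapsed = asyncio.run(main())
```

Run serially, the same ten waits would take about a second; the event loop overlaps them on one core, which is exactly the concurrency (not parallelism) described above.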
On the basis of this explanation, it’s hopefully clearer that concurrency is far more critical and natural with network I/O and benefits networking code independently of whether the task can be parallelized. For example, you can use a pool of processes to parallelize rendering of a template while at the same time using a pool of threads for pushing the template to different network devices.
Given that Suzieq will be polling many network devices, an async library is preferred over a non-async one to provide both good performance and scalability.
Authentication and Connection Method Support
Connecting to network devices in enterprises and most organizations (except hyperscalers, who have the ability to put together a good public key infrastructure to use certificate-based authentication) requires support for at least the following features:
- Ignoring host key files
- Supporting private key files instead of passwords, including private key files with a passphrase
- Support for communicating via a jump host
- Support for SSH config files
- Support for using ssh-agent

Any Python library we used in Suzieq needed to support these features.
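As a rough sketch of how these requirements map onto a real library's options — the keyword names below are asyncssh's, to the best of my knowledge, and the host names and file paths are invented for illustration:

```python
def connect_options(host: str) -> dict:
    """Hypothetical asyncssh connection options covering the list above."""
    return {
        "host": host,
        "username": "vagrant",
        "known_hosts": None,                # ignore host key files
        "client_keys": ["~/.ssh/id_rsa"],   # private key instead of a password
        "passphrase": "secret",             # key file protected by a passphrase
        "config": ["~/.ssh/config"],        # honor an ssh config file
        # ssh-agent is used by default via $SSH_AUTH_SOCK, so no
        # explicit option is needed for agent support.
    }


opts = connect_options("leaf01.example.com")
# The actual calls would look roughly like this (inside an async function),
# with the jump host handled by passing one connection as the tunnel:
#   jump = await asyncssh.connect("jumphost.example.com")
#   conn = await asyncssh.connect(tunnel=jump, **opts)
```

This is a sketch, not a definitive recipe; check the library's own documentation for the exact parameter names before relying on it.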
SSH and Network Devices
Networking devices are notorious for not behaving like proper shells when you use SSH to connect to them. Kirk Byers, the author of netmiko, wrote a nice post about why he created the Netmiko library. He explains some of these problems.
The primary problem has to do with how configuration commands work with network devices. Network device shells are contextual, especially so for configuration. For example, if you want to issue a command to assign an IP address to an interface, you need to first issue the command "configure", followed by "interface Ethernet 1/1" (assuming Ethernet 1/1 is the interface you want to assign the IP address to), followed by "ip address 192.168.1.1". Issuing three independent commands will not work, nor will issuing the three separated by ";", as network device shells are modal. So, it would be very helpful for an SSH library intending to support communicating with network devices to help with this.
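To illustrate, libraries that do support configuration wrap this modal sequence for you; netmiko, for instance, offers a helper that enters and exits configuration mode around a list of commands. A rough sketch, using the interface and address from the example above:

```python
# The modal sequence a human would type, in order, on one channel:
config_commands = [
    "interface Ethernet 1/1",
    "ip address 192.168.1.1",
]

# With netmiko, the driver wraps the list in configuration mode for
# you, roughly equivalent to sending "configure" first (not run here):
#   net_connect.send_config_set(config_commands)

# Sent as three independent SSH commands, each would start in a fresh,
# non-config context and fail; the full modal sequence the device
# needs on a single channel is:
full_sequence = ["configure"] + config_commands
```

The point is that the library, not the caller, has to keep track of which mode the device shell is in.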
Fortunately, Suzieq never uses any configuration command. All its commands are so-called "show" commands. Because of this, communicating with network devices for Suzieq is no different than communicating with any Linux server or Linux NOS such as Cumulus or SONIC (I don't know if SONIC's shell functions only as a traditional network device shell or if it can act like a Linux shell). As a consequence, we don't require the SSH library used within Suzieq to support context-driven configuration commands.
The first contender is Netmiko. Netmiko also comes with the ability to parse command output and return it as structured output via the popular command parsing library, textfsm. Netmiko is used in the popular NAPALM network device access library. Netmiko itself is based on paramiko, the next contender. Paramiko is what popular tools such as Ansible use. The third contender, ssh2-python, is a library that I ran into because it billed itself as being very fast and is the basis for a fast parallel SSH client library, parallel-ssh (remember the note about parallelism and concurrency above). The fourth contender is asyncssh, a full-featured asynchronous implementation of SSH. The final contender is a relative newcomer called scrapli. Scrapli is not itself an SSH library, but a wrapper around the paramiko, asyncssh and ssh2 SSH libraries. It provides both a synchronous and an asynchronous version of SSH connections to network devices. However, unlike asyncssh, which doesn't provide any additional support for network devices, scrapli tries to make it easy to connect to network devices and issue configuration commands. We'll be using the scrapli wrapper around asyncssh in the tests below. However, scrapli doesn't provide any support for connecting to anything Linux-y like Cumulus and SONIC(??) or servers; it also supports far fewer devices than Netmiko, at the time of this writing.
All five libraries satisfy all the SSH functionality desired by Suzieq. Here is a table to compare the different libraries and their features:
I spun up a number of topologies using Vagrant. The simulations had different NOS such as Arista's EOS, Cisco's NXOS, Cumulus Linux, and JunOS. If the number of hosts tested varies across the NOS, it's because of whatever simulation I had ready to spin up. This allowed me to verify that changing the NOS didn't affect the test results. I ran the benchmark test from my laptop, a Lenovo Yoga with an i7-8550U CPU, 16GB RAM and an SSD. The simulations, except for JunOS, ran on a different machine, an Intel NUC with an i7-8550U processor, 64GB RAM and an SSD. The network connectivity between my laptop and the NUC was wireless. The simulation using JunOS ran on my laptop as well because of the IP addressing of VirtualBox (the addressing is only exposed on the local machine by default). The JunOS VM is available on VirtualBox only, and I could not make it work on libvirt, unlike the other NOS. I used Python 3.7.5.
I verified that the overall timing values were not affected if I shifted the order in which the different libraries were run, i.e., I sometimes ran asyncssh first, netmiko second and so on, while at other times I ran paramiko first, ssh2 second and so on.
Netmiko has a lot more possible parameters to configure, so I had to ensure that I was doing an apples-to-apples comparison of the SSH performance of the various libraries. For example, I verified whether setting the NOS type explicitly in the connection parameters versus asking netmiko to autodetect it made a difference. I didn't test this option against every NOS, but against NXOS version 9.3.4, setting it to autodetect consistently performed better, so I left that parameter at autodetect in all the tests. Similarly, I set the use_textfsm parameter in command execution to False to ensure that the timings were not affected by any additional parsing that the library performs after the data is obtained.
I ran what I think is a simple, common command in each case: "show version" for the classical NOS, and "uname -a" for Cumulus and Linux servers. The code that I used for benchmarking is available via this GitHub gist.
Benchmarking with timeit
Python comes with a module, timeit, that's supported with the base Python distribution. I used it to get the execution times for each of the libraries. I measured a single-host execution time as well as a multi-host execution time. While it is possible to write more complex code to do thread management myself, I first chose to ignore this model and execute the commands in as simple a fashion as possible using each library. In benchmarking methodology, it's generally accepted practice to execute multiple runs of the command and average the execution time across all those runs. I used repeat counts of 3 and 10 to obtain the timings because of the time it takes to execute a command, though most benchmarking methodologies typically employ much larger numbers. I noticed, however, that the times were consistently slower with 10 repeats vs 3 repeats.
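The harness is essentially a thin wrapper around timeit.repeat; a minimal sketch of the pattern, with a stand-in function in place of a real SSH call, looks like this:

```python
import timeit


def run_command():
    # Stand-in for "connect and run 'show version'"; a real benchmark
    # would invoke one of the SSH libraries here instead.
    sum(range(1000))


# repeat=3 batches of number=1 execution each, mirroring the low
# repeat counts used above; timeit disables GC during timing by default.
timings = timeit.repeat(run_command, repeat=3, number=1)
avg = sum(timings) / len(timings)
print(f"avg of {len(timings)} runs: {avg:.6f}s")
```

timeit accepts a callable directly (as here) or a statement string, and returns one wall-clock measurement per repeat, which you can then average or take the minimum of.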
Here are some of the outputs of running the test:
```
$ python ssh_timeit.py nxos 10
Running single host timing for simulation: nxos
SINGLE HOST RUN (Avg of 10 runs)
------------------------------------------
asyncssh: 5.310472506971564
scrapli: 14.621411228028592
ssh2: 7.677067372016609
paramiko: 11.580183147045318
netmiko: 27.1105942610302

Running multi-host timing for simulation: nxos, 4 hosts
MULTI HOST RUN (Avg of 10 runs)
------------------------------------------
asyncssh: 9.257326865044888
scrapli: 17.880212849995587
ssh2: 32.365094934997614
paramiko: 39.12168894999195
netmiko: 91.02830006496515
```

For 10 hosts, I used only 3 repeats because 10 repeats seemed to stress my server too much. The numbers are with Cumulus, and so there are no scrapli values:
```
$ python ssh_timeit.py cumulus 3
Running single host timing for simulation: cumulus
SINGLE HOST RUN (Avg of 3 runs)
------------------------------------------
asyncssh: 0.6976866699988022
scrapli: -1
ssh2: 0.7593440869823098
paramiko: 0.9228836600086652
netmiko: 4.598790391988587

Running multi-host timing for simulation: cumulus, 10 hosts
MULTI HOST RUN (Avg of 3 runs)
------------------------------------------
asyncssh: 4.895733051002026
scrapli: -1
ssh2: 31.351762487960514
paramiko: 28.609976636013016
netmiko: 64.02421957900515
```

And one more with JunOS:
```
$ python ssh_timeit.py junos 10
Running single host timing for simulation: junos
SINGLE HOST RUN (Avg of 10 runs)
------------------------------------------
asyncssh: 1.6286625529755838
scrapli: 1.9315025190007873
ssh2: 1.5417965339729562
paramiko: 1.6812862670049071
netmiko: 12.91307156701805

Running multi-host timing for simulation: junos, 2 hosts
MULTI HOST RUN (Avg of 10 runs)
------------------------------------------
asyncssh: 1.9371595539851114
scrapli: 2.3284904189640656
ssh2: 3.1178728850209154
paramiko: 3.258504494035151
netmiko: 25.267352762049995
```

A few things jump out from these outputs:
- The single-host performance across the four libraries (asyncssh, scrapli, libssh2 and paramiko) is roughly equivalent. If I run the test enough times, I can make any one of them best the others, with the exception that scrapli never bested asyncssh.
- Netmiko is consistently the slowest.
- In the case of multi-host performance, asyncssh always beats the others by a fairly wide margin; it is at least 15x faster than the slowest.

The results are summarized in the table below (note the slower times with 10 repeats vs 3 repeats):
Concurrency with Synchronous IO
So, you may wonder, how do people deal with multiple hosts using libraries other than asyncssh? The answer is that programmers have to build their own version of concurrency by using either threads or processes. Ansible uses processes, if I remember correctly. Here is a link to a post that shows how such code might be written (I just randomly picked an entry from the search results). Python's asyncio library uses multi-threading by default, not multi-processing, though you can write code to adapt it to use multiprocessing. I don't want to do thread management if I can help it. The more I can rely on well-tested code, the more I can focus on my tool's value-add and on testing what's essential.
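For completeness, here is roughly what that hand-rolled concurrency looks like with a synchronous library, using the standard library's ThreadPoolExecutor (the fetch_version body is a stand-in for a blocking SSH call such as netmiko's send_command):

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_version(host: str) -> str:
    # Stand-in for a blocking SSH round-trip to one device.
    time.sleep(0.05)
    return f"{host}: version 1.0"


hosts = [f"host{i}" for i in range(10)]
start = time.monotonic()
# One worker thread per host; the blocking sleeps overlap, so ten
# 0.05s "calls" finish in roughly 0.05s rather than 0.5s serially.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fetch_version, hosts))
elapsed = time.monotonic() - start
```

This works, but the caller now owns the pool sizing, error handling and result collection that an async library handles for you.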
Code Readability
This brings me to another criterion that helped us decide which library we wanted to use in Suzieq: how simple and easy to read the code is. First up are the best examples in my opinion, asyncssh and netmiko. Here they are:
```python
async def async_ssh(host, port=22, user='vagrant', password='vagrant'):
    conn = await asyncssh.connect(host, port=port, username=user,
                                  password=password, client_keys=None,
                                  known_hosts=None)
    output = await conn.run(command)
    # print(f'{host}, {output.stdout.strip()}, {output.exit_status}')
    conn.close()


def netmiko_ssh(host, port=22, user='vagrant', password='vagrant'):
    dev_connect = {
        'device_type': 'autodetect',
        'host': host,
        'port': port,
        'username': user,
        'password': password
    }
    net_connect = ConnectHandler(**dev_connect)
    output = net_connect.send_command(command, use_textfsm=False)
    net_connect.disconnect()
    # print(output)
```

Both are fairly easy to follow and hide all sorts of low-level details from the code. asyncssh has the equivalent of netmiko's dev_connect variable model that you can use instead of passing the parameters in the call to connect as we've done. Paramiko is fairly equivalent, if not as terse as asyncssh. Scrapli follows netmiko's model except that it doesn't have the autodetect mode, and so its code looks like this:
```python
async def scrapli_ssh(host, port=22, user='vagrant', password='vagrant'):
    dev_connect = {
        "host": host,
        "auth_username": user,
        "auth_password": password,
        "port": port,
        "auth_strict_key": False,
        "transport": "asyncssh",
    }
    if use_sim == nxos_sim:
        driver = AsyncNXOSDriver
    elif use_sim == eos_sim:
        driver = AsyncEOSDriver
    elif use_sim == junos_sim:
        driver = AsyncJunosDriver
    async with driver(**dev_connect) as conn:
        # Platform drivers will auto-magically handle disabling paging for you
        output = await conn.send_command(command)
        # print(output)
```

Next up is ssh2-python (this is taken from ssh2-python's examples):
```python
def ssh2_ssh(host, port=22, user='vagrant', password='vagrant'):
    # Make socket, connect
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))

    # Initialise
    session = Session()
    session.handshake(sock)
    session.userauth_password(user, password)
    # Public key blob available as identities[0].blob

    # Channel initialise, exec and wait for end
    channel = session.open_session()
    channel.execute(command)
    channel.wait_eof()
    channel.close()
    channel.wait_closed()

    # Collect output
    output = b''
    size, data = channel.read()
    while size > 0:
        output += data
        size, data = channel.read()

    # Get exit status
    output = output.decode("utf-8").strip()
    # print(f'{host}, {output}, {channel.get_exit_status()}')
```

This is far less elegant than the first two, exposing sockets, sessions and channels. While one could argue that all this code could be tucked away in a routine and made to look as elegant as asyncssh or netmiko, I'm inherently lazy. I don't want to do work that I don't have to, unless there's a really strong motivation to do it.
The main takeaways from these results are:
- Async versions far outstrip their synchronous equivalents when it comes to performance
- Before the advent of scrapli, it was difficult for network operators to use asynchronous SSH libraries
- Network devices are far more difficult to work with than Linux-y NOS, because what they offer is not a programmable shell

While this is not a professional benchmarking article, I hope it helped readers appreciate what a serious difference asyncio makes in performance. Networking tools can get a good leg up in performance while staying simple by moving their libraries to the async versions.
Asyncssh is the Python SSH library used in Suzieq. It has successfully connected to Juniper MX, Juniper QFX, Cisco 9K, Cumulus, Arista and SONIC machines without a problem. We use textfsm internally on the gathered data if structured output is not available. Given our requirements, this was the best choice. Our choice was validated even further by how helpful the maintainer of asyncssh, Ron Frederick, is. I needed help figuring something out, and he sent me an excellent, detailed and thoughtful response. What more could a developer or user ask for?
Translated from: https://medium.com/the-elegant-network/a-tale-of-five-python-ssh-libraries-19cb8b72c914