PyTorch Metric Learning: What's New

    PyTorch Metric Learning has seen a lot of changes in the past few months. Here are the highlights.

    Distances, Reducers, and Regularizers

    Loss functions are now highly customizable with the introduction of distances, reducers, and regularizers.

    Distances

    Consider the TripletMarginLoss in its default form.

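    For example (a minimal sketch; the margin value here is illustrative):

        from pytorch_metric_learning import losses

        loss_func = losses.TripletMarginLoss(margin=0.1)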

    This loss function attempts to minimize:

        max(d(a, p) - d(a, n) + margin, 0)

    where a, p, and n are the anchor, positive, and negative embeddings, and “d” represents L2 distance. But what if we want to use a different distance metric like unnormalized L1, or signal-to-noise ratio? With the distances module, you can try out these ideas easily:
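    For instance (a sketch; the margin value and constructor arguments shown are illustrative):

        from pytorch_metric_learning import distances, losses

        # Unnormalized L1 distance:
        loss_func = losses.TripletMarginLoss(
            margin=0.1,
            distance=distances.LpDistance(p=1, normalize_embeddings=False),
        )

        # Signal-to-noise ratio:
        loss_func = losses.TripletMarginLoss(
            margin=0.1,
            distance=distances.SNRDistance(),
        )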

    You can also use similarity measures rather than distances, even though similarities are inversely related to distances:

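    For example, using the library's CosineSimilarity (a sketch):

        from pytorch_metric_learning import distances, losses

        loss_func = losses.TripletMarginLoss(
            margin=0.1,
            distance=distances.CosineSimilarity(),
        )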

    With a similarity measure, the TripletMarginLoss internally swaps the anchor-positive and anchor-negative terms:

        max(s(a, n) - s(a, p) + margin, 0)

    where “s” represents similarity.

    Reducers

    Losses are typically computed per element, pair, or triplet, and then are reduced to a single value by some operation, such as averaging. Many PyTorch loss functions accept a reduction parameter, which is usually “mean”, “sum”, or “none”.

    In PyTorch Metric Learning, the reducer parameter serves a similar purpose, but instead takes in an object that performs the reduction. Here is an example of a ThresholdReducer being passed into a loss function:

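    A sketch (the low/high values match the range described below; the margin is illustrative):

        from pytorch_metric_learning import losses, reducers

        reducer = reducers.ThresholdReducer(low=10, high=30)
        loss_func = losses.TripletMarginLoss(margin=0.1, reducer=reducer)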

    This ThresholdReducer will discard losses that fall outside of the range (10, 30), and then return the average of the remaining losses.

    Regularizers

    It’s common to add embedding or weight regularization terms to the core metric learning loss. Thus, every loss function has an optional embedding regularizer parameter:

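    For example (a sketch; pairing LpRegularizer with ContrastiveLoss is just one possible combination):

        from pytorch_metric_learning import losses, regularizers

        loss_func = losses.ContrastiveLoss(
            embedding_regularizer=regularizers.LpRegularizer()
        )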

    And classification losses have an optional weight regularizer parameter:

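    For example (a sketch; the class count and embedding size are placeholders):

        from pytorch_metric_learning import losses, regularizers

        loss_func = losses.ArcFaceLoss(
            num_classes=100,     # placeholder
            embedding_size=128,  # placeholder
            weight_regularizer=regularizers.RegularFaceRegularizer(),
        )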

    Flexible MoCo for Self-Supervised Learning

    Momentum Contrastive Learning (MoCo) is a state-of-the-art self-supervision algorithm.

    (Diagram of the MoCo procedure, from the original paper.)

    In a nutshell, it consists of the following steps:

    1. Initialize two convnets, Q and K, that have identical weights.
    2. At each iteration of training, set the weights of K to (m)*K + (1-m)*Q, where m is the momentum (see the sketch after this list).
    3. Retrieve a batch of images, X, and a randomly augmented version, X`.
    4. Pass X into Q and X` into K, and store K's output in a large queue.
    5. Apply the InfoNCE loss (a.k.a. NTXent), using [Q_out, K_out] as positive pairs and [Q_out, queue] as negative pairs.
    6. Backpropagate and update Q.
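    Here is a minimal sketch of the momentum update in step 2 (the function name and the m value are illustrative, not part of the library):

        import torch

        @torch.no_grad()
        def momentum_update(Q, K, m=0.999):
            # K's weights become an exponential moving average of Q's weights.
            for q_param, k_param in zip(Q.parameters(), K.parameters()):
                k_param.data = m * k_param.data + (1.0 - m) * q_param.data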

    This simple procedure works amazingly well for creating good feature extractors. You might be wondering if it’s possible to use a different loss function, distance metric, or reduction method. And what about mining hard negatives from the queue?

    With this library, it’s very easy to try these ideas by using CrossBatchMemory. First, initialize it with any tuple-based loss, and optionally supply a miner:

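    A sketch (the loss choice, miner, embedding size, and memory size are illustrative):

        from pytorch_metric_learning import losses, miners

        loss_fn = losses.CrossBatchMemory(
            loss=losses.NTXentLoss(temperature=0.1),
            embedding_size=128,
            memory_size=16384,
            miner=miners.MultiSimilarityMiner(epsilon=0.1),  # optional
        )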

    Create “labels” to indicate which elements are positive pairs, and specify which part of the batch to add to the queue:

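    A sketch, assuming Q_out and K_out hold the embeddings of X and X` from the two encoders, and that your library version supports the enqueue_mask argument (older releases used enqueue_idx instead):

        import torch

        batch_size = Q_out.size(0)

        # Row i of Q_out and row i of K_out get the same label,
        # which marks them as a positive pair.
        all_embeddings = torch.cat([Q_out, K_out], dim=0)
        labels = torch.arange(batch_size)
        labels = torch.cat([labels, labels])

        # Enqueue only K's outputs (the second half of the batch).
        enqueue_mask = torch.zeros(batch_size * 2, dtype=torch.bool)
        enqueue_mask[batch_size:] = True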

    Compute the loss and step the optimizer. CrossBatchMemory takes care of all the mining, loss computation, and bookkeeping for the queue:

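    Continuing the sketch above (optimizer is assumed to be your usual torch optimizer):

        optimizer.zero_grad()
        loss = loss_fn(all_embeddings, labels, enqueue_mask=enqueue_mask)
        loss.backward()
        optimizer.step()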

    To confirm that CrossBatchMemory works with MoCo, I wrote a notebook demonstrating that it achieves accuracy equivalent to the official implementation on CIFAR10 (using InfoNCE and no mining). You can run the notebook on Google Colab.

    AccuracyCalculator

    If you need to compute accuracy based on k-nearest-neighbors and k-means clustering, AccuracyCalculator is a convenient tool for that. By default, it computes 5 standard accuracy metrics when you pass in query and reference embeddings:

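    A sketch (the argument order follows the version current when this was written; newer releases of the library changed the get_accuracy signature):

        from pytorch_metric_learning.utils.accuracy_calculator import AccuracyCalculator

        calculator = AccuracyCalculator()

        # query / reference: (N, D) embedding tensors
        # query_labels / reference_labels: (N,) label tensors
        accuracies = calculator.get_accuracy(
            query, reference, query_labels, reference_labels,
            embeddings_come_from_same_source=False,
        )
        # The five default metrics are AMI, NMI, mean_average_precision_at_r,
        # precision_at_1, and r_precision.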

    Adding your own accuracy metrics is straightforward:

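    A sketch, assuming labels arrive as numpy arrays (as in the version current at the time of writing); the metric body itself is a toy placeholder:

        import numpy as np

        from pytorch_metric_learning.utils.accuracy_calculator import AccuracyCalculator

        class CustomCalculator(AccuracyCalculator):
            # Any method named calculate_<metric_name> is picked up
            # automatically as a new metric.
            def calculate_some_amazing_metric(self, query_labels, **kwargs):
                # Toy placeholder: the fraction of queries with label 0.
                return np.mean(query_labels == 0)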

    Now when you call “get_accuracy”, the returned dictionary will include “some_amazing_metric”. Check out the documentation for details on how this works.

    Distributed Wrappers

    To make losses and miners work in multiple processes, use the distributed wrappers:

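    For example (a sketch; the wrapped loss and miner are illustrative):

        from pytorch_metric_learning import losses, miners
        from pytorch_metric_learning.utils import distributed as pml_dist

        loss_fn = pml_dist.DistributedLossWrapper(loss=losses.ContrastiveLoss())
        miner = pml_dist.DistributedMinerWrapper(miner=miners.MultiSimilarityMiner())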

    Why are these wrappers necessary? Under the hood, metric losses and miners usually have to access all tuples in a batch. But if your program is running in separate processes, the loss/miner in each process doesn’t get to see the global batch, and thus will see only a fraction of all tuples. Using the distributed wrappers fixes this problem. (Thanks to John Giorgi, who figured out how to implement this in his project on contrastive unsupervised textual representations, DeCLUTR.)

    Examples on Google Colab + Documentation

    To see how this library works in actual training code, take a look at the example notebooks on Google Colab. There’s also a lot of documentation and an accompanying paper.

    As a final note, here’s a long and narrow view of this library’s contents:

    [Image: a tall overview graphic of the library’s contents]

    Hope you find it useful!

    Translated from: https://medium.com/@tkm45/pytorch-metric-learning-whats-new-15d6c71a644b
