NSGA bidding tournament

Technology, 2022-08-04

Real-Time Bidding (RTB) has become a relevant paradigm in display advertising. It mimics stock exchanges and uses computer algorithms to buy and sell ads automatically in real time. Imagine that you have to participate in N ≫ 1 of these online ad auctions with a limited bidding budget. The task is to create a bidding strategy that wins enough of them for the placed ads to generate at least Nc clicks, while spending as little money as possible. In the following, we look at a possible solution to this problem.

1. Real-Time Bidding ecosystem

A brief description of the RTB ecosystem is given in the figure above. When a user visits an ad-supported site, each ad placement triggers an auction. Bid requests are sent via the ad exchange to the different bidding agents. Upon receiving a bid request, every bidding agent calculates a bid and sends it, together with an ad, to the ad exchange. Finally, the winner's ad is shown to the visitor along with the regular content of the website. The whole process should be completed within a fraction of a second. A more detailed introduction to RTB can be found in [1,2].

2. Problem description

Winning bid distribution of an auction. The probability of winning the auction by placing bid x is given by the area under ps(s) to the left of x.

For simplicity, we will only create a strategy for one particular ad (for example, white sneakers from a particular brand), but the approach can easily be generalized to multiple ads, each with its own budget and target. The ad exchange generates a large number of bid requests, which are processed by many bidding agents, each of which has the opportunity to make a bid. The user and publisher data contained in every bid request can be used to predict the probability distribution function of the winning bid price s and the probability that the user will click on the displayed ad. For every auction n ∈ {1, …, N} they will be denoted as:
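
A possible way to write these two quantities, assuming that a superscript (n) labels the auction (the superscript is a notational assumption, chosen to match the symbols ps(s) and pc used elsewhere in the text):

```latex
p_s^{(n)}(s) \quad \text{(probability density of the winning bid price } s \text{ in auction } n\text{)}, \qquad
p_c^{(n)} \quad \text{(probability that the user clicks on the ad shown in auction } n\text{)}
```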

For every auction n, we will place a bid price xn. The probability to win is then given by:
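
Following the figure caption, the win probability is the area under the winning bid density to the left of our bid. With an assumed per-auction superscript (n), it can be written as:

```latex
p_{W|x_n} = \int_0^{x_n} p_s^{(n)}(s)\,\mathrm{d}s
```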

The integral from 0 to xn covers all cases in which the winning bid price among all other participants is smaller than our bid price xn. Because of the probabilistic nature of our assumptions, we cannot guarantee which auctions we are going to win or whether a user will click on the displayed ad. To describe these random events we will use the following Bernoulli random variables:

where Cn describes the user ad click event (click: Cn=1, no click: Cn=0) and Wn|xn describes the event of winning the n-th auction by placing the bid price xn (win: Wn|xn=1, loss: Wn|xn=0). The probability of each of these events is given by:
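
Written out, with the superscript (n) as an assumed label for the auction, the success probabilities of the two Bernoulli variables would read:

```latex
P(C_n = 1) = p_c^{(n)}, \qquad
P\left(W_n|x_n = 1\right) = p_{W|x_n} = \int_0^{x_n} p_s^{(n)}(s)\,\mathrm{d}s
```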

The total number of user clicks on our ad obtained by placing the bids {xn | n = 1, 2, …, N} is given by:
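
A click in auction n can only be collected if that auction is won, so one natural way to write the total, using the symbol Υ that appears later in the text, is a sum over products of the two indicator variables:

```latex
\Upsilon = \sum_{n=1}^{N} C_n \cdot \left(W_n|x_n\right)
```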

This is a random variable, as well. For simplicity, we will look only at its expected value:
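
If the click event and the win event are assumed to be independent, the expected value is the sum of p_c^(n) · p_W|x_n over all auctions. The following sketch compares that sum with a small Monte Carlo simulation. The exponential winning-bid density, the parameter values, and all function names are illustrative assumptions, not part of the original article.

```python
import math
import random


def win_probability(bid, alpha):
    """P(win) for an assumed exponential winning-bid density with rate alpha."""
    return 1.0 - math.exp(-alpha * bid)


def expected_clicks(bids, click_probs, alpha):
    """Sum of click probability times win probability over all auctions."""
    return sum(pc * win_probability(x, alpha) for x, pc in zip(bids, click_probs))


def simulate_clicks(bids, click_probs, alpha, rng):
    """Draw one realisation of the total number of clicks."""
    clicks = 0
    for x, pc in zip(bids, click_probs):
        won = rng.expovariate(alpha) < x   # competing winning bid is below ours
        clicked = rng.random() < pc        # user clicks on the displayed ad
        clicks += int(won and clicked)
    return clicks


rng = random.Random(0)
alpha = 1.0                    # hypothetical rate of the winning-bid density
bids = [0.5] * 2000            # hypothetical bids x_n
click_probs = [0.1] * 2000     # hypothetical click probabilities

runs = [simulate_clicks(bids, click_probs, alpha, rng) for _ in range(200)]
mean_simulated = sum(runs) / len(runs)
mean_expected = expected_clicks(bids, click_probs, alpha)
print(mean_simulated, mean_expected)
```

The two printed numbers should agree up to sampling noise; individual runs scatter around the expected value, which is why the text restricts itself to the expectation.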

The amount of money spent on the auctions that we have won can be described by the following random variable:

As in the equation for the total number of click events, we will look only at the expected value of this variable:
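
The cost depends on the pricing rule of the auction, which the text does not spell out. As a sketch, assume a first-price rule in which we pay our own bid x_n whenever we win; M is a hypothetical symbol for the amount spent:

```latex
M = \sum_{n=1}^{N} x_n \cdot \left(W_n|x_n\right), \qquad
\mathbb{E}(M) = \sum_{n=1}^{N} x_n \, p_{W|x_n}
```

Under a second-price rule, the term x_n · p_{W|x_n} would instead be the integral of s · p_s^{(n)}(s) from 0 to x_n, since the winner then pays the competing price s.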

The problem of placing N bids x1, …, xN such that the expected number of user clicks is 𝔼(Υ) = Nc while the amount of money spent on winning bids is minimized can be solved with the method of Lagrange multipliers:

where f(x) has to be minimized under the condition that g(x) = 0.
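
In generic form, with f(x) taken to be the expected amount spent and g(x) the deviation of the expected clicks from the target (this assignment of f and g follows the problem statement; the exact layout of the original equation is an assumption), the method looks for stationary points of the Lagrangian:

```latex
\mathcal{L}(x_1,\dots,x_N,\lambda) = f(x) + \lambda\, g(x), \qquad
g(x) = \mathbb{E}(\Upsilon) - N_c, \\
\frac{\partial \mathcal{L}}{\partial x_n} = 0 \quad (n = 1,\dots,N), \qquad
\frac{\partial \mathcal{L}}{\partial \lambda} = g(x) = 0
```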

3. Solutions to the optimization problem

We will consider an analytically solvable case that can be used to check whether our numerical solution is implemented correctly. Then we will briefly describe some of the problems that arise when this approach is applied to real data: the large system of equations that has to be solved, and the approximation of the winning bid probability distribution from a finite number of observations. A numerical approach that addresses these two problems can be found in this GitHub repository.

3.1 Single click-through probability and winning bid distribution

We will assume that the winning bid distribution for every auction n can be parametrized by an exponential distribution:

It follows that the probability to win auction n if our bid is xn is given by:
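
For an exponential density with rate α_n (the symbol α is used later in the text; the per-auction index is an assumption), integrating from 0 to the bid gives the win probability in closed form:

```latex
p_s^{(n)}(s) = \alpha_n e^{-\alpha_n s}, \qquad
p_{W|x_n} = \int_0^{x_n} \alpha_n e^{-\alpha_n s}\,\mathrm{d}s = 1 - e^{-\alpha_n x_n}
```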

To make the problem analytically solvable, we assume that the winning bid probability distribution functions of the auctions 1, …, N and the corresponding user click-through probabilities are all the same:

By applying the method of the Lagrange multipliers, we obtain the optimal bid price xn and the expected amount of money spent to be:
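
With identical auctions every bid takes the same value x, and the click constraint N · pc · (1 − e^{−αx}) = Nc alone fixes it. The expected spend shown next additionally assumes a first-price rule (we pay x for each auction we win), which the text does not state:

```latex
x = -\frac{1}{\alpha}\ln\!\left(1 - \frac{N_c}{N p_c}\right), \qquad
\mathbb{E}(\text{spend}) = N x \left(1 - e^{-\alpha x}\right) = \frac{N_c}{p_c}\, x
```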

In real situations, we expect that N·pc ≫ Nc (i.e. we have to win only a small fraction of all auctions to reach the goal of Nc clicks), which allows us to expand the logarithm around 1:
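
As a numerical illustration, the sketch below evaluates the bid that satisfies the click constraint N · pc · (1 − e^{−αx}) = Nc for identical auctions and compares it with the first-order expansion ln(1 − u) ≈ −u, which gives x ≈ Nc / (α · N · pc). The function names and parameter values are hypothetical.

```python
import math


def optimal_bid(n_auctions, n_clicks, p_click, alpha):
    """Common bid x with N * p_c * (1 - exp(-alpha * x)) = N_c."""
    ratio = n_clicks / (n_auctions * p_click)
    if not 0.0 <= ratio < 1.0:
        raise ValueError("click target not reachable: need N_c < N * p_c")
    return -math.log1p(-ratio) / alpha


def optimal_bid_small_ratio(n_auctions, n_clicks, p_click, alpha):
    """First-order expansion, valid when N_c / (N * p_c) << 1."""
    return n_clicks / (n_auctions * p_click * alpha)


# Hypothetical values: one million auctions, 100 clicks wanted.
N, Nc, pc, alpha = 1_000_000, 100, 0.01, 2.0

x_exact = optimal_bid(N, Nc, pc, alpha)
x_approx = optimal_bid_small_ratio(N, Nc, pc, alpha)
clicks_at_x_exact = N * pc * (1.0 - math.exp(-alpha * x_exact))
print(x_exact, x_approx, clicks_at_x_exact)
```

Both bids are far below the mean winning price 1/α = 0.5 of the assumed distribution, in line with the low-bid behaviour discussed next.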

Since 1/α is the mean value of the exponential distribution and Nc/(N·pc) ≪ 1, it follows that x is very low, i.e. we participate in every auction with a very low bid price. We may speculate that a similar result is obtained for other probability distribution functions of the winning bid price, i.e. that only the left side of the distribution matters because that is where the optimal value is located. This also implies that we need a very precise description of p_{W|x} for small x, which in practice could be difficult to achieve.

3.2 Multiple click-through probabilities and winning bid distributions

The general case, where each auction is described by its own probability distribution function and the click-through probabilities can differ for each n, can be solved numerically using the Python SciPy library. This approach quickly becomes infeasible once N is on the order of 10³, which is far too small for more realistic cases with N > 10⁶. To make the problem manageable with SciPy, we will assume that the winning bid distribution of an auction can be described by one of I different possible probability distribution functions:
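
A minimal sketch of the direct approach for a deliberately tiny N, using `scipy.optimize.minimize` with the SLSQP method and an equality constraint. This is not the code from the article's repository: the exponential densities, the first-price cost (we pay our own bid when we win), and all parameter values are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
N = 20                                      # tiny on purpose (hypothetical)
alpha = rng.uniform(0.5, 2.0, size=N)       # hypothetical rate per auction
p_click = rng.uniform(0.05, 0.2, size=N)    # hypothetical click probabilities
target_clicks = 0.5                         # hypothetical N_c


def win_prob(x):
    """Win probability per auction for exponential winning-bid densities."""
    return 1.0 - np.exp(-alpha * x)


def expected_cost(x):
    """f(x): expected spend, assuming we pay our own bid when we win."""
    return np.sum(x * win_prob(x))


def click_constraint(x):
    """g(x): expected clicks minus the target, must be zero."""
    return np.sum(p_click * win_prob(x)) - target_clicks


result = minimize(
    expected_cost,
    x0=np.full(N, 0.5),
    method="SLSQP",
    bounds=[(0.0, None)] * N,
    constraints=[{"type": "eq", "fun": click_constraint}],
)
optimal_bids = result.x
print(result.success, click_constraint(optimal_bids))
```

Every auction contributes its own unknown here, which is why this formulation stops scaling long before N reaches realistic sizes.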

The same idea can be applied to the click-through probability which can only take J different values:

If we look closely at the solution to the optimization problem (8), we see that the optimal bid price is the same for all auctions with the same winning bid distribution i and the same click-through probability j. We will denote this optimal price by x̃_{ij}. With these considerations in mind, the functions f and g from the Lagrange optimization problem (8) can be rewritten as:
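
A sketch of the grouped form, in which each sum over auctions collapses into a sum over the I × J types weighted by the number of auctions N_{ij} of each type. The expression for g follows directly from the expected number of clicks; the expression for f assumes a first-price cost (we pay our own bid when we win), which the text does not state:

```latex
f(\tilde{x}) = \sum_{i=1}^{I}\sum_{j=1}^{J} N_{ij}\, \tilde{x}_{ij}\, p^{(i)}_{W|\tilde{x}_{ij}}, \qquad
g(\tilde{x}) = \sum_{i=1}^{I}\sum_{j=1}^{J} N_{ij}\, p_c^{(j)}\, p^{(i)}_{W|\tilde{x}_{ij}} - N_c, \qquad
p^{(i)}_{W|\tilde{x}_{ij}} = \int_0^{\tilde{x}_{ij}} p_s^{(i)}(s)\,\mathrm{d}s
```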

where N_{ij} is equal to the number of cases where the distribution of successful bids is of type i and the click-through probability is of type j. With this simplification, we can numerically solve problems where I·J <10³.

To demonstrate the applicability of this approach, we have considered the case where I=3 and J=2:
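
The grouped problem with I = 3 and J = 2 can be sketched without any solver library by working with the Lagrange conditions directly. The parameters of the article's example are not given in the text, so everything below is hypothetical: three exponential winning-bid densities, two click probabilities, made-up counts N_ij, and a first-price cost. Under these assumptions the stationarity condition for each type reads x + (e^{αx} − 1)/α = μ · pc with μ = −λ, and μ is then tuned by bisection until the click constraint holds.

```python
import math

alphas = [1.0, 2.0, 4.0]        # hypothetical rates of the I = 3 densities
click_probs = [0.01, 0.02]      # hypothetical J = 2 click probabilities
counts = [[100_000, 50_000],    # hypothetical N_ij (rows i, columns j)
          [200_000, 100_000],
          [100_000, 50_000]]
target_clicks = 500.0           # hypothetical N_c


def win_prob(x, alpha):
    return 1.0 - math.exp(-alpha * x)


def bisect(func, lo, hi, iterations=200):
    """Root of an increasing function on [lo, hi] by plain bisection."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def bid_for_multiplier(mu, alpha, p_click):
    """Solve x + (exp(alpha * x) - 1) / alpha = mu * p_click for x >= 0."""
    def stationarity(x):
        return x + math.expm1(alpha * x) / alpha - mu * p_click
    return bisect(stationarity, 0.0, mu * p_click)


def expected_clicks(mu):
    return sum(
        counts[i][j] * p * win_prob(bid_for_multiplier(mu, a, p), a)
        for i, a in enumerate(alphas)
        for j, p in enumerate(click_probs)
    )


mu = bisect(lambda m: expected_clicks(m) - target_clicks, 0.0, 1000.0)
bids = [[bid_for_multiplier(mu, a, p) for p in click_probs] for a in alphas]
for a, row in zip(alphas, bids):
    print(a, row)
```

Only one bid per (i, j) type is computed, regardless of how many auctions fall into that type, and within each distribution the type with the higher click probability receives the higher bid.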

The optimal solution is shown in the following figure:

Optimal bids for the case of three types of auctions (described by the winning bid distribution ps(s)) and two types of click-through probabilities pc. The N (subset) column gives the number of auctions that share the same winning bid distribution function and the same user click-through probability. The x (analytical pdf) column contains the optimal bid prices when using the analytical probability distribution function (pdf). The x (spline pdf) column contains the optimal bid prices when inferring the pdf from a sample of data points. The relative error of the optimal bid prices in both cases is at most 2%.

3.3 Obtain probability distribution functions from real data

Under realistic conditions, we have to infer the probability distribution of the winning bid from the events (prices of successful bids) in our data. We can count the number of events on a grid of x values and then use spline interpolation as an approximation of the distribution function. We have applied this idea to the previous example: instead of using the analytical form of the winning bid distribution, we sampled data points from it. From the table above you can see that the differences between the two solutions are minimal. Note, however, that the number of sampled data points per distribution is on the order of 10⁶; fewer sampled points inevitably lead to a less accurate spline approximation. We also have to keep in mind that the spline approximation generates a function h(x) whose second derivative d²h(x)/dx² is zero at the boundaries of the x grid (the natural spline boundary condition). This restriction can become problematic for probability distribution functions that do not go to 0 for x → 0. One such example is the exponential probability distribution function, where the second derivative at x = 0 is:
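
For the exponential density α e^{−αx}, differentiating twice and evaluating at the origin gives:

```latex
\left.\frac{\mathrm{d}^2}{\mathrm{d}x^2}\,\alpha e^{-\alpha x}\right|_{x=0} = \alpha^3 \neq 0
```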

Another problem is that with the spline approximation we cannot guarantee that the resulting function is non-negative.
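
The procedure and both caveats can be sketched with `scipy.interpolate.CubicSpline`. This is an illustrative reconstruction, not the article's code: the rate, the grid, and the sample size are assumptions. Samples are drawn from an exponential density, counted on a grid, and interpolated with a natural cubic spline; the win probability is then the integral of the spline up to the bid.

```python
import numpy as np
from scipy.interpolate import CubicSpline

rng = np.random.default_rng(0)
alpha = 1.0                                  # hypothetical rate
samples = rng.exponential(scale=1.0 / alpha, size=1_000_000)

# Count events on a grid of x values and normalise the counts to a density.
edges = np.linspace(0.0, 5.0, 51)
counts, _ = np.histogram(samples, bins=edges)
centers = 0.5 * (edges[:-1] + edges[1:])
density = counts / (samples.size * np.diff(edges))

# Natural cubic spline: the second derivative is forced to zero at both ends.
pdf_spline = CubicSpline(centers, density, bc_type="natural")


def win_prob_spline(bid):
    """Area under the spline pdf from 0 to the bid (extrapolated below the first grid point)."""
    return float(pdf_spline.integrate(0.0, bid))


bid = 1.0
win_prob_exact = 1.0 - np.exp(-alpha * bid)
win_prob_approx = win_prob_spline(bid)

# Caveat 1: spline curvature at the left edge is zero, unlike the true density.
curvature_at_left_edge = float(pdf_spline(centers[0], 2))

# Caveat 2: nothing prevents the spline from dipping below zero in sparse regions.
fine_grid = np.linspace(centers[0], centers[-1], 2001)
smallest_value = float(pdf_spline(fine_grid).min())
print(win_prob_exact, win_prob_approx, curvature_at_left_edge, smallest_value)
```

If negative values do appear, one option (a different technique from the one described here) is to interpolate the empirical cumulative distribution with a monotone interpolator such as `scipy.interpolate.PchipInterpolator`, which keeps the implied density non-negative.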

Summary

In this article, we have created a simple bidding strategy by assuming that we know the winning bid probability distribution function of each auction and the click-through probability for each advertising event. From the two examples we have considered, we have seen that the optimal solution requires precise knowledge of the left side of the winning bid probability distribution function.

Translated from: https://medium.com/ki-labs-engineering/an-ad-auction-bidding-strategy-cd8f95d77d50
