Simple Computing of the Customer Lifetime Value: A Fixed Local-Optimal Policy Approach (Foreign Literature Translation)


Simple Computing of the Customer Lifetime Value: A Fixed Local-Optimal Policy Approach

Julio B. Clempner & Alexander S. Poznyak

Abstract: In this paper, we present a new method for finding a fixed local-optimal policy for computing the customer lifetime value. The method is developed for a class of ergodic controllable finite Markov chains. We propose an approach based on a non-converging state-value function that fluctuates (increases and decreases) between states of the dynamic process. We prove that it is possible to represent that function in a recursive format using a one-step-ahead fixed local-optimal policy. Then, we provide an analytical formula for the numerical realization of the fixed local-optimal strategy. We also present a second approach, based on linear programming, to solve the same problem; it implements the c-variable method to make the problem computationally tractable. Finally, we show that the two approaches are related: after a finite number of iterations, our proposed approach converges to the same result as the linear programming method. In addition, we present a non-traditional approach for ergodicity verification. The validity of the proposed methods is demonstrated theoretically and by simulated credit-card marketing experiments that compute the customer lifetime value under both an optimization and a game-theory approach.

Keywords: customer lifetime value, optimization, optimal policy method, linear programming, ergodic controllable Markov chains

1. Introduction

The customer lifetime value (CLV) is defined as the discounted value of future marginal earnings based on the customer's activity (Gupta et al. 2004). Specifically, CLV is a valuation technique that estimates customer profitability and quantifies (ranks) the potential financial value of customers over their lifetime, provided a long-term relationship exists (Berger et al. 1998, Blattberg et al. 2001, Dwyer 1997, Rust 2000, Aravindakshan et al. 2004, Rust et al. 2004, Blattberg et al. 2009, Gupta 2009, Persson 2013). It predicts the monetary value of customers taking into account the customer segment, revenue, return on investment, and so on. As a result, CLV is a method for formulating and implementing customer-specific strategies that maximize lifetime profits and increase lifetime duration (Blattberg et al. 2009, Gupta 2009). In this paper, the computation of the CLV is based on the customer purchase dynamics represented by a Markov Decision Process (MDP).
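To make the discounted-earnings definition concrete, the following is a minimal sketch of a finite-horizon CLV computation under a plain Markov-chain model of customer states; the states, transition matrix, margins, and discount factor are illustrative assumptions, not values from the paper.

import numpy as np

# Illustrative customer states: 0 = active, 1 = lapsing, 2 = churned (absorbing).
P = np.array([[0.80, 0.15, 0.05],   # transition probabilities (assumed)
              [0.30, 0.50, 0.20],
              [0.00, 0.00, 1.00]])
r = np.array([100.0, 20.0, 0.0])    # expected margin per period in each state (assumed)
gamma = 0.9                         # discount factor (assumed)

def clv(p0, P, r, gamma, horizon=50):
    """Discounted value of future marginal earnings: sum_t gamma^t * (p0 P^t) r."""
    value, dist = 0.0, p0.astype(float)
    for t in range(horizon):
        value += (gamma ** t) * (dist @ r)   # expected discounted margin at period t
        dist = dist @ P                      # propagate the state distribution
    return value

print(clv(np.array([1.0, 0.0, 0.0]), P, r, gamma))   # CLV of a currently active customer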

An MDP is a robust model that allows controlling a Markov chain by finding the optimal policies, which determine the optimal action to take in the current state considering the reward (Poznyak et al. 2000). Linear programming (Vorobeychik et al. 2012) is an independent technique that can be used to simplify the dynamic programming method. However, for MDPs, practically the only available methods are based on linear programming.
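For background, the dynamic programming method mentioned above rests on the Bellman optimality equation; the notation below is the standard textbook one (reward r, discount factor \gamma, transition kernel P), not necessarily the paper's:

\[ V^{*}(s) = \max_{a \in A}\Big[ r(s,a) + \gamma \sum_{s' \in S} P(s' \mid s,a)\, V^{*}(s') \Big], \qquad \pi^{*}(s) \in \arg\max_{a \in A}\Big[ r(s,a) + \gamma \sum_{s' \in S} P(s' \mid s,a)\, V^{*}(s') \Big]. \]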

A fundamental topic in optimization problems is the reduction of the space of policies to optimize over (Howard 1960). The idea is to decrease the complexity of the search for optimal policies and to restrict attention to those policies that are easy to implement. In several optimization problems it is important to show that the optimization can be restricted to a class of stationary policies. This simplifies the search, because many computational methods are available in the stationary case and their implementation requires relatively little memory. Derman (1970) found a linear programming formulation for dynamic programming, investigated its dual program and, using a different linear program, obtained an optimal control policy. Afterwards, Hordijk and Kallenberg (1979) obtained an optimal policy directly from the dual program. However, understanding the stochastic meaning of the decision variables is important because it may enable us to obtain a linear program that solves MDPs directly, without using the dual program. In this sense, Altman and Shwartz (1991) obtained a linear program for solving MDPs by making ergodicity assumptions. Conditions that restrict the search for optimal policies of MDPs to stationary policies (or even to deterministic stationary policies) are an active area of research (Borkar 1983, Borkar 1986, Clempner and Poznyak 2011, Feinberg and Shwartz 2002, Sennott 1986, Sennott 1989, Vorobeychik et al. 2012).

In this paper, we present a new method for finding a fixed local-optimal policy (Clempner and Poznyak 2011, Clempner and Poznyak 2013) for computing the CLV, considering an ergodic controlled Markov chain with finite states and actions:

We first propose a non-converging state-value function that fluctuates (increases and decreases) between states of the dynamic process.

We prove that it is possible to represent that function in a recursive format using a one-step-ahead fixed local-optimal policy.

As a result, we provide an analytical formula for the numerical realization of the fixed local-optimal strategy (see the sketch after this list). It is important to note that the policies analyzed in this paper are related to the one-step-ahead optimization algorithms that are widely used in modern optimization theory.

We show that the behavior of the utility sequence corresponding to the local-optimal strategy has a non-monotonic character, which does not permit proving exactly the existence of a limit point.

In addition, the convergence of the optimal strategy is also obtained for a class of ergodic controllable finite Markov chains (Poznyak et al. 2000).

We present a non-traditional approach for ergodicity verification, which proves that the behavior of the proposed class of homogeneous Markov chains does not depend on its history.

We also describe a different method, based on a linear programming approach using the c-variable method, to solve the same problem.

An important aspect of introducing the c-variable method is that an infeasible solution can be detected by a simple test on the c-variables (a schematic linear program is given after this list). Moreover, the c-variable method makes the problem more tractable.

Finally, we show that these two approaches are related: after a finite number of iterations, the proposed method converges to the same result as the linear programming method.

We validate the proposed methods theoretically and by simulated credit-card marketing experiments, from both an optimization and a game-theory perspective.
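As referenced in the list above, the following is a minimal sketch of a one-step-ahead (greedy) policy over a controllable finite Markov chain, together with a simple sufficient test of ergodicity. All numbers and the helper names (one_step_ahead_policy, is_ergodic) are illustrative assumptions; the greedy lookahead is generic and not the paper's exact fixed local-optimal recursion.

import numpy as np

# Illustrative controllable chain (all values assumed):
# P[a, s, s'] is the transition matrix under action a; r[s, a] is the reward.
P = np.array([[[0.7, 0.3],
               [0.4, 0.6]],
              [[0.2, 0.8],
               [0.5, 0.5]]])
r = np.array([[5.0, 1.0],
              [0.0, 3.0]])

def one_step_ahead_policy(V):
    """Greedy lookahead: in each state pick the action that maximizes the
    immediate reward plus the expected value of the successor state."""
    q = r + np.einsum('ast,t->sa', P, V)   # Q[s, a] = r[s, a] + E[V(s') | s, a]
    return q.argmax(axis=1)

def is_ergodic(P_pi):
    """Sufficient ergodicity test for a finite chain: some power of the
    transition matrix is strictly positive (Wielandt bound (n - 1)^2 + 1)."""
    n = P_pi.shape[0]
    return bool(np.all(np.linalg.matrix_power(P_pi, (n - 1) ** 2 + 1) > 0))

pi = one_step_ahead_policy(np.zeros(2))   # myopic policy from a zero value guess
P_pi = P[pi, np.arange(2), :]             # chain induced by following pi
print(pi, is_ergodic(P_pi))

For the c-variable method, the usual reading of the variables c(s, a) is as stationary state-action frequencies, which yields a linear program of the following schematic form (a standard average-reward MDP formulation, offered as an assumption about what the hidden Section 5 details):

\[ \max_{c \ge 0} \sum_{s \in S} \sum_{a \in A} c(s,a)\, r(s,a) \quad \text{s.t.} \quad \sum_{a \in A} c(s',a) = \sum_{s \in S} \sum_{a \in A} c(s,a)\, P(s' \mid s,a) \;\; \forall s' \in S, \qquad \sum_{s,a} c(s,a) = 1. \]

A stationary policy is then recovered as \( \pi(a \mid s) = c(s,a) / \sum_{a'} c(s,a') \), and a candidate solution whose c-variables violate the balance or normalization constraints is immediately detected as infeasible.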

The remainder of this paper is organized as follows. The next section explains the motivation behind using Markov chains to compute the customer lifetime value and presents the credit-card marketing example. In Section 3 we present the precise background material needed in the sequel. Next, in Section 4, we introduce the development of the state-value function for finding the local-optimal policy. This includes a non-traditional approach for ergodicity verification, showing that the proposed class of homogeneous Markov chains "forgets" its initial state over a long period of time. Section 5 describes a linear programming method for solving the same problem and introduces the use of the c-variable method for determining infeasibility conditions and making the problem computationally tractable. In Section 6 we present our numerical example on credit-card promotion, for both the optimization and the game-theory approaches. Finally, in Section 7, we state our conclusions and outline future work.

2. Motivation

Markov decision processes provide a rich mathematical framework for studying problems related to many important probabilistic systems. The process consists of a set of states, actions, transition probabilities, and a utility function. MDPs are useful for modeling dynamic environments and anticipating the behavior of consumers and suppliers. The goal of an MDP is to maximize the probability of reaching a goal state within a given finite time. To achieve the goal, at each time step the system state provides all the information needed to choose an action.
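In symbols, the components just listed form the usual tuple; this is the standard definition rather than notation taken from the paper:

\[ \mathcal{M} = (S, A, P, u), \qquad P(s' \mid s,a) \ge 0, \quad \sum_{s' \in S} P(s' \mid s,a) = 1 \;\; \forall (s,a) \in S \times A, \]

where S is the finite set of states, A the finite set of actions, and \( u : S \times A \to \mathbb{R} \) the utility (reward) function.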

We focus on the class of ergodic controllable Markov decision processes, which have a unique stationary (i.e., limiting) distribution. The objective of a controllable Markov decision process is to construct a controller that achieves a given goal by choosing suitable successor states.
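As a quick illustration of the unique stationary distribution that ergodicity guarantees, the sketch below solves the balance equations pi P = pi, sum(pi) = 1 for an assumed transition matrix.

import numpy as np

# Assumed ergodic transition matrix, for illustration only.
P = np.array([[0.80, 0.15, 0.05],
              [0.30, 0.50, 0.20],
              [0.10, 0.30, 0.60]])

def stationary_distribution(P):
    """Solve pi P = pi with sum(pi) = 1 by replacing one (redundant)
    balance equation with the normalization constraint."""
    n = P.shape[0]
    A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

pi_inf = stationary_distribution(P)
print(pi_inf, pi_inf @ P)   # the two agree: the limit does not depend on the start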



Foreign literature source: Library Hi Tech

Attached below: the original foreign-language text

SIMPLE COMPUTING OF THE CUSTOMER LIFETIME VALUE: A FIXED LOCAL-OPTIMAL POLICY APPROACH

Abstract

In this paper, we present a new method for finding a fixed local-optimal policy for computing the customer lifetime value. The method is developed for a class of ergodic controllable finite Markov chains. We propose an approach based on a non-converging state-value function that fluctuates (increases and decreases) between states of the dynamic process. We prove that it is possible to represent that function in a recursive format using a one-step-ahead fixed-optimal policy. Then, we provide an analytical formula for the numerical realization of the fixed local-optimal strategy. We also present a second approach based on linear programming, to solve the same problem, that implements the c-variable method for making the problem computationally tractable. At the end, we show that these two approaches are related: after a finite number of iterations our proposed approach converges to the same result as the linear programming method. We also present a non-traditional approach for ergodicity verification. The validity of the proposed methods is successfully demonstrated theoretically and by simulated credit-card marketing experiments computing the customer lifetime value for both an optimization and a game-theory approach.

Keywords: Customer lifetime value, optimization, optimal policy method, linear programming, ergodic controllable Markov chains

1. Introduction

The customer lifetime value (CLV) is defined as the discounted value of future marginal earnings based on the customer's activity (Gupta et al. 2004). Specifically, CLV is a valuation technique that estimates the customer profitability and quantifies (ranks) the potential financial value of customers over their lifetime, only if a long-term relationship exists (Berger et al. 1998, Blattberg et al. 2001, Dwyer 1997, Rust 2000, Aravindakshan et al. 2004, Rust et al. 2004, Blattberg et al. 2009, Gupta 2009, Persson 2013). It predicts monetary values of the customers taking into account the segment of the customer, the revenue, the returns on investments, etc. As a result, the CLV is a method for formulating and implementing customer-specific strategies for maximizing their lifetime profits and increasing their lifetime duration (Blattberg et al. 2009, Gupta 2009). In this paper, the computation of the CLV is based on the customer purchase dynamic behavior represented by a Markov Decision Process (MDP).

An MDP is a robust model that allows controlling a Markov chain by finding the optimal policies that determine the optimal action to take in the current state considering the reward (Poznyak et al. 2000). Linear Programming (Vorobeychik et al. 2012) is an independent technique that can be used to simplify the Dynamic Programming method. However, for MDPs, practically the only available methods are based on Linear Programming.

A fundamental topic in optimization problems is the reduction of the space of policies to optimize (Howard 1960). The idea is to decrease the complexity of the search for optimal policies and restrict the attention to those policies that are easy to implement. In several optimization problems it is important to show that it is possible to restrict the optimization to a class of stationary policies. This simplifies the search because many computational methods are available in the stationary case and the implementation requires relatively little memory. Derman (1970) found a Linear Program formulation for dynamic programming and investigated its dual program and, using a different Linear Program, he obtained an optimal control policy. Afterward, Hordijk and Kallenberg (1979) obtained an optimal policy directly from the dual program. However, the importance of understanding the stochastic meaning of the decision variables is that it may enable us to obtain a Linear Program for solving MDPs directly, without using the dual program. In this sense, Altman and Shwartz (1991) obtained a Linear Program for solving MDPs by making ergodic assumptions. Conditions that restrict the search for optimal policies for MDPs to stationary policies (or even to deterministic stationary policies) are an area of research interest (Borkar 1983, Borkar 1986, Clempner and Poznyak 2011, Feinberg and Shwartz 2002, Sennott 1986, Sennott 1989, Vorobeychik et al. 2012).

In this paper we present a new method for finding a fixed local-optimal policy (Clempner and Poznyak 2011, Clempner and Poznyak 2013) for computing the CLV considering an ergodic controlled Markov chain with finite states and actions:

We first propose a non-converging state-value function that fluctuates (increases and decreases) between states of the dynamical process.

We prove that it is possible to represent that function in a recursive format using a one-step-ahead fixed local-optimal policy.

As a result, we provide an analytical formula for the numerical realization of the fixed local-optimal strategy. It is important to note that the policies analyzed in this paper are related to the one-step-ahead optimization algorithms that are widely used in modern optimization theory.

We show that the behavior of the utility sequence corresponding to the local-optimal strategy has a non-monotonic character that does not permit proving exactly the existence of a limit point.

In addition, the convergence of the optimal strategy is also obtained for a class of ergodic controllable finite Markov chains (Poznyak et al. 2000).

We present a non-traditional approach for ergodicity verification, which proves that the behavior of the proposed class of homogeneous Markov chains does not depend on its history.

We also describe a different method based on a Linear Programming approach using the c-variable method to solve the same problem.
