Re: RFM analysis

Jim Lemon-4
Hi Hemant,
Let's take it one step at a time. Save this code as "qdrfm.R" in your
R working directory. It includes the comments I added last time and
fixes a bug in the recency scoring:

 date.format="%Y-%m-%d",weights=c(1,1,1),finish=NA) {

 # if no finish date is specified, use current date
 if(is.na(finish)) finish<-as.Date(date(), "%a %b %d %H:%M:%S %Y")
 cat("Range of purchase recency",range(x$rscore),"\n")
 cat("Range of purchase freqency",range(table(x[,1])),"\n")
 cat("Range of purchase amount",range(by(x[,2],x[,1],sum)),"\n")
 # initialize a data frame to hold the output
 # categorize the minimum number of days
 # since last purchase for each customer
 # categorize the number of purchases
 # recorded for each customer
 # categorize the amount purchased
 # by each customer
 # calculate the RFM score from the
 # optionally weighted average of the above
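
For readers who want something runnable, here is a minimal sketch of a function that follows the comments and cat() lines above. It is an illustration, not the original qdrfm code. The name qdrfm_sketch, the arguments rbreaks, fbreaks and mbreaks, the column layout (ID, amount, date), the use of tapply() and Sys.Date(), and the toy "purchases" data are all assumptions.

```r
# Hypothetical sketch, not the original qdrfm().
# Assumed columns: x[,1] customer ID, x[,2] amount, x[,3] purchase date.
qdrfm_sketch <- function(x, rbreaks = 3, fbreaks = 3, mbreaks = 3,
                         date.format = "%Y-%m-%d", weights = c(1, 1, 1),
                         finish = NA) {
  # if no finish date is specified, use the current date
  if (is.na(finish)) finish <- Sys.Date()
  # days from each purchase to the finish date
  x$rscore <- as.numeric(as.Date(finish) -
                         as.Date(as.character(x[, 3]), format = date.format))
  # per-customer summaries, all ordered by sorted customer ID
  recency <- tapply(x$rscore, x[, 1], min)
  frequency <- tapply(x$rscore, x[, 1], length)
  amount <- tapply(x[, 2], x[, 1], sum)
  cat("Range of purchase recency", range(x$rscore), "\n")
  cat("Range of purchase frequency", range(frequency), "\n")
  cat("Range of purchase amount", range(amount), "\n")
  # initialize a data frame to hold the output
  rfmout <- data.frame(custID = names(recency), stringsAsFactors = FALSE)
  # categorize days since last purchase (larger code = longer ago),
  # number of purchases, and total amount for each customer
  rfmout$rscore <- cut(as.numeric(recency), breaks = rbreaks, labels = FALSE)
  rfmout$fscore <- cut(as.numeric(frequency), breaks = fbreaks, labels = FALSE)
  rfmout$mscore <- cut(as.numeric(amount), breaks = mbreaks, labels = FALSE)
  # RFM score: optionally weighted average of the three codes
  rfmout$cscore <- round((weights[1] * rfmout$rscore +
                          weights[2] * rfmout$fscore +
                          weights[3] * rfmout$mscore) / sum(weights), 2)
  rfmout[order(rfmout$cscore), ]
}

# Made-up transactions, only to exercise the sketch
purchases <- data.frame(
  id = c("a", "a", "b", "c", "c", "c"),
  amount = c(10, 5, 50, 20, 20, 80),
  date = c("2017-05-01", "2017-07-31", "2017-06-01",
           "2017-08-01", "2017-08-11", "2017-08-21"),
  stringsAsFactors = FALSE)
res <- qdrfm_sketch(purchases, rbreaks = c(0, 30, 60, 150),
                    fbreaks = c(0, 1, 2, 5), mbreaks = c(0, 20, 60, 150),
                    finish = "2017-08-31")
print(res)
```

Note that in this sketch a larger rscore means a longer time since the last purchase. If you want recent buyers to score high, you would need to reverse that code before averaging.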

Now you can load the function into your workspace like this:

source("qdrfm.R")

Load your data:


Run the function with the defaults except for the finish date:

Range of purchase recency 31 122
Range of purchase freqency 1 4
Range of purchase amount 5.97 127.65

Your problem is now apparent. If I use the following breaks, I will
generate NA values in all three scores:


As I wrote before, the breaks _must_ cover the range of values if you
want a sensible analysis:
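
To see the effect in isolation, here is a small sketch that applies cut() directly to the ranges printed above. The first set of breaks is the one requested in the quoted message. The second set is hypothetical, chosen only so that it spans each observed range.

```r
# Minimum and maximum of each measure, taken from the printed ranges
recency   <- c(31, 122)
frequency <- c(1, 4)
amount    <- c(5.97, 127.65)

# Requested breaks: anything outside them becomes NA.
# cut() intervals are right-closed by default, so 1 is not in (1,2].
r_bad <- cut(recency,   breaks = c(10, 30, 50), labels = FALSE)
f_bad <- cut(frequency, breaks = c(1, 2, 3),    labels = FALSE)
m_bad <- cut(amount,    breaks = c(8, 14, 400), labels = FALSE)

# Hypothetical breaks that cover each observed range: no NA
r_ok <- cut(recency,   breaks = c(0, 50, 100, 150), labels = FALSE)
f_ok <- cut(frequency, breaks = c(0, 1, 2, 5),      labels = FALSE)
m_ok <- cut(amount,    breaks = c(0, 10, 50, 150),  labels = FALSE)

print(rbind(r_bad, f_bad, m_bad))
print(rbind(r_ok, f_ok, m_ok))
```

Starting the lowest break below the minimum (or passing include.lowest=TRUE) is what keeps a value such as a frequency of 1 from falling out of the first interval.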


Looking at df.rfm3, it seems that the recency score is the only one
discriminating users. This suggests to me that the data distributions
are causing a problem.  First, you have 946 users in a dataset of 1000
rows, meaning that almost all made only one transaction. Second, your
purchase amounts are concentrated in the 0-20 range. Therefore if I
change the breaks to reflect this, I get a much better separation of
users:
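
As a rough illustration of the idea, the sketch below uses made-up per-customer totals that are clustered under 20 with one large value, similar in shape to what is described here. The numbers and the breaks are assumptions for the example, not values from the actual dataset or the breaks used in the original reply.

```r
# Made-up per-customer totals and purchase counts (illustration only)
totals <- c(5.97, 7.5, 9.2, 11, 12.4, 14.8, 16, 18.5, 19.9, 127.65)
n_purchases <- c(1, 1, 1, 1, 1, 1, 1, 2, 1, 4)

# How many customers made 1, 2, ... purchases
print(table(n_purchases))

# Three equal-width classes: nearly everyone lands in the lowest one
equal_width <- cut(totals, breaks = 3, labels = FALSE)

# Hypothetical breaks that are finer where the data are dense
# but still reach past the maximum, so nothing becomes NA
dense_low <- cut(totals, breaks = c(0, 10, 15, 20, 130), labels = FALSE)

print(table(equal_width))
print(table(dense_low))
```

With breaks like these the low spenders are spread over several classes instead of sharing one. The frequency score stays weak as long as almost every customer has a single transaction.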


Maybe this will get you going.


On Wed, Oct 11, 2017 at 4:43 PM, Hemant Sain <[hidden email]> wrote:

> Also try to put finish date as 2017-08-31.
> and help me with the complete running r code.
> On 11 October 2017 at 10:36, Hemant Sain <[hidden email]> wrote:
>> Hey Jim,
>> i'm attaching you the actual dataset i'm working on and i want RFM breaks
>> as
>> r=(10,30,50), f=(1,2,3),m=(8,14,400).
