
My 3-Year Journey with Data
Hi,
I am Chiranjit. I worked for Redpill as a data miner; I joined Redpill in Jan ’07. I got a wonderful opportunity to play with data and solve business problems. Of all the statistical solutions we delivered to clients, about 70% were scorecards, 25% were segmentation, and the remainder was KPI analysis.
Now we will discuss a few things about scorecard building. No doubt this is our bread and butter.
Before getting into the details of the process, let's take a quick look at what exactly we have done so far.
In Asia Pacific and the Middle East, we have delivered statistical solutions to a number of banks and telecom companies. Now, "bank" is a big word. Every bank has different lines of business (LOBs), and we have worked with most of them.
Take credit cards: how to acquire good customers, how to increase revenue from the existing base, how to manage losses, and how to stop attrition.
We have done the same for retail banking (savings accounts, current accounts, term deposits, loans and so on), and we have worked with SME and microfinance portfolios as well. We have also helped telecom companies with attrition management.
Let me take you to the vantage point now, so you get a bird's-eye view of the statistical scorecard that we sell to the client. All our analysis leads to this point: we find the best customer characteristics for the objective, estimate a weight for each category of each characteristic, and sum these weights to get a score or probability. That score is what drives the solution.
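To make that bird's-eye view concrete, here is a minimal sketch of how such a scorecard could be applied to one customer. The characteristics, categories, point values, and the `offset` and `scale` used to turn a score into a probability are all invented for illustration; they are not taken from any real client scorecard.

```python
import math

# Hypothetical scorecard: every characteristic maps each category to a weight (points).
SCORECARD = {
    "age_band": {"18-25": 10, "26-40": 25, "41+": 35},
    "salary_band": {"low": 5, "medium": 20, "high": 40},
    "months_on_book": {"<12": 0, "12-36": 15, "36+": 30},
}

def score_customer(customer):
    """Sum the points of the category the customer falls into, per characteristic."""
    return sum(SCORECARD[name][customer[name]] for name in SCORECARD)

def score_to_probability(score, offset=60.0, scale=15.0):
    """Map a score to a value between 0 and 1 with a logistic curve.

    offset and scale are assumed values; in practice they would come
    from the fitted model.
    """
    return 1.0 / (1.0 + math.exp(-(score - offset) / scale))

customer = {"age_band": "26-40", "salary_band": "high", "months_on_book": "12-36"}
total = score_customer(customer)  # 25 + 40 + 15 = 80
probability = score_to_probability(total)
print(total, round(probability, 3))
```

The point of the additive form is that a business user can read it: each category contributes a visible number of points, and the total ranks the customers.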
Now I want to do a little quiz before explaining what a scorecard is. Suppose you walk into an HDFC branch and ask the personal banker for your savings account balance: how much does that cost the bank? You go to an ATM and print a slip of your last 5 transactions: how much does that cost? On average, it costs a bank 12-15 rupees to call a customer about a personal loan or credit card.
So the point is that it is very important for banks to prioritize their actions. It is impossible for HDFC Bank, ICICI or Citibank to call all their salary account customers to sell personal loans. If they take 10,000 customers at random, which is their capacity, maybe they will get 100 customers taking a personal loan. But if they can find the 10,000 customers most likely to respond, they will get 500-1,000 takers. The cost is the same, but the number of takers, and the profit that comes with it, is 5-10 times higher. So the point is prioritization, and that is what a scorecard does.
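The arithmetic behind this can be written out as a small sketch. It uses the figures quoted above (10,000 calls, 100 takers from a random list, 500 takers at the low end for a scored list) and assumes 15 rupees per call, the upper end of the quoted range. The function name `campaign_economics` is just an illustrative helper.

```python
def campaign_economics(customers_called, takers, cost_per_call):
    """Return the cost and response figures for one calling campaign."""
    total_cost = customers_called * cost_per_call
    return {
        "response_rate": takers / customers_called,
        "total_cost": total_cost,
        "cost_per_taker": total_cost / takers,
    }

COST_PER_CALL = 15  # rupees; assumed, upper end of the 12-15 range
CAPACITY = 10_000   # customers the bank can call

random_list = campaign_economics(CAPACITY, takers=100, cost_per_call=COST_PER_CALL)
scored_list = campaign_economics(CAPACITY, takers=500, cost_per_call=COST_PER_CALL)

# Lift: how many times better the scored list responds than the random list.
lift = scored_list["response_rate"] / random_list["response_rate"]

print(random_list)
print(scored_list)
print(round(lift, 1))
```

Both campaigns cost the same 150,000 rupees, but the cost of acquiring one taker drops from 1,500 to 300 rupees once the list is prioritized.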
In a broader sense, there are two different aspects of this prioritization. The first is customer centricity: because you are using historical data, you know what the customer is likely to want. When a customer calls the phone banking service, the executive can see in the system the best product the bank can offer him or her.
The second benefit is that it makes the organization much more scientific. Standing in November, you can forecast the total loss that is going to hit you over the next year. Accordingly, you can stop hiring salespeople and start looking for collections staff.
But it gets confusing, because we are not talking about 10 thousand customers; a bank's customer base can be 5 million, 10 million, maybe more. How will you manage the whole thing? It is very difficult to channel everything. The loan portfolio of China’s 3rd biggest bank is larger than India’s GDP. So you can understand that it is tough to do this prioritization across the bank.
That’s why every bank divides its analytical projects into two parts: risk scores, where the objective is to decrease risk, and marketing scores, where the objective is to increase revenue. If you marry the two, you get stable revenue growth and manageable loss numbers. As a result, the bank makes a good profit, which means good salaries and bonuses for its employees and a good return for its shareholders: the cherished goal of every organization.
Now the question is what we can offer in risk and in marketing. In risk, the objective is to manage the loss. The products that we can offer in risk are:
Application scorecard: If you apply for a personal loan, a credit card or any other loan product, should I give you the product or not? There is a chance that you will not pay.
Behavior scorecard: The bank has good existing customers and is thinking of selling a loan to them, so whom should it offer the loan to and whom should it not?
Collection scorecard: The bank has customers who are not making the minimum payment on their credit cards. Some of them are good customers who simply forgot to pay, and some are genuinely bad. So the collection manager needs a strategy for whom to call.
Recovery scorecard: From the full list of defaulters, who is highly likely to pay?
Bankruptcy scorecard: Again, this helps you with the loss forecast.
For marketing, where the objective is to increase revenue, you can do:
Market Demand/Competition Analysis:
You can analyze whether there is any segment in the market that you have not touched so far. Is it possible to design a product for them?
Look-alike Model:
You have not yet launched any campaign for a particular product, but from the historical data you can find the profile of the customers who took this product in the past, so you can now be more focused.
Cross-sell/Propensity Model: You want to know the customers’ propensity to take your product.
Next best product algorithm: If the customer already has a product, what is the next product to sell to him?
Segmentation: You want to know how many different types of segments there are in your portfolio, so you can understand whether there is a need for a new product.
So that was a general idea of the scorecard and its usage. Now let me explain how to build a good scorecard.
There are 5 steps that you have to follow to build a successful scorecard.
First is business understanding. When I presented my first credit card cross-sell scorecard at Redpill, I was 25 years old, and the person I was presenting to, the client, had 25 years of experience in credit cards. So you can understand that if he has a problem, he will not come down to my level. I have to reach his level, understand the problem, and give the solution in his business language. He liked my scorecard, they implemented it, and the result was really good. But "good" is just an English word; you have to quantify it: good = how much incremental revenue? You have to show the client the benefit they are getting from us.
Second is data. If your data is wrong, the remaining 3 steps will just be a waste. It’s garbage in, garbage out.
Third is variable creation. Show your creativity and try to create variables that act as surrogates for the customers’ characteristics.
Fourth is scorecard building, where you calculate the propensity using a regression model. You have to check the model statistics to pick the best one.
Last is validation: you have to prove that your solution is time independent, that is, it keeps working on data from a different time period.
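As an illustration of the last two steps, here is a minimal sketch of checking one model statistic on the development sample and again on a sample from a later period. The KS statistic is my choice for the example; the scores and outcomes are invented, and the function assumes both outcomes (1 and 0) are present in the sample.

```python
def ks_statistic(scores, labels):
    """Largest gap between the cumulative shares of events (1) and non-events (0),
    walking down the customers from the highest score to the lowest."""
    total_events = sum(labels)
    total_non_events = len(labels) - total_events
    ordered = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cum_events = cum_non_events = 0
    best_gap = 0.0
    for i, (score, label) in enumerate(ordered):
        cum_events += label
        cum_non_events += 1 - label
        # Measure the gap only after all customers tied on this score are counted.
        last_of_tie = i == len(ordered) - 1 or ordered[i + 1][0] != score
        if last_of_tie:
            gap = abs(cum_events / total_events - cum_non_events / total_non_events)
            best_gap = max(best_gap, gap)
    return best_gap

# Hypothetical scores with actual outcomes (1 = taker, 0 = non-taker).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
development_labels = [1, 1, 0, 1, 0, 0, 0, 0]   # period the model was built on
out_of_time_labels = [1, 0, 1, 0, 1, 0, 0, 0]   # a later period, same scorecard

ks_development = ks_statistic(scores, development_labels)
ks_out_of_time = ks_statistic(scores, out_of_time_labels)
print(round(ks_development, 2), round(ks_out_of_time, 2))
```

If the statistic on the later period stays close to the development value, that supports the claim of time independence; a large drop suggests the scorecard was tuned to one period only.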
Let me discuss the points in detail. When the client gives you the problem, it will be in their language. Your business understanding translates it into a statistical problem, and you then use statistics to solve it.
For example:
• My credit card portfolio has an annual attrition rate of 20%. That’s it; the client wants a solution.
Now, because you know their business, you convert it to “Need to identify customers highly likely to attrite”, because they can act on that. Similarly:
• My SME portfolio has only 10% loan customers and 90% savings account customers
• Only 6% of my savings account base uses a debit card every month
• My personal loan delinquency has crossed 8%
• 12% of my personal loan customers are foreclosing their loans
So the corresponding statistical problems will be:
• Need to identify SME customers highly likely to take loans
• Need to identify customers with low propensity to use a debit card
• Need to identify customers highly likely to default
• Need to identify customers with higher propensity to prepay (indirect attrition)
After understanding the problem, you go to the data. Fix the observation period and the prediction period: you take the data about customer characteristics from the observation period, and you tag the dependent variable (good / bad or takers / non-takers) in the prediction period. We have an automated data extraction query written in SAS / SQL. After getting the data, we clean and validate it, and we run basic statistics to find outliers. Next comes scorecard building, where there are a few key statistics that you have to check. The final step is validation: you have to prove that your solutions are time independent.
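The split between observation period and prediction period can be sketched as follows. This is only an illustration in Python, not our SAS / SQL extraction query; the dates, customer IDs, transactions and the two derived variables are all hypothetical.

```python
from datetime import date

# Hypothetical window boundaries.
OBSERVATION_START = date(2009, 1, 1)
OBSERVATION_END = date(2009, 6, 30)
PREDICTION_END = date(2009, 12, 31)

# Hypothetical transactions: (customer_id, transaction_date, amount).
transactions = [
    ("C1", date(2009, 2, 10), 1200.0),
    ("C1", date(2009, 5, 3), 800.0),
    ("C1", date(2009, 8, 15), 500.0),  # prediction period: must not feed the predictors
    ("C2", date(2009, 3, 21), 300.0),
]

# Hypothetical product uptake: customer_id -> date the customer took the loan.
uptake = {"C1": date(2009, 9, 1)}

def build_modelling_row(customer_id):
    """Predictors come from the observation period, the tag from the prediction period."""
    observed = [
        amount
        for cid, txn_date, amount in transactions
        if cid == customer_id and OBSERVATION_START <= txn_date <= OBSERVATION_END
    ]
    taken_on = uptake.get(customer_id)
    is_taker = taken_on is not None and OBSERVATION_END < taken_on <= PREDICTION_END
    return {
        "customer_id": customer_id,
        "txn_count": len(observed),
        "total_spend": sum(observed),
        "taker": int(is_taker),  # 1 = taker, 0 = non-taker
    }

rows = [build_modelling_row(cid) for cid in ("C1", "C2")]
for row in rows:
    print(row)
```

Keeping the two windows strictly separate is what stops information from the future leaking into the customer characteristics.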
Analytics is not only about giving solutions or earning money; it is about becoming more and more logical and teaching your mind that there is a reason for everything.












