BayHouse

Credit scoring is NOT simple math: the formulas are so complex, NOBODY can verify the accuracy of credit scores

BayHouse Credit Forum: Fair Isaac FICO and NextGen Credit Scoring

Christine Baker (Admin)

Wednesday, March 07, 2001 - 05:22 pm
Shylock previously posted:

"From the Fair Isaac website there is a list of things that control your credit score and a breakdown of how much weight each factor has.

Payment History: approximately 35% of your score.
Amounts Owed: approximately 30% of your score.
Length of Credit History: Approximately 15% of your score.
New Credit: approximately 10% of your score.
Types of Credit in Use: approximately 10% of your score.

So you see, it's simple math. If a person isn't scoring 690 they have a problem either on their payment record (derogatories) or they owe too much."

What Shylock posted is NOT math at all. Fair Isaac published some approximate percentages.

Fair Isaac also states that they use extremely complex algorithms.

I can't say whether Fair Isaac's scoring software is poorly designed from the ground up, or whether the software has a bunch of bugs. But it's obvious that scores are NOT consistent.

See FICO is FRAUD

I sure don't appreciate that we pay MORE and are DECLINED insurance and credit because of Fair Isaac's buggy software. We have NO way of proving that our scores are wrong.

Erik (Erik)

Wednesday, March 07, 2001 - 08:17 pm
We have no way of knowing how simple or complex the formula is without comparing many reports w/ scores. Fair, Isaac might be just saying they use extremely complex algorithms to impress people and to discourage anyone from trying to figure it out. The fact that they have developed a new score tells me that they know the old scores have serious problems (perhaps because it's too simple a formula?).

I personally think they have done a better job marketing their risk scores than actually creating a useful score. Where are the studies that show they are accurate? Are the studies a trade secret too?

Zeddicus Zuul Zorander (Zeddicuszuul)

Thursday, March 08, 2001 - 02:37 am
Based on some of the information that Fico has publicly released, I think you could make an educated guess that the actual formula itself isn't any more complicated than what you'd find in a college algebra class. Think (factor) times (variable), plus (factor) times (variable), etc., with a few "if" statements thrown in for good measure (e.g. "if recent bankruptcy, use this formula; if thin file, use this formula....")
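To make the shape of that guess concrete, here is a minimal sketch of a segmented linear formula: a few "if" statements pick a sub-formula, and each sub-formula is just (factor) times (variable) added up. Every name, threshold, and number below is invented for illustration; none of it comes from Fair Isaac.

```python
from dataclasses import dataclass


@dataclass
class Scorecard:
    base: float
    factors: dict  # variable name -> factor (all values here are made up)

    def score(self, variables):
        # (factor) times (variable), plus (factor) times (variable), etc.
        return self.base + sum(
            factor * variables.get(name, 0.0)
            for name, factor in self.factors.items()
        )


# Hypothetical sub-formulas, one per kind of credit file.
SCORECARDS = {
    "recent_bankruptcy": Scorecard(
        550.0, {"months_since_bankruptcy": 2.0, "utilization": -50.0}
    ),
    "thin_file": Scorecard(
        620.0, {"months_of_history": 1.5, "utilization": -80.0}
    ),
    "default": Scorecard(
        700.0,
        {"months_of_history": 0.5, "utilization": -100.0, "late_payments": -25.0},
    ),
}


def pick_scorecard(variables):
    # "if recent bankruptcy, use this formula; if thin file, use this formula..."
    if variables.get("months_since_bankruptcy", 10**6) < 36:
        return "recent_bankruptcy"
    if variables.get("num_accounts", 0) < 3:
        return "thin_file"
    return "default"


def score(variables):
    return SCORECARDS[pick_scorecard(variables)].score(variables)


established = {
    "num_accounts": 8,
    "months_of_history": 120,
    "utilization": 0.5,
    "late_payments": 1,
}
thin = {"num_accounts": 2, "months_of_history": 10, "utilization": 0.5}
print(pick_scorecard(established), score(established))
print(pick_scorecard(thin), score(thin))
```

The arithmetic really is college algebra. The hard part, as the rest of this post says, is knowing which variables and factors belong in the tables.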

The variable definitions themselves can be complicated. Think about how Fico's handling of inquiries is complicated if you aren't used to working with them. Or, think about the questions many of us have about balance-limit ratios.

The work that goes into developing any scoring model is pretty danged complicated. All of the real model-builders I've met and talked with have Masters and Doctorates in statistics, and it shows in the work. Finding the "best" way to choose from among literally thousands of possible variables, and come up with a single formula is pretty bleeding-edge stuff in some parts of the stats world.

That's also probably part of the reason why Fico is releasing NextGen models. In the howevermany years since their classic models were released, more computing power has become available, and better statistical tools have been developed. Classic models still work...but the NextGen models do an even better job at predicting delinquencies and defaults.

I have only seen one study ever released to the public which includes any information like what banks get to see when considering when/how they want to use Fico scores. I think many people here are familiar with the mocked-up report at http://www.ficoguide.com/sample_report.cfm. The "delinquency rates by Fico score" table is probably as close as you're going to get in the public eye. Everything else is considered proprietary by Fico, or uses individual institutions' internal (and therefore secret) data.

Shylock (Shylock)

Thursday, March 08, 2001 - 04:32 pm
FICO score models may be long and may take hundreds of things into consideration, but when it all boils down, anyone with the proper scoresheet (out of the 10 or 21 they use) and the ability to add could work out the proper score. It's just addition, not calculus. One point for this, four points for that, minus 2 for that ... etc.
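A rough sketch of what working from such a scoresheet would look like, assuming a purely additive sheet. The three rules and their point values are hypothetical, made up only to mirror "one point for this, four points for that, minus 2 for that".

```python
# Each scoresheet line: (description, test applied to the report, points).
# The rules and point values are invented for illustration.
SCORESHEET = [
    ("no late payments", lambda r: r["late_payments"] == 0, 4),
    ("oldest account over 10 years", lambda r: r["oldest_account_years"] > 10, 1),
    ("more than 3 inquiries in 12 months", lambda r: r["inquiries_12m"] > 3, -2),
]


def tally(report, sheet=SCORESHEET, start=0):
    """Add up the points for every scoresheet line that applies to the report."""
    total = start
    applied = []
    for description, applies, points in sheet:
        if applies(report):
            total += points
            applied.append((description, points))
    return total, applied


report = {"late_payments": 0, "oldest_account_years": 12, "inquiries_12m": 5}
total, applied = tally(report)
for description, points in applied:
    print(f"{points:+d}  {description}")
print("total:", total)
```

The addition is trivial once the sheet is in hand; the open question in this thread is whether anyone outside Fair Isaac can get the sheet.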

Christine Baker (Admin)

Saturday, March 17, 2001 - 12:16 pm
With the impending release of scores to the public, people think that we are going to look at our reports and we'll be calculating the scores ourselves soon.

I'm willing to bet some cash that this will NOT happen.

I say that the ONLY way we will see the Fair Isaac formula is subsequent to a COURT ORDER.

I'll bet $500 that we won't be able to "reverse engineer" the formula and that even if we get the formula through a court order, it will be so complicated that nobody here will be able to calculate their credit scores. It takes a computer and a tremendous amount of calculations.

I'll pay up as soon as you post the formula and your calculations and we verify the accuracy on a few reports. If you can't do so within a year, you pay me. :)

Any takers? I'm VERY sure about this.

Zeddicus Zuul Zorander (Zeddicuszuul)

Saturday, March 17, 2001 - 01:32 pm
Christine --

I won't take that bet as you've phrased it, but given 50,000 - 100,000 random credit reports (preferably in raw computer-data form, the way that the modelers get it; but it could be done using paper reports) and matching scores, any statistician worth his or her salt could do it within a couple of weeks.

If you knew in advance what the sub-scorecards are in the Fico models, and if you could pick-and-choose the credit reports some way (e.g., "I want 5,000 reports with bankruptcies within the past 36 months, plus...."), it could be done with fewer files.

If you knew exactly what the variables were, in full detail...then you could do it with a few hundred files.

That's the reason that today, Fico has agreements with the individual credit bureaus to NOT provide banks, etc. with data for research in the detail (raw data that can be linked to scores) needed to do exactly what I described.

The problem isn't that the formula is complicated (it probably isn't); it's that there are just so many variables that *could* be in there.

Christine Baker (Admin)

Saturday, March 17, 2001 - 04:02 pm
Zed, first off, I very much appreciate your postings.

You are certainly right that given access to thousands of raw data credit reports, a good statistician with a computer could reverse engineer the Fair Isaac formula. Of course he needs the corresponding FICO scores for each report.

That's not happening, not without a court order.

The variables are part of the formula. What makes it complicated is that EACH bit of data has to be "weighed" in relation to the REST of the report. Each reported account has what, 30 or 40 pieces of data?

There must be hundreds of "If ... is true, greater, less ... do that" statements in the scoring program. It all starts with determining which model to use, and branches out from there.

I once saw a SMALL part of the Fair Isaac formula, a long time ago when I attended their seminar for mortgage brokers. There was NO WAY that I could resolve that very short equation.

People think they'll get their reports and their credit scores and they'll be able to "reverse engineer" the formula.

There is absolutely NO WAY that this will happen.

"That's the reason that today, Fico has agreements with the individual credit bureaus to NOT provide banks, etc. with data for research in the detail (raw data that can be linked to scores) needed to do exactly what I described."

If it is true that Fair Isaac has these agreements, why aren't the banks suing them?

What are the banks getting from Fair Isaac in return for not suing?

I don't think many people here will be able to afford to order their report and score every day for months. Even if they do, it will NOT enable them to figure out Fair Isaac's formula.

What I am hoping for is that people will be able to collect more evidence like that at FICO is FRAUD.

And that they'll sue.

Erik (Erik)

Saturday, March 17, 2001 - 08:46 pm
Isn't it just 20 variables? In 6 months there will be a formula that will get within 5 points of the FICO score. In 12 months within 1 point. I won't bet for anything other than "I told you so!" bragging rights.

douglas pratt (Dougpratt)

Saturday, March 17, 2001 - 11:04 pm
fico scoring is quackery-- if its authors are to continue getting richer at it, the formulae will have to remain proprietary, and having researched the infrastructure of UNfair isaac's entrenchment in this industry, it will take a congressional order or presidential mandate to allow the public to know exactly how they are being judged and manipulated as they are by this thing-- a federal court order would be defied long enough for the quackware writers to change a few parameters of the formulae, and then the information we would get would be rather useless, as is knowing what one's numerical score actually is today. you can have the score, but if you don't know how it was calculated, it doesn't offer much help. shylock replied to one of my other postings that i am now seen as a higher risk in the eyes of the lenders; this is impossible, because no eyes are seeing my credit profile, no ears are hearing me explaining that i ran up $130,000 in debt renovating a house and my intention is to pay all of that off with the proceeds from refinancing it. the eyes that do see and ears that do hear are telling me FICO is giving me the shaft, coarse sandpaper elephant condoms without vaseline-:(*. being self employed requires me to take no-income verification loans, standard in the industry for many years at slight premiums in interest rate that i have always paid without complaint. the computer model isn't told to consider the loan to value ratio of a mortgage or the dollar amount of equity in the property-- if it were, i wouldn't be having trouble getting loans. the lenders are telling me that investors on the secondary markets won't buy the paper, and my credit score is the one and only reason-- they hate to be losing my business and tell me so all the time.
the alternative for me is now portfolio lenders who charge substandard borrowers the high rates of interest reflected by the actual greater likelihood that such borrowers will default based upon a review of their past conduct with respect to handling credit. buying property in the boston area requires large down payments- prices of real estate here are among the highest in the country, along with new york city and san francisco. now i am stuck paying b/c borrower interest because the locks on the adjustable mortgages i have taken to improve cash flow expired, ditto for the consumer debt, and FICO prevents me from restructuring it all into new, competitively priced tax deductible loan packages collateralized by the property that produced the debt in the first place. bankers can tell me i'm being stiffed, but money they have to lend comes at a premium normally paid by the high risk borrower sector of the industry, which is a big crowd of people out there. why lend me money at 7% when joe jones who filed bankruptcy 2 years ago will pay it back at 10%, or lose his house??

guess what, shylock?? FICO-BOUND LENDERS ARE NOT LENDING THEIR OWN MONEY!! they are intermediaries between the consumers and investors who put large sums of money into mortgage-backed securities on what is commonly called the secondary market, one which has always been considered extremely safe, like federal reserve treasury bonds. amounts of money going into these markets are astronomical, and form the basis for a large part of our nation's economic stability. the decision to mandate FICO as the major (in my case, the only) basis to make lending decisions is/was/has never been in these people's hands-- that's what they pay bankers and underwriters to do. the whole thing is a chain of participants doing business, each taking a little slice of the pie as it goes around the table. the difference between what the borrower pays and the yield the investor receives makes this possible-- 1% of $3,000,000,000,000 every month is an awful lot of money, quite enough to keep the wheels turning.

something's gone rotten in very high places for a company such as (UN)fair isaac to be manipulating an industry of this magnitude all by using secret computer models which can't even give an accurate 3 day weather forecast-- what a place to put this country's economic stability and prosperity at risk!?-:(*-- mysterious software silently taking the human element out of the equation bit by bit, all under government sanction at that-- more!?*. when the finest lawyers money can buy tell me to write my congressman first, sue later, we're not dealing with just a bunch of computer programmers out of MIT, no matter how good they are. if they had a product that really worked, they would own wall street, not be watching their stock traded there every day, getting a bit richer every time somebody applies for a mortgage or credit card.

this thing has permeated the system to the extent it sits comfortably above the law- when it fails, people suffer and can do nothing about it. it may work most of the time-- there should be provision for a machine to be turned off and responsibility handed back to a real person if it isn't working properly. not long ago, i made a rather cynical posting here comparing FICO modeling to flying a plane. one very wise airline pilot responded, explaining that this can already happen, and sometimes does. if a computer is going to crash the plane, should the pilot be allowed to do something about it, or even try?? shylock says no, let it crash, because next time the machine will save 2 planes. even so, how about saving all 3?-:)*- this is the outcome when the pilot saves one and the computer saves 2 more.

fico still sucks-- always has, always will--:(
goodnight--

Christine Baker (Admin)

Sunday, March 18, 2001 - 12:48 am
Erik: "Isn't it just 20 variables?"

What do you mean? WHAT variables?

And when I posted "Each reported account has what, 30 or 40 pieces of data?" I was WAY off.

Date of status
Last reported
Type of account
Terms
Monthly payment
Credit limit
High balance
Current balance
Recent payment
Current Status
Account status history (the delinquencies) for the last 7 years, that's 84 pieces of data
The 24 month balance history: another 24 pieces of data to calculate into the score.

I have trouble adding it up, I think 118. So if you have a report with 10 accounts (I rarely see reports with less than 15 accounts) you're looking to evaluate about 1,200 pieces of data. All related to each other in several different ways, with the added thrill of deciding how to weigh the historical data.
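For anyone who wants to check the addition, the count above works out as follows. This is only the arithmetic from the list in this post; the field names are copied from it, and the history lengths (84 and 24 months) are the ones stated there.

```python
# The ten single-value fields listed above.
SINGLE_FIELDS = [
    "Date of status",
    "Last reported",
    "Type of account",
    "Terms",
    "Monthly payment",
    "Credit limit",
    "High balance",
    "Current balance",
    "Recent payment",
    "Current status",
]
STATUS_HISTORY_MONTHS = 7 * 12   # account status history, last 7 years
BALANCE_HISTORY_MONTHS = 24      # 24 month balance history

FIELDS_PER_ACCOUNT = (
    len(SINGLE_FIELDS) + STATUS_HISTORY_MONTHS + BALANCE_HISTORY_MONTHS
)


def data_points(num_accounts):
    """Pieces of account data on a report with the given number of accounts."""
    return num_accounts * FIELDS_PER_ACCOUNT


print(FIELDS_PER_ACCOUNT)  # 118
print(data_points(10))     # 1180, i.e. "about 1,200"
print(data_points(15))     # 1770
```

So 118 per account is right, and that is before inquiries, addresses, and public records are counted.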

Don't forget to calculate the inquiries, various addresses, public records, ...

Make sure you define the various models and don't miss any of the 80 risk factors.

I can't wait to see that formula.

"I won't bet for anything other than "I told you so!" bragging rights."

Erik, that won't even buy me lunch! :)

How about Shylock?

This is one of those "win-win" bets. It would be too cool if I could calculate the scores myself.

Erik (Erik)

Sunday, March 18, 2001 - 05:34 am
After reading that NextGen brochure: it says that NextGen scores have 80 variables, which is almost twice that of the Classic scores. So the Classic scores have a little over 40. It also says the Classic scores have 10 models.

A couple months ago I started to write a program where I would enter my guess as to what the formula is and the program would compare it to the data on the credit reports (stored in a database). I got fairly far along with the framework of the program but took a break since I was busy and we don't have the scores yet. I think the reason codes give strong clues as to what the appropriate variables are. So perhaps 20 of the variables are:

DIM nTotalAmountOwed ' 01 Amount Owed on Accounts Too High
DIM nNumberOfAccountsWithDelinquency ' 02 Delinquency on Accounts
DIM nTotalRevolvingBankAccounts ' 03 Too Few Bank Revolving Accounts
DIM nTotalOpenRevolvingAccounts ' 04 Too Many Revolving Accounts
DIM nTotalOpenAccountsWithBalance ' 05 Too Many Accounts With Balances
DIM nTotalOpenFinanceCompanyAccounts ' 06 Consumer Finance Accounts
' NEED TO ADD THIS ' 07 Account payment History Too New To Rate
DIM nTotalInquiriesInPastYear ' 08 Too Many Recent Inquiries Last 12 Months
DIM nTotalAccountsOpenedInPastYear ' 09 Too Many Accounts Opened Last 12 Months
DIM nBalanceToCreditLimitRatio ' 10 Proportion of Balances to Credit Limits Too High
DIM nTotalRevolvingBalance ' 11 Amount Owed on Revolving Accounts Too High
DIM nDaysOfRevolvingHistory ' 12 Length of Revolving History Too Short
DIM nDaysSinceLastDelinquent ' 13 Time Since Delinquent is Too Recent
DIM nDaysOfCreditHistory ' 14 Length of Credit History Too Short
DIM nDaysSinceBankRevolvingAccountUpdated ' 15 Lack of Recent Bank Revolving Information
DIM nDaysSinceRevolvingAccountUpdated ' 16 Lack of Recent Revolving Account Information
DIM nDaysSinceNonMortgageAccountUpdated ' 17 No Recent Non-Mortgage Balance Information
' (nNumberOfAccountsWithDelinquency is already declared for 02) ' 18 Number of Accounts With Delinquency
DIM nNumberOfAccountsPaidAsAgreed ' 19 Too Few Accounts Paid as Agreed
DIM nDaysSinceLastCollectionOrPublicRecord ' 20 Time Since Derogatory Public Record or Collection
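The guess-and-compare idea described above could be sketched roughly like this, in Python rather than BASIC. The guessed formula, its coefficients, and the three stored records are all invented for illustration; only the two variable names are borrowed from the list. The "within 5 points" measure matches the target mentioned earlier in the thread.

```python
def evaluate_guess(formula, records):
    """Compare a guessed formula against stored (variables, actual score) pairs."""
    errors = [abs(formula(variables) - actual) for variables, actual in records]
    return {
        "mean_abs_error": sum(errors) / len(errors),
        "max_abs_error": max(errors),
        "share_within_5": sum(1 for e in errors if e <= 5) / len(errors),
    }


def my_guess(v):
    # A made-up candidate formula, not anything Fair Isaac has published.
    return (
        750
        - 10 * v["nTotalInquiriesInPastYear"]
        - 100 * v["nBalanceToCreditLimitRatio"]
    )


# Stand-in for the database of reports with known scores (fabricated values).
records = [
    ({"nTotalInquiriesInPastYear": 2, "nBalanceToCreditLimitRatio": 0.5}, 683),
    ({"nTotalInquiriesInPastYear": 0, "nBalanceToCreditLimitRatio": 0.25}, 720),
    ({"nTotalInquiriesInPastYear": 5, "nBalanceToCreditLimitRatio": 0.75}, 640),
]

result = evaluate_guess(my_guess, records)
print(result)
```

Each new guess gets run against the same records, and the one with the smallest error wins. Nothing here works without actual scores to compare against.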

--------
Damn I should have bet for a free lunch, it's a sure bet!

Christine Baker (Admin)

Sunday, March 18, 2001 - 10:15 am
Erik, we're on, lunch it is!

And I don't understand where your * 20 * for the variables comes from.

We know exactly what the factors are, for NextGen or Classic.

I already posted the DATA that goes into the calculations. You have to realize that EVERY month of history (good or bad) for EVERY account is part of that calculation.

If you think that all you need to do is deduct points for LATE payments, that's not how it works.

Zeddicus Zuul Zorander (Zeddicuszuul)

Sunday, March 18, 2001 - 04:33 pm
Christine:

The reason codes are *not* the variables themselves. For example, consider the reason code "Too many inquiries last 12 months".

From information Fico has released we know that there's more to that variable...or those variables (could be more than one that feed that code)... than just counting up the number of inquiries on a credit file in the last 12 months.

Christine Baker (Admin)

Sunday, March 18, 2001 - 05:07 pm
You're right. I was wondering what Erik meant by the "20 variables" in his earlier posting and then he referred to the score factors.

I can't even begin to think of a formula, or all the variables. I think the only way to do this is to feed the raw credit report data into a computer along with the known solution (the Fair Isaac score for each report). The computer can then "develop" the formula.

Realistically speaking, someone with daily access to scores and the raw data could probably do it.

Unfortunately most or all consumer reports do NOT contain all the data.
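As a toy version of letting the computer "develop" the formula, the sketch below fits weights by ordinary least squares to reports whose scores are already known. It assumes the best case discussed in this thread: the variables are known exactly and the formula is a plain weighted sum. The hidden weights and the 200 synthetic reports are fabricated; real scores are segmented into several models and nobody outside Fair Isaac knows the variables.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear(rows, scores):
    """Least-squares weights via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * s for r, s in zip(rows, scores)) for i in range(k)]
    return solve(xtx, xty)


random.seed(0)
# Made-up "secret" formula: base, per inquiry, utilization, per year of history.
HIDDEN = [700.0, -12.0, -90.0, 4.0]

rows = []
for _ in range(200):
    inquiries = random.randint(0, 8)
    utilization = random.random()
    years_of_history = random.uniform(0.5, 25.0)
    rows.append([1.0, inquiries, utilization, years_of_history])

# The "known solution" for each synthetic report.
scores = [sum(w * v for w, v in zip(HIDDEN, row)) for row in rows]

recovered = fit_linear(rows, scores)
print([round(w, 6) for w in recovered])
```

With the right variables and matching scores, the weights fall out almost immediately. With the wrong variables, or several hidden sub-models mixed together, the same fit only gives an approximation, which is why the amount of data needed grows so quickly.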

Christine Baker (Admin)

Sunday, March 18, 2001 - 05:23 pm
And when I went to school variables were defined such as:

aic = number of accounts, installment, current
aid = number of accounts, installment, delinquent
ais = number of accounts, installment, seriously delinquent
$ai = amount owed on installment accounts
$aid = amount past due on installment accounts

There have to be way more than 20 variables.

Zeddicus Zuul Zorander (Zeddicuszuul)

Sunday, March 18, 2001 - 08:03 pm
The only things I can think of that are sometimes missing from a consumer report that appear in the raw data stream are "KOB" codes. "KOB" = "Kind of Business" associated with a given trade or inquiry -- "auto", "real estate", "retail", etc. I've seen them on a couple of printed reports, but I think I've seen them omitted on other reports.

With the (non-Fico) models I work with, I can usually calculate scores by hand from a printed report, if the KOB's are given, or if I can make educated guesses.

And I agree that there have to be more than 20 variables. I suspect the real number in Beacon/Empirica classic scores is at least 3 times that number. That's just my guess, however.

Erik (Erik)

Sunday, March 18, 2001 - 09:39 pm
20 was my mistake. I should have said a little over 40 as I was trying to remember that quote from that NextGen brochure. I hadn't looked at that for awhile. Upon rereading it, it really says a little over 40. I only wrote the 20 from my incomplete program because it seemed like a nice round number and I hadn't really tried to make variables for many of the other reason codes.
By the way, reason code #29 is "date of last inquiry is too recent". So that would be a variable that I would write as nDaysSinceLastInquiry, which could go along with the nTotalInquiriesInPastYear that Zed was complaining about.

If you look at the reason codes:
http://www.bayhouse.com/FairIsaac-FICO-risk-factors.shtml
You will see 46 listed. Of course the word "variable" can be applied to the raw data in the reports but when Fair, Isaac is using the term (really "predictive variables") I think they mean the variables that can be deduced from the reason codes. I may be wrong but it makes sense to me.

You're right about the Kind of Business thing. That is something that consumers need to just know on their own and hope that it is coded right since the CRAs withhold that info.

Christine wrote:
"I already posted the DATA that goes into the calculations. You have to realize that EVERY month of history (good or bad) for EVERY account is part of that calculation."
Are you talking about the reason codes or something different? I'm not really sure where you are coming from here. I really think the first thing the formula does is look at the report and calculates the "predictive variables" it wants. So if you have 7 open revolving accounts on your report it will just count each one as one and the "predictive variable" of nTotalOpenRevolvingAccounts will equal 7. Then it looks at all those variables and does whatever models and calculations it uses to come out with the score.
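A small sketch of that first step, turning raw account lines into a handful of "predictive variables". The account layout (the `type`, `open`, `balance`, and `limit` keys) is an assumption made up for this example; the variable names are the ones from the earlier list.

```python
def predictive_variables(accounts):
    """Boil a list of raw account records down to a few summary variables."""
    open_revolving = [
        a for a in accounts if a["open"] and a["type"] == "revolving"
    ]
    revolving_balance = sum(a["balance"] for a in open_revolving)
    revolving_limit = sum(a["limit"] for a in open_revolving)
    return {
        # Each open revolving account just counts as one.
        "nTotalOpenRevolvingAccounts": len(open_revolving),
        "nTotalOpenAccountsWithBalance": sum(
            1 for a in accounts if a["open"] and a["balance"] > 0
        ),
        "nTotalRevolvingBalance": revolving_balance,
        "nBalanceToCreditLimitRatio": (
            revolving_balance / revolving_limit if revolving_limit else 0.0
        ),
    }


# Fabricated report: 7 open revolving accounts, 1 closed one, 1 installment loan.
accounts = [
    {"type": "revolving", "open": True, "balance": 100, "limit": 400}
    for _ in range(7)
]
accounts.append({"type": "revolving", "open": False, "balance": 0, "limit": 1000})
accounts.append({"type": "installment", "open": True, "balance": 5000, "limit": 0})

variables = predictive_variables(accounts)
print(variables)
```

This part can be done by hand from a printed report. What the scoring formula then does with a value like 7 is the part nobody here can see.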

Christine Baker (Admin)

Sunday, March 18, 2001 - 11:06 pm
Zed, there could well be 60 variables.

There are 2 types of data often missing:

1) the TYPE of account, i.e. auto, bank card, or FINANCE company account.

2) the 24 months history for the balances (and payments?) for each account.

And Erik,

"Then it looks at all those variables and does whatever models and calculations it uses to come out with the score."

Yeah, that "whatever" is the difficult part. Using your example, exactly WHAT will the 7 mean? Too many accounts, or not enough accounts?

THAT is the problem.

