Proc discrim is used to conduct discriminant analysis.

The purpose of discriminant analysis is to classify an experimental unit as being from one of two (or more) populations on the basis of observations obtained on that unit. For example, a bank might attempt to classify a new loan applicant as a 'good payer' or a 'deadbeat' on the basis of the answers given on a loan application.

The common syntax is:

```
proc discrim
data=bank
method=normal
pool=yes
slpool=0.001
posterr
out=results;
class type;
var hist balance income;
run;
```

The `data=` option specifies the data set used in the analysis.

The `method=normal` option selects discrimination based on the assumption that the data come from multivariate normal populations.

The `pool=` option can be set to `yes`, `no`, or `test`. This controls whether the covariance matrices are assumed equal in the analysis (`yes`), assumed to be unequal (`no`), or tested for equality (`test`), with the subsequent analysis performed according to the outcome of this test.

The `slpool=` option sets the significance level of the test for equality of covariance matrices when `pool=test` is used. It is otherwise disregarded.
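
As a sketch of the tested case, the call below reuses the `bank` data set and the variables from the example above and lets the procedure choose between pooled and within-class covariance matrices. The significance level of 0.05 is an assumption chosen purely for illustration. If the test is significant at that level, the within-class covariance matrices are used; otherwise the pooled matrix is used.

```
/* Illustrative only: slpool=0.05 is an assumed value, not a recommendation. */
proc discrim
data=bank
method=normal
pool=test
slpool=0.05;
class type;
var hist balance income;
run;
```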

The `posterr` option prints estimated error probabilities for the computed discrimination rule.

The `out=` option creates a new data set that contains the variables in the original one together with a new variable called `_into_`, of the same type as the `class` variable. The `_into_` variable gives the class to which the observation is assigned by the discrimination rule.
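
One way to inspect such an output data set is to cross-tabulate the true class against the assigned class. The sketch below assumes the `results` data set and the class variable `type` from the example above. Because these are the same observations that were used to build the rule, the apparent error rate is likely to be optimistic.

```
/* Cross-tabulate the true class (type) against the assigned class (_into_). */
proc freq data=results;
tables type*_into_;
run;

/* List only the observations that the rule misclassifies. */
proc print data=results;
where type ne _into_;
run;
```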

The `class` statement, which is required, specifies the variable whose values define the groups into which observations are classified.

The `var` statement specifies the variables to be used for discrimination. If omitted, all numeric variables in the data set that are not named in another statement (such as the `class` statement) are used. It is best to specify the variables explicitly to avoid the unintended use of extraneous variables in the analysis.

Sometimes one set of data is used to construct the discrimination rule and a second set is used to test the rule. This can be accomplished as follows.

```
proc discrim
data=construc
testdata=tryout
method=normal
pool=yes
slpool=0.001
posterr;
class type;
var hist balance income;
run;
```

The `testdata=` option specifies a SAS data set that is used to try out the discrimination rule found by analyzing the `data=` data set. It is assumed that the `testdata=` data set contains the same independent variables and class variable as the `data=` data set. Diagnostics of the discrimination rule on the `testdata=` data set are printed.
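
To keep the classifications of the test observations rather than only printing diagnostics, the `testout=` option can be added. The sketch below is a variation on the call above; the output data set name `scored` is hypothetical. The `testclass` statement names the variable in the test data set that holds the true class, here assumed to be `type` as in the construction data set.

```
/* scored is a hypothetical name for the classified test observations. */
proc discrim
data=construc
testdata=tryout
testout=scored
method=normal
pool=yes;
class type;
/* True class in the test data; assumed to share the name type. */
testclass type;
var hist balance income;
run;
```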

To construct a discrimination rule with `method=normal` from one data set and save it for later use, use the `outstat=` option:

```
proc discrim
data=first
outstat=calib
method=normal
.....
```

The special data set `calib` then contains the discrimination rule constructed from the data set `first`. To apply this rule to the data set `second`:

```
proc discrim
data=calib
testdata=second
method=normal
.....
```
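
The two steps can be put together as in the following sketch. It does not reproduce the elided parts of the calls above. Instead it assumes, for illustration only, the class variable `type` and the variables `hist`, `balance`, and `income` from the earlier examples, and a hypothetical output data set `scored` to hold the assignments.

```
/* Step 1: build the rule from first and save it in calib. */
proc discrim
data=first
outstat=calib
method=normal
pool=yes;
class type;
var hist balance income;
run;

/* Step 2: apply the saved rule to second; scored is a hypothetical name. */
proc discrim
data=calib
testdata=second
testout=scored
method=normal;
class type;
var hist balance income;
run;
```

Each observation in `scored` should then carry an `_into_` variable giving its assigned class, as described for the `out=` option.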

For further information see the SAS/STAT User's Guide, volume 1.

Copyright © 1997 by Jerry Alan Veeh. All rights reserved.