BMDP
BMDP is a robust statistical package written to cope with less-than-perfect
data: it handles missing data, outliers, and data that is not normally
distributed. The package is available on a number of platforms, including
Unix systems, VAXes, and personal computers. At NAU, BMDP is accessible
on Unix (Dana & Jan).
In order to run a BMDP job you must have a data file and also a command
file that consists of the instructions to BMDP on what to do with your
data. You may refer to the Editing ASCII Data document for information
on building data files. The following information details getting started
with building command files for BMDP.
BMDP instructions are based on easy-to-use, English-like commands. These
instructions are grouped into paragraphs and sentences. Each
paragraph consists of a leading slash ("/"), the paragraph name, and one or
more sentences. A sentence is a qualifier within a paragraph that clarifies
the action to be taken. All sentences end with a period. Each statistical
run may be performed with a basic set of paragraphs(1):
-
INPUT: describes how the data is formatted in your input file.
-
VARIABLE: allows you to name the variables in your data set.
-
GROUP: allows you to group your variables on specific criteria.
-
END: instructs BMDP to complete the statistical run.
All BMDP statistical runs will use the INPUT, VARIABLE and END paragraphs.
The GROUP paragraph is optional. Figure 1 shows a sample instruction set.
/input variables=3.
format=free.
/variable names are prof, time, score.
/end
Figure 1.
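The paragraph-and-sentence layout can be made concrete with a small Python sketch. Python is not part of BMDP, and the helper name `paragraph` is hypothetical. The sketch only assembles the text of the sample instruction set, adding the leading slash to each paragraph and the closing period to each sentence.

```python
def paragraph(name, *sentences):
    """Return one paragraph: a leading slash, the name, then its sentences."""
    finished = []
    for sentence in sentences:
        sentence = sentence.strip()
        # Every sentence must end in a period.
        if not sentence.endswith("."):
            sentence += "."
        finished.append(sentence)
    if not finished:
        return "/" + name
    # Indenting continuation lines is optional; it is done here for readability.
    return "/" + name + " " + "\n    ".join(finished)


# Rebuild the sample instruction set (INPUT, VARIABLE, END).
instructions = "\n".join([
    paragraph("input", "variables=3", "format=free"),
    paragraph("variable", "names are prof, time, score"),
    paragraph("end"),
])
print(instructions)
```

The resulting text could then be saved to a file such as myprog.bmd with any editor or script.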
Certain rules should be followed when writing BMDP instruction sets. These
include:
-
The paragraph name always comes first.
-
Paragraphs are separated by a slash.
-
Any given paragraph may only be used once in each problem unless otherwise
stated in the manual.
-
Except for the END paragraph, paragraphs may be in any order.
-
Every sentence must end in a period.
-
Values on a list must be separated by a comma.
-
Names may contain up to eight characters and should begin with a letter (a-z).
-
Case is ignored in the instructions.
-
Indenting is optional in the instruction set.
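As an illustration of the naming rule, the following Python sketch (not part of BMDP; `is_valid_name` is a hypothetical helper) checks only the two conditions stated above: at most eight characters, and a first character in a-z with case ignored. Any further restrictions on names are not covered here and should be checked in the manual.

```python
def is_valid_name(name):
    """Check a name against the length and first-character rules only."""
    name = name.lower()  # case is ignored in the instructions
    return 1 <= len(name) <= 8 and "a" <= name[0] <= "z"


for candidate in ["prof", "income82", "9lives", "household"]:
    print(candidate, is_valid_name(candidate))
```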
To start BMDP, please refer to your machine-specific documentation. The
general syntax for starting BMDP is:
BMDP 1d myprog.bmd myprog.lst
where 1d is the statistical routine (program) to run, myprog.bmd is your
instruction set, and myprog.lst is the name of the file to which the output
is written.
The following is a list of some of the BMDP programs and the type of
statistical analysis that they perform:
-
1D simple data description
-
2D detailed data description including frequencies
-
3D t tests
-
5D histograms and univariate plots
-
6D Bivariate (scatter) plots
-
7D One- and Two-way Analysis of Variance
-
9D multiway description of groups
-
4F two-way and multiway frequency tables
-
4M Factor analysis
-
2R stepwise regression
-
AR derivative free nonlinear regression
-
2T Box-Jenkins time series analysis
-
2V Analysis of variance and covariance with repeated measures
The INPUT Paragraph
The INPUT paragraph is used to describe your data to BMDP. Three input
commands are essential for a working run:
-
Variables: specifies the number of variables for each case.
-
Format: specifies the layout of the data values on each record.
-
File or Unit: identifies the location of the data file on disk.
For example, if you have a file on disk called mydata.dat and you have
three variables per case in the file, you may use the instructions shown
in Figure 2 to access the file.
/input variables=3.
format is free.
file is 'mydata.dat'.
Figure 2.
The following optional commands may also be used:
-
Title: specifies a title for the problem.
-
Case: specifies the number of cases to read in.
-
Multiple: allows the program to read in multiple records per case.
The VARIABLE Paragraph
This paragraph is used to describe your variables to BMDP. There are a
number of commands that will be useful in doing so. You should familiarize
yourself with the following:
-
Name: assigns names to your variables.
-
Use: selects a subset of your variables for analysis.
-
Missing: assigns missing value codes to your variables.
-
Max, Min: restrict the range of data used in an analysis.
-
Label: allows the labeling of cases in a data set.
Figure 3 shows a sample of the VARIABLE paragraph. It lists three variables:
id, sex, and income. It also sets up a missing value code for sex.
/variables names=id, sex, income.
missing=(sex)9.
Figure 3.
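The effect of a missing value code and of the Max and Min limits can be pictured with a short Python sketch. It is not BMDP code. It assumes that values equal to the missing code, or outside the stated range, are simply left out of the analysis; the helper name `usable_values` and the sample data are hypothetical.

```python
def usable_values(values, missing_code=None, minimum=None, maximum=None):
    """Keep only values that are neither coded missing nor out of range."""
    kept = []
    for value in values:
        if missing_code is not None and value == missing_code:
            continue  # coded as missing, e.g. 9 for sex
        if minimum is not None and value < minimum:
            continue  # below the Min limit
        if maximum is not None and value > maximum:
            continue  # above the Max limit
        kept.append(value)
    return kept


sex = [1, 2, 9, 2, 1, 9]  # hypothetical data; 9 is the missing value code
print(usable_values(sex, missing_code=9))
```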
Some procedures will require that your data be grouped or sorted before
the run may be accomplished. The group command may be used in the VARIABLE
paragraph to accomplish this.
Transforming your Data
The TRANSFORM paragraph is used to create new or transform old variables.
The use of arithmetic operators, BMDP arithmetic functions, summary functions
and conditional statements will allow you to manipulate your data. The
general syntax of a transform sentence is an algebraic equation. The variable
name given on the left side of the equal sign is assigned the value specified
or computed on the right hand side of the equation. For example, if you
are working with student test scores and want to produce an overall score
from a verbal and math test score you could use the following:
score = 2 * verbal + math.
For a full explanation of the use of algebraic equations in BMDP, please
refer to your manual.
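The same calculation can be traced case by case in a short Python sketch. This is only an illustration of what the transform sentence computes, not BMDP syntax; the test scores and the helper name `add_score` are made up.

```python
# Hypothetical cases, each with a verbal and a math test score.
cases = [
    {"verbal": 52, "math": 61},
    {"verbal": 47, "math": 70},
]


def add_score(case):
    """Return a copy of the case with score = 2 * verbal + math added."""
    new_case = dict(case)
    new_case["score"] = 2 * new_case["verbal"] + new_case["math"]
    return new_case


scored = [add_score(case) for case in cases]
for case in scored:
    print(case)
```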
You may also conditionally process variables in the TRANSFORM paragraph.
The if - then statement is used to do this. For example, to transform
an interval variable to ordinal you could do the following:
if(income EQ 4000) then (income=4).
In this example, whenever a value of 4000 is seen in the variable income,
it is changed to a 4. You may also recode non-numeric variables to numeric
using the if statement. For example, to recode sex from non-numeric
to numeric you could use the following:
if (sex eq char(M)) then (sex=1).
if (sex eq char(F)) then (sex=5).
This will change all occurrences of M to 1 and F to 5 in the variable sex.
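For readers who find it easier to follow in a general-purpose language, the two recodes above behave roughly like the Python sketch below. It is an illustration rather than BMDP code, and the helper name `recode` and the sample cases are hypothetical.

```python
def recode(case):
    """Apply the income and sex recodes to a copy of one case."""
    new_case = dict(case)
    # if (income EQ 4000) then (income=4).
    if new_case["income"] == 4000:
        new_case["income"] = 4
    # if (sex eq char(M)) then (sex=1).
    if new_case["sex"] == "M":
        new_case["sex"] = 1
    # if (sex eq char(F)) then (sex=5).
    if new_case["sex"] == "F":
        new_case["sex"] = 5
    return new_case


print(recode({"sex": "M", "income": 4000}))
print(recode({"sex": "F", "income": 2500}))
```

Values that match none of the conditions are left unchanged.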
To work with temporary variables, you must use the add statement
in your VARIABLE paragraph. You may then calculate new variables in the
TRANSFORM paragraph.
Producing Descriptive Statistics
The 1D program provides standard descriptive statistics. The necessary
paragraphs are INPUT and VARIABLE. In the VARIABLE paragraph you should
include the use command to list the variables that you wish to include
in the statistical run. For each variable, 1D prints the frequency, mean,
standard deviation, standard error of the mean, coefficient of variation,
the smallest and largest values with their z-scores, and the range.
Figure 4 shows a sample 1D output.
VARIABLE          TOTAL             STANDARD   ST.ERR  COEFF. OF    S M A L L E S T       L A R G E S T
NO. NAME      FREQUENCY     MEAN   DEVIATION  OF MEAN  VARIATION    VALUE   Z-SCORE     VALUE   Z-SCORE     RANGE
  2 race           1473    1.186       0.471   0.0123    0.39742    1.000     -0.39     3.000      3.85     2.000
  4 sex            1473    1.594       0.491   0.0128    0.30818    1.000     -1.21     2.000      0.83     1.000
 15 income82       1473   16.510      20.016   0.5215    0.21239    1.000     -0.77    99.000      4.12    98.000
Figure 4.
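The columns of Figure 4 are related by simple formulas, sketched below in Python as an illustration (this is not how BMDP itself is implemented). The sketch assumes the sample (n-1) standard deviation; the function name `describe` and the sample data are hypothetical. The race row of Figure 4 is consistent with these formulas: 0.471 divided by the square root of 1473 is about 0.0123, 0.471 divided by 1.186 is about 0.397, and (1.000 - 1.186) / 0.471 is about -0.39.

```python
import math
import statistics


def describe(values):
    """Compute 1D-style descriptive statistics for one variable."""
    n = len(values)
    mean = statistics.mean(values)
    st_dev = statistics.stdev(values)  # assumption: sample (n-1) formula
    smallest, largest = min(values), max(values)
    return {
        "frequency": n,
        "mean": mean,
        "st_dev": st_dev,
        "st_err": st_dev / math.sqrt(n),       # standard error of the mean
        "coeff_var": st_dev / mean,            # coefficient of variation
        "smallest": smallest,
        "smallest_z": (smallest - mean) / st_dev,
        "largest": largest,
        "largest_z": (largest - mean) / st_dev,
        "range": largest - smallest,
    }


summary = describe([1, 2, 3, 4, 5])  # hypothetical data
for name, value in summary.items():
    print(name, value)
```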
Frequency Tabulations
At times it is advantageous to produce a frequency distribution for variables.
This is performed by the 2D program under BMDP. Like the 1D program, the
INPUT and VARIABLE paragraphs are required to produce the run. Figure 5
shows output from the 2D program.
PAGE 4 BMDP2D 23-AUG-89 08:24:59

************
* income82 *                        MAXIMUM     99.0000000
************                        MINIMUM      1.0000000
                                    RANGE       98.0000000       H
VARIABLE NUMBER . . . . . .     15  VARIANCE   400.6536560       H
NUMBER OF DISTINCT VALUES .     20  ST.DEV.     20.0163345       H
NUMBER OF VALUES COUNTED. .   1473  (Q3-Q1)/2    3.5000000       HH   EACH 'H'
NUMBER OF VALUES NOT COUNTED     0  MX.ST.SC.         4.12      HHH   REPRESENTS
                                    MN.ST.SC.        -0.77      HHH   56
                                                               HHHH   COUNT(S)
                                      95% CONFIDENCE           HHHH
            ESTIMATE    ST.ERROR       LOWER       UPPER       HHHH               H
MEAN      16.5098438   0.5215347  15.4868135  17.5328751       HHHH               H
MEDIAN    13.0000000   0.2886753                              L-------------------------------U
MODE      15.0000000
                                                              EACH '-' ABOVE = 5.0000
                                                              L= 0.0000
                                                              U= 155.0000
                                                              CASE NO. OF MIN. VAL. = 146
                                                              CASE NO. OF MAX. VAL. = 304
                                                              Q1= 9.0000000
               VALUE  VALUE/S.E.                              Q3= 16.0000000
SKEWNESS        3.62       56.66                              S-= -3.5064907
KURTOSIS       11.97       93.77                              S+= 36.5261803

EACH '.' BELOW = 1.0000
S            Q      Q                   S
-    M       1   M MM                   +                                                              M
.....I...........E.OE..................................................................................A
     N           D DA                                                                                  X
                 I EN

               PERCENTS                    PERCENTS                    PERCENTS                    PERCENTS
VALUE  COUNT  CELL    CUM   VALUE  COUNT  CELL    CUM   VALUE  COUNT  CELL    CUM   VALUE  COUNT  CELL    CUM
   1.     20   1.4    1.4      6.     30   2.0   13.8     11.     89   6.0   40.1     16.    168  11.4   83.0
   2.     32   2.2    3.5      7.     41   2.8   16.6     12.     65   4.4   44.5     17.    120   8.1   91.1
   3.     44   3.0    6.5      8.     55   3.7   20.4     13.    108   7.3   51.8     18.     51   3.5   94.6
   4.     31   2.1    8.6      9.    104   7.1   27.4     14.     85   5.8   57.6     98.     78   5.3   99.9
   5.     47   3.2   11.8     10.     97   6.6   34.0     15.    206  14.0   71.6     99.      2   0.1  100.0
Figure 5.
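The CELL and CUM percentages in Figure 5 are plain relative and cumulative frequencies. The Python sketch below is an illustration only, not BMDP code: it rebuilds the data from the counts printed in Figure 5 and recomputes both columns. The helper name `frequency_table` is hypothetical, and rounding to one decimal place is an assumption based on the printed output.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, cell percent, cumulative percent) rows."""
    counts = Counter(values)
    total = len(values)
    rows = []
    running = 0
    for value in sorted(counts):
        running += counts[value]
        rows.append((
            value,
            counts[value],
            round(100 * counts[value] / total, 1),  # CELL percent
            round(100 * running / total, 1),        # CUM percent
        ))
    return rows


# Counts for income82 as printed in Figure 5.
figure5_counts = {
    1: 20, 2: 32, 3: 44, 4: 31, 5: 47,
    6: 30, 7: 41, 8: 55, 9: 104, 10: 97,
    11: 89, 12: 65, 13: 108, 14: 85, 15: 206,
    16: 168, 17: 120, 18: 51, 98: 78, 99: 2,
}
data = [value for value, count in figure5_counts.items() for _ in range(count)]

table = frequency_table(data)
for row in table:
    print(row)
```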
Crosstabulations
Program 4F produces multiway frequency tables that help in summarizing
categorical data. Statistics including the mean, standard deviation, frequency
of values, and percent missing are printed for each variable in the table.
However, there are a number of limitations and options that you will need
to learn in order to use this procedure effectively. By default, 4F can
handle only 10 distinct values per variable. This limitation can be overcome
with the codes and cutpoints commands (CATEGORY paragraph). Please
refer to your manual for a full explanation of these commands. In addition to
the standard INPUT and VARIABLE paragraphs, you will need to include the
TABLE paragraph and, optionally, the CATEGORY paragraph. The TABLE paragraph
is used to set up the columns and rows for your tables. For example, if
you have two variables, sex and degree, and wish to produce a crosstabulation
of them, you could use the following TABLE paragraph:
/TABLE column=sex. row=degree.
This command will produce the output shown in Figure 6.
PAGE 4 BMDP4F 23-AUG-89 08:36:37

VARIABLE         STATED VALUES FOR      GROUP  CATEGORY       INTERVALS
NO.  NAME     MINIMUM MAXIMUM MISSING   CODE INDEX NAME       .GT.    .LE.
---- -------- ------- ------- ------- ------ ----- -------- ------ -------
   4 sex                               1.000     1 *1
                                       2.000     2 *2
  11 degree                            0.000     1 *0
                                       1.000     2 *1
                                       2.000     3 *2
                                       3.000     4 *3
                                       4.000     5 *4
                                       9.000     6 *9
NOTE: CATEGORY NAMES BEGINNING WITH * WERE CREATED BY THE PROGRAM.
------------------------------------------------------------------------------
************************
* TABLE PARAGRAPH 1    *
************************
***** OBSERVED FREQUENCY TABLE 1

degree         sex
------        ------
              1      2         TOTAL
------------------------------------
  0         162    238    |      400
  1         290    474    |      764
  2          15     39    |       54
  3          80     95    |      175
  4          49     28    |       77
  9           2      1    |        3
--------------------------|---------
TOTAL       598    875    |     1473

ALL CASES HAD COMPLETE DATA FOR THIS TABLE.
MINIMUM ESTIMATED EXPECTED VALUE IS 1.22

STATISTIC             VALUE    D.F.     PROB.
PEARSON CHISQUARE    25.581       5    0.0001

NUMBER OF INTEGER WORDS OF STORAGE USED IN PRECEDING PROBLEM 2006
CPU TIME USED 4.770 SECONDS
Figure 6.
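To see where the Pearson chi-square, its degrees of freedom, and the minimum estimated expected value in Figure 6 come from, the Python sketch below recomputes them from the observed frequency table using the standard formulas. It is an illustration, not BMDP's own code, and the helper name `pearson_chi_square` is hypothetical. The results should come out close to the values BMDP reports.

```python
def pearson_chi_square(table):
    """Return (chi-square, degrees of freedom, minimum expected count)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, min_expected


# Observed frequencies from Figure 6: rows are degree (0, 1, 2, 3, 4, 9),
# columns are sex (1, 2).
observed = [
    [162, 238],
    [290, 474],
    [15, 39],
    [80, 95],
    [49, 28],
    [2, 1],
]

chi2, df, min_expected = pearson_chi_square(observed)
print(round(chi2, 3), df, round(min_expected, 2))
```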
If you have any questions about using BMDP at NAU, please call Academic
and Personal Computing at x1511.
Using BMDP on Unix
You will use the following syntax to start the BMDP program:
bmdp <program_name> [<input_file>] [<output_file>]
where program_name is the name of the BMDP program/module that you wish
to use (e.g., 1D), input_file is the file containing your BMDP instructions
(the default is terminal input) and output_file is an optional file name
for BMDP to write your output to.
Example:
The input file 7d.bmd is used to control the run and the output is written
to the file 7d.lst.
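If BMDP runs are launched from a script, the command line can be assembled as in the following Python sketch. It only builds the command text and does not run BMDP. It assumes the arguments are positional, so an output file can be given only when an input file is also given; the helper name `bmdp_command` is hypothetical.

```python
import shlex


def bmdp_command(program_name, input_file=None, output_file=None):
    """Build the argument list for: bmdp <program_name> [<input_file>] [<output_file>]."""
    if output_file is not None and input_file is None:
        # Assumption: arguments are positional, so output needs an input file.
        raise ValueError("an output file requires an input file")
    args = ["bmdp", program_name]
    if input_file is not None:
        args.append(input_file)
    if output_file is not None:
        args.append(output_file)
    return args


print(shlex.join(bmdp_command("1d", "myprog.bmd", "myprog.lst")))
print(shlex.join(bmdp_command("1d")))
```

The returned list could be handed to a process launcher such as Python's subprocess.run on a machine where bmdp is installed.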
BMDP may also be run interactively. To do so, type bmdp
at the Unix prompt followed by the name of the statistical module that
you wish to use. For example, to start BMDP using the 1D module,
you would use the following command:
bmdp 1d
When you are queried for the instruction language file and the output file
name, simply press the Enter or Return key for each question.
Notes:
-
Where one of the above examples shows text enclosed in square brackets
("[ ]"), this is an optional item. If it is enclosed in angle brackets
("< >"), you must supply the necessary information.
1. Not all paragraphs and sentences available
for your use will be discussed in this document. You should refer to your
BMDP manual for a full description of any commands that are not fully described
in the ITS documentation.