
Learner's Guide to SPSS 11.0
This appendix will provide you with a brief overview of the Statistical Package for the Social Sciences™ (SPSS). For many years, SPSS has been the most commonly used program for quantitative data analysis in the social sciences. It has gone through many versions for both the Windows and Macintosh platforms. This appendix will use SPSS 11.0 along with data from the 2000 General Social Survey. If you're using a different version of SPSS or a different data set, you'll need to make some adjustments. This appendix introduces you to the overall logic and application of SPSS. Whatever version you have, consult the user manual for whatever additional assistance you need.
In addition, several books have recently been written to introduce social researchers
to SPSS. One is Earl Babbie, Fred Halley, and Jeanne Zaino, Adventures in Social Research,
Thousand Oaks, CA: Pine Forge Press, 2000.
When you first open SPSS 11.0 for Windows on your computer, the window shown in Figure 1 below will appear. You have several options to start with. You can click on "Run tutorial" and click OK. This is a good way to get familiar with some of the basic features of SPSS. You also have the option to create your own data. You would choose this option if you had collected your own survey data and were ready to transfer your respondents' answers into an SPSS data spreadsheet. If you are using the GSS data, you will need to ask SPSS to open a data file already formatted and ready for analysis. GSS files will often come as SPSS files (recognizable by their .sav extension) or as SPSS portable files (saved as .por files).
Figure # 1
The next step is to load some data into the program.
When "open an existing data source" is selected, click OK. You will have to browse your computer and select the location in which your GSS file is saved. Notice the bottom of this window: you can choose to skip this dialog window in the future. If you do, then when you start SPSS the window in the background will appear and you can simply click on "open" and select the GSS data file you want to analyze. The advantage of this dialog box is that after you first open the GSS file, SPSS will list your file in the place where "More files…" is highlighted. In other words, if you need to work on this file again you can simply select it with your mouse and click OK. SPSS will open your file in one simple step.
Let me guide you through the process of opening the GSS file included in your textbook package. Simply close the window in figure 1. Look at the top row in the figure below. It contains several menus: , , , , , , , , , and . We'll use these menus throughout this appendix.
Notice that the first letter of each menu name is underlined (e.g., ). This means you can activate that menu by holding down the (control) key and typing the underlined letter. Thus if you press , you activate the menu. You can accomplish the same thing by simply clicking the menu name. In either fashion, you would end up with the screen shown in figure 2.
Figure # 2
As you can see, the menu offers several possible actions, but right now we're only interested in opening a file. Click to indicate that you want to load a data file into the program. As you may recognize from the notation to the right of the command, we could have accomplished the same thing by striking without even going into the menu.
Next you'll see a dialogue box asking you to select among several options. Open the "data" menu. Your steps in browsing your computer to open the data file are illustrated in figures 3 and 4 below.
Figure # 3
Select the data set you want by double-clicking it or by single-clicking it and then clicking . We've selected the 2000 General Social Survey (GSS) data set (located on the zip drive under the folder "2000"), containing hundreds of variables collected from 2,817 respondents. The research was conducted by the National Opinion Research Center at the University of Chicago to establish a representative sample of U.S. residents 18 and older. You, of course, may be working with a different data set, such as the one that came with this book. In your case, open the CD drive to open your data file.
Figure # 4
It is important to note that SPSS is set to open .sav files by default. In my case, the GSS 2000 data file was created as a .por file. I had to change the file type to "SPSS Portable (*.por)" as illustrated in Figure 4 above. When my GSS2000.por file appeared, I made sure it was highlighted and clicked "open". See Figure 5 below.
Figure # 5
When you modify your data set, creating recoded variables for example, you'll probably want to save those changes for later use. Realize that any unsaved changes will stay in effect throughout this SPSS session, but when you exit the program (see the last section of these SPSS Guidelines) you can lose all your changes. It's wise to save changes as soon as you're sure you want to keep them.
Saving an altered file is hardly rocket science. First, select the window or the window (not the window which contains all statistical jobs you asked SPSS to perform on your data). Then, under the menu, select . Alternatively, you can simply press . From now on, when you open this file, it will contain your alterations.
Realize that when you save the file in this fashion, the changed file replaces the original one. So if you madly deleted data or altered variables using their original names, you'll have put the original file forever out of reach.
If you wish to save the original file as well as your changes, choose under the menu (see Figure 6.1). This time, SPSS will ask you to supply a name for the data set about to be saved. Use some name other than that of the original data set. Also, pay attention to where on your disk it is saved so you can find it later on.
Figure 6.1
You may also want to save your data file as an SPSS file if it is a portable file or other type of data spreadsheet. In figure 6.2 I saved my GSS2000.por as a GSS2000.sav which will save me some time in processing this file in the future. There won't be a need to wait for conversion time.
Figure 6.2
By default, the window that appears is the "Data Editor" window. No matter what data file you open, the setting of this spreadsheet is always the same. Each of the columns represents a variable, such as the respondent's gender, age, or attitude about abortion. Each row represents a particular respondent. Thus each cell of the matrix stores an item of information about a person. In Figure 2, all the cells are empty.
Once you've instructed SPSS to open your data set, the original matrix will be filled with data, the way it is in Figure 6 below. Notice that the row just above the matrix now contains the names of the variables comprising the data set: , , and so forth. SPSS uses abbreviated labels, each no more than eight characters long.
Figure 6
Notice in Figure 6 the upper-right cell, which links together case number one (respondent number 1) and the variable labeled . The first respondent has a value of 2 on the variable . But what on earth does that mean?
This is where SPSS 11.0 is different from earlier versions. In addition to the "Data View" sub-window, the Data Editor window has a "Variable View" sub-window. You can simply click on "Variable View" at the bottom of the Data Editor window. In Figure 7 below, you can see how this window is organized. This time, the rows represent the variables and the columns are the various attributes associated with each variable. The columns are as follows: Name, Type, Width, Decimals, Label, Values, Missing, Columns, Align, and Measure. Name is the abbreviated name of the variable (it is never longer than eight characters). The Type of the variable is often "numeric" but could be "string" if you wanted to input words as data instead of numbers. Label is the description of the variable and indicates more clearly what the question on the questionnaire was about. In the column "Values" you can find the labels associated with each possible answer for each variable. Notice that you can increase the width of any of these columns on the Variable View of the Data Editor. This feature is particularly useful if you want to read a variable label in its entirety.
Figure # 7
You can find the meaning of a particular variable label in several ways. The first, and easiest, is by finding the variable name on the variable view. On the data view, you can double-click the variable name in the column heading and the variable view will open automatically and highlight the row of the variable you just double-clicked. Here's another way to learn about variables in the data set. Go to the menu above the data matrix and select the first option: . See figure 8 below for an illustration. A list of all the variables and the way they were formatted will appear and you can simply select the variable of your choice from this window by clicking on it. Here I selected the variable.
Figure # 8
Variables are listed in the order they were imported from the GSS site. However, if you want to see them listed alphabetically, open the menu and select the submenu. Then as in figure 9 below, select from the option in the sub-window. Then click OK twice. Note: The list in the left column may consist of the variable names instead of the abbreviated labels, but you can change this easily. In the menu, select . In the tag, find the section on in the right-hand column. Click . You'll have to reload the data set, but it will be worth the effort, because you'll be able to track down the abbreviated name you're looking for.
Figure # 9
Now reopen the variable information window from the menu. All variables are listed alphabetically. Notice the words next to : "EVER BEEN DIVORCED OR SEPARATED". While this is still abbreviated, you may figure out that it represents whether this person has been divorced or separated. You can also view the value labels instead of the numeric value into which they have been coded. See figure 10 below. Simply make sure you are on the data view, select from the menu and click on . A check mark will appear next to this menu selection and you will be able to read directly what your respondents' answers were for each variable. For instance, we now can see that respondent number 1623 is female, 50 years old, and married to a man who is 51 years old. To turn it off, by the way, simply open and click again. Notice that the check mark indicates whether the feature is on or off.
Figure # 10
In figure 11 you can see the numeric values "1" and "2" for replaced by "Yes" and "No" respectively.
Figure # 11
You can obtain the full wording most easily from the GSS Web site. The codebook index (by variable name) is located at http://www.icpsr.umich.edu/GSS/. Click on "Mnemonic". That will take you to a list of variables beginning with "A". You can see any other variable by simply clicking on the first letter of the variable you wish to find.
Let's take a close look at the variable , which we will use in our analysis further on. Now you can see more clearly what this variable represents. Respondents were asked a battery of questions concerning their attitudes toward abortion, specifically the conditions (e.g., rape, danger of birth defects) under which they felt a woman should be able to obtain one legally. In this case, they were asked if they would support a woman's right to a legal abortion as a purely personal choice: "for any reason."
Besides presenting the actual wording of the question, this Web page also reports the answer categories and the results of several surveys that asked the question over the years. Notice that a 1 "punch" stands for saying "yes". Now we know that the first person in the data set feels that a woman should be able to choose an abortion for any reason.
Let's go back to variable . To find out what "0" means, let's learn how to examine variable codes within SPSS. Double-click in the column heading. The variable view window opens and the variable divorce is automatically highlighted. Now click on the little square in the value label cell for abany as shown in figure 12 below.
Figure # 12
As you can see in Figure 13, "0" stands for "NAP", which means "not applicable". In other words, this particular question was not asked of some respondents.
Figure # 13
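To make the idea of value labels concrete, here is a minimal Python sketch of what SPSS does when it shows labels instead of numeric codes. It is only an illustration, not how SPSS itself is implemented: the codes 0, 1, and 2 and their labels are the ones mentioned in this appendix, while the list of coded answers and the function name are invented for the example.

```python
# Value labels for one variable. The codes 0, 1 and 2 and their labels
# are the ones mentioned in the text; this is not a full GSS codebook.
value_labels = {0: "NAP", 1: "Yes", 2: "No"}

# Toy column of coded answers (invented for illustration, not GSS data).
coded_answers = [1, 2, 0, 1, 2]


def apply_labels(codes, labels):
    """Replace each numeric code with its label; leave unknown codes as-is."""
    return [labels.get(code, code) for code in codes]


labeled_answers = apply_labels(coded_answers, value_labels)
print(labeled_answers)
```

Codes that have no label (a hypothetical 9, say) are passed through unchanged here, which mirrors what you see on the screen when a value has no label attached.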
Another method to learn about the value labels of a particular variable is to select from the menu. A list of variables appears alphabetically, as you can see in Figure 14 below. First you'll see that we have pretty much the same information we obtained before. Notice the column to the left, however. It's the beginning of a list of all the variables in the data set. (You can use the scroll bar to see the rest of the list.) Find the name of a variable you're interested in and click it. You'll instantly get the variable and value labels.
Again you can view all value labels for abany by simply selecting this variable. The advantage of this subcommand is that you also have the option to find the variable quickly on your SPSS data editor by clicking on . The column will be selected and highlighted instantly on your data view window.
Figure # 14
Person 3, for example, was not asked this question. By asking different sets of questions of different people in the sample, the researchers can collect data for hundreds of variables without driving any of the respondents to suicide or homicide.
Now that we've seen what the abbreviated variable labels and numerical code categories stand for, we're ready to examine some public opinion. Think about the question we've looked at so far. How do you suppose people in the United States feel about a woman's right to an abortion? That is to say, what percentage do you suppose said "yes" and what percentage said "no"? To start finding out, select ... from the menu in the general menu (see Figure 15).
Figure # 15
This command will get you a list of variables to choose from, as illustrated in Figure 16.
Figure # 16
Now you can double-click a variable label or else single-click it and then click the right-pointing triangular arrow. Either of these actions will move labels from the left-hand to the right-hand column. Figure 17 shows the results of three variables being selected this way.
Figure # 17
Next we click the OK button. SPSS will click and whirr as it determines the distributions of responses to each of the three variables in our example. It will then produce the frequency tables shown in Figure 18.
Figure # 18
Notice that SPSS has now opened a new window. The first was a data window and the new one is labeled Output. As you continue with SPSS, you'll often work back and forth between these two windows; often the program alternates them automatically.
The left-hand frame in Figure 18 presents an outline of the results. Click on any item in the outline to fill the right-hand frame with the data you've requested. Here is where you can easily copy and paste from SPSS to an MS Word document. Click on the outline on the left-hand side. All the frequency tables are now selected. Select from the menu. Now open your Word document and select from the menu. Figures 19 and 20 illustrate this process. In this case I was only interested in copying and pasting the frequency table for .
Figure # 19
Figure # 20
In Figure 18 the right-hand side presents two tables. The first one summarizes the three variables we chose originally. All we are told here, however, is the number of respondents with valid responses and those without. The second table gives the distribution of data for the question of whether a woman should have the right to an abortion for any reason. In addition to "yes" and "no", the table reports three other possibilities:
NAP: "Not applicable" (the question was not asked)
DK: Respondents who said they "Don't know"
NA: Respondents who were asked but gave "No answer"
In the second table's Frequency column you can learn how many respondents fall under each of the categories. The Percent column puts the information into a more useful form by showing the percentage represented by each category. The most useful column is Valid Percent. This column tells us that of the 1,768 respondents who gave a valid response, 39.9 percent said "yes" and 60.1 percent said "no". We might interpret these results by saying that opinions on this issue are almost evenly divided.
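The difference between the Percent and Valid Percent columns is easier to see in a small worked example. The following Python sketch is only an illustration under stated assumptions: the responses are toy data invented for the example (not the GSS figures), and we assume that only "yes" and "no" count as valid answers while NAP, DK, and NA are treated as missing.

```python
from collections import Counter

# Toy responses, invented for illustration (not the GSS figures).
responses = ["yes"] * 4 + ["no"] * 6 + ["NAP"] * 8 + ["DK"] + ["NA"]

# Assumption: only "yes" and "no" count as valid answers.
VALID = {"yes", "no"}


def frequency_table(values, valid):
    """Return {value: (frequency, percent, valid_percent)}.

    Percent is based on all cases; valid percent is based only on the
    cases with a valid answer and is None for the missing categories.
    """
    counts = Counter(values)
    total = len(values)
    valid_total = sum(n for value, n in counts.items() if value in valid)
    table = {}
    for value, n in counts.items():
        percent = 100 * n / total
        valid_percent = 100 * n / valid_total if value in valid else None
        table[value] = (n, percent, valid_percent)
    return table


table = frequency_table(responses, VALID)
for value, (n, pct, valid_pct) in table.items():
    shown = "" if valid_pct is None else f"{valid_pct:.1f}"
    print(f"{value:>4} {n:>3} {pct:5.1f} {shown:>5}")
```

In this toy data, "yes" makes up 20 percent of all cases but 40 percent of the valid answers, which is why the Valid Percent column is usually the one to read.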
By scrolling down the window or using the outline in the left-hand frame, you can check the results for the other variables. For now let's move along to more complex analyses.
The frequency distributions we've just undertaken are called univariate analyses (analyzing one variable at a time). Now we'll turn to bivariate analyses (two variables at a time).
Let's stay with the issue of "abortion for any reason". We've seen that U.S. residents are about evenly divided on the issue. What do you suppose accounts for this difference? People often guess that women would be more likely than men to support abortion as a woman's right. Let's see how to determine the accuracy of that guess.
Return to the menu, but check this time. This brings you a somewhat different dialog box, as indicated in Figure 21.
Figure # 21
We are now going to set up a percentage table involving two variables: and . The table will have both columns and rows. While there are many ways to construct such a table, we're going to assign the categories of (male and female) to the columns. Then we'll look at the opinions on within each of those categories. In the logic and language of SPSS, that makes the "row variable" and the "column variable". To assign categories, select variable labels from the list and drag them to the appropriate windows on the right. Figure 22 shows this step.
Figure # 22
To find a variable label in the list, you can either scroll through the list or click any label in the list and then type the variable you want. It may take a little experimentation to discover how quickly you must type to have it work.
Thus far, we've told SPSS to organize the table like this:
|            | Men | Women |
| Approve    |     |       |
| Disapprove |     |       |
To complete our request, we have to tell SPSS how to percentage the data. In this case, we'll ask for the percentage of men who approve of abortion and the percentage who disapprove, with the two percentages totaling 100 percent. Then we'll ask for the corresponding percentages of women. In other words, ask SPSS to "percentage down" the columns. The option provides a means for us to indicate that preference. Click the button in the dialog box (See Figure 23).
Figure # 23
When the dialog box opens, the box will already be checked. Leave it that way. In the section on , click the box. That instructs SPSS to percentage down the columns. Click to complete this dialog, and then click to launch the request for a crosstab. Once SPSS has completed the table, we'll be returned to the window, as in Figure 24.
Figure # 24
Let's see what the table tells us. We wanted to find out if men and women differed in their attitudes about whether a woman should be able to choose an abortion just because she wanted one. The table suggests that there's no appreciable difference. Virtually the same proportion of men (39.9%) and women (39.8%) say a woman should have the right to an abortion for any reason.
Let's try another variable that could affect people's attitudes toward abortion: political orientation. In the GSS, represents a standard item that asks respondents to characterize their political views as something between "Extremely Liberal" and "Extremely Conservative". Figure 25 shows the impact of this variable on attitudes toward abortion.
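If you'd like to see what "percentaging down" the columns amounts to, the following Python sketch computes column percentages for a two-variable table. It is a simplified illustration, not SPSS's own procedure: the cases are toy data invented for the example, and the function name is hypothetical.

```python
from collections import Counter

# Toy (column category, row category) pairs, invented for illustration.
cases = (
    [("male", "yes")] * 4 + [("male", "no")] * 6
    + [("female", "yes")] * 8 + [("female", "no")] * 12
)


def column_percentages(pairs):
    """Percentage down: within each column category, cells sum to 100."""
    cell_counts = Counter(pairs)
    column_totals = Counter(column for column, _ in pairs)
    return {
        (column, row): 100 * n / column_totals[column]
        for (column, row), n in cell_counts.items()
    }


col_pct = column_percentages(cases)
for (column, row), pct in sorted(col_pct.items()):
    print(f"{column:>6} {row:>3} {pct:5.1f}%")
```

Because each column is percentaged on its own total, the two groups can be compared directly even though the toy data contain twice as many women as men.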
Figure # 25
Because there are so many categories for political views, you may have to use the scroll bar at the bottom of the window to move back and forth across the table. Notice that we've scrolled all the way to the right in Figure 25.
The impact of political views on abortion attitudes is pretty clear. Overall, liberals support abortion more than do conservatives. The only exception to the pattern is that people who are "Extremely Liberal" are less supportive than those who are "Liberal". This result appears a lot, perhaps because of the different ways people interpret the two political terms.
It's often useful to recode variables with many categories, reducing the number to something more manageable. In the present case, we might want to combine the categories in polviews to make three: "Liberal", "Moderate", and "Conservative."
We can combine categories by hand from the kind of table presented in Figure 25. For example, we can easily calculate that 447 of the respondents in the table considered themselves liberals (62 + 203 + 182). Of those, 247 supported a woman's right to an abortion for any reason (42 + 110 + 95). Dividing the second number by the first tells us that 55 percent of the liberals supported abortion. A similar calculation tells us that 152 of the 553 conservatives (27 percent) were supportive. The 42-percent support among moderates fits neatly between the liberals and conservatives.
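The hand calculation above can be checked in a few lines of Python. This is just the arithmetic from the paragraph, using the cell counts quoted there; the variable and function names are our own.

```python
# Cell counts quoted in the paragraph above.
liberal_supporters = 42 + 110 + 95   # supporters in the three liberal categories
liberal_total = 62 + 203 + 182       # all respondents in those categories
conservative_supporters = 152
conservative_total = 553


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


liberal_pct = percent(liberal_supporters, liberal_total)
conservative_pct = percent(conservative_supporters, conservative_total)

print(f"Liberals: {liberal_pct:.1f}%")
print(f"Conservatives: {conservative_pct:.1f}%")
```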
Combining categories like this makes it easier to use the variable in further analyses. However, we should have SPSS create a new, recoded variable so that we don't have to undertake the job by hand each time. To do this, we must first return to the window. If you're in the window, you can simply click the window icon in the task bar at the bottom of your screen or select the SPSS Data Editor tag from the menu (see Figure 26 below).
Figure # 26
Once you've returned to the window, click the menu and move your pointer to the option. When you do that, you'll be presented with another choice, as Figure 27 shows.
Figure # 27
SPSS offers two options for recoding: Either it will modify the data contained under the existing variable label () or it will create a new variable for the modified results (). Choose , because the first option will destroy the original data.
Next you'll see a large dialog box like the one in Figure 28.
Figure # 28
Initially the right-hand frame will have nothing in it. To create the situation shown in Figure 29:
Figure # 29
To continue the process, click . This will bring you the dialog box shown in Figure 30.
Figure # 30
To tell SPSS how to create from , we identify values of and indicate what values they should get in . Let's start by creating a "Liberals" category that includes everyone with a "1", "2", or "3" on . We'll give the new category the value "1."
In Figure 31, we've chosen the "Range" option and indicated that anyone with a value of "1" through "3" on should be assigned a "1" on . Make sure you see where those instructions are entered in the dialog box.
Figure # 31
When you click the button, the transformation instruction is transferred to the field on the right-hand side of the dialog box, as you can see in Figure 32.
Figure # 32
We'll use a different option to create a new "Moderate" category. As you recall, they were scored "4" on . We'll give them a "2" in by entering the old and new values in the and fields. When we click again, the new instruction is added to the field. Now take a moment to figure out how you would create a "Conservative" category, transforming scores of 5, 6, and 7 on into a score of 3 on . Once you've done that, you should have the dialog box shown in Figure 33. All that remains now is to click , which will return you to the earlier dialog box, and then click .
Figure # 33
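The recoding rules we've just entered can also be written out as a small Python function, which may help you check your understanding of them. This is only a sketch: the sample scores are toy data, and returning None for any score outside 1 through 7 is an assumption made for this example rather than something the dialog box shows.

```python
def recode_polviews(value):
    """Collapse a 1-7 polviews score into the three polviewr categories."""
    if 1 <= value <= 3:
        return 1  # Liberal
    if value == 4:
        return 2  # Moderate
    if 5 <= value <= 7:
        return 3  # Conservative
    return None   # assumption: any other score is treated as missing


# Category names for the new values, as described in the text.
polviewr_labels = {1: "Liberal", 2: "Moderate", 3: "Conservative"}

# Toy scores, invented for illustration.
polviews = [1, 4, 7, 3, 5, 9]
polviewr = [recode_polviews(score) for score in polviews]

for old, new in zip(polviews, polviewr):
    print(old, "->", new, polviewr_labels.get(new, "missing"))
```

Note that the original list is left untouched and a new list is created, which is the same reason we chose to recode into a different variable in SPSS.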
Let's tidy up our new variable. First return to the window. Next, scroll across the list of variables to the far-right end. SPSS places each new variable at the end of all the other variables. Since you just created a new variable called polviewr, SPSS created a new column located last on your spreadsheet. When you find , double-click the variable label at the head of the column. This will open up the window and polviewr will be automatically selected at the bottom of your variable list. See Figure 34 below.
Figure # 34
Click on the right-hand side of the cell located in the column and row. A little square will appear in the cell, as shown in Figure 35. You can then reduce to 0 the number of decimals for each value of . In other words, you can convert each 1.00 score to simply 1.
Figure # 35
Now click on the right-hand-side button in the column and the row. A Value Labels dialog box will appear as shown in Figure 36. Now give names to the new category values:
Repeat the process to assign "Moderate" to the value of "2" and "Conservative" to "3", being sure to click each time. When you're done, the dialog box should look like Figure 36. Click , then .
Figure # 36
Then click on the right-hand-side button in the column and the row. The dialog box shown in Figure 37.1 will appear. Type 9 in the first space available under Discrete Missing Values. You have just told SPSS that any value of 9 in the data set for polviewr should be considered a missing answer and excluded from statistical computation.
Figure 37.1
Finally you can change the measurement type for polviewr as shown in figure 37.2. I selected Ordinal for polviewr since the values can be ranked from low to high level of conservatism.
Figure 37.2
Now when you select and scroll through the list of variables, you'll find a new entry in the list: polviewr. Choose it to see the frequency distribution generated by our new categories (see Figure 38).
Figure # 38
Since we have gone to all this trouble to make our analysis simpler, let's see if it worked. Let's use to reexamine the relationship between political orientations and attitudes toward abortion. Use to create a table with and . Figure 39 illustrates what you should get.
Figure # 39
Notice how much easier it is to read this table, compared with the one presented in Figure 25. We see that 55 percent of the liberals, 42 percent of the moderates and 28 percent of the conservatives support a woman's right to an abortion for any reason. (It's good to round off the decimal points in percentages like these, since they're based on samples, which only provide estimates of populations in the first place.)
Bivariate tables are typically only the beginning of quantitative data analysis. For example, you might want to see if the observed relationship between politics and abortion holds equally for men and women. SPSS makes it a simple matter to satisfy your curiosity about such matters.
Return to and specify a third variable as shown in Figure 40.
Figure # 40
Notice that we've simply transferred the third variable, , into the bottom field in the dialog box. Press to see the result, illustrated in Figure 41.
Figure # 41
In a sense, this new table splits the one shown in Figure 39 into two parts. The top half shows the relationship between and for men, the bottom half shows the same relationship for women. We can see immediately that the original relationship is replicated for each of the gender categories.
In the far-right column, the summary statistics show you the relationship between gender and support for abortion. Overall (i.e., forgetting about political orientations), an equal 41 percent of men and women support a woman's right to an abortion for any reason, an interesting similarity. (Notice that I've rounded off the figures, 40.5% and 40.8%, presented in the table.) It seems that there is no effect on . The sex of the respondent does not matter for this GSS question.
Comparing men and women in the other columns of the table tells us that sex has little impact no matter what a person's political orientation is. Women are more supportive among liberals; men are slightly more supportive among moderates and among conservatives. None of the differences are very large, however.
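The logic of the three-variable table, one two-variable table for each category of a third variable, can be sketched in Python as follows. This is a simplified illustration with toy data invented for the example; the function name is hypothetical, and only two political categories are used to keep the data short.

```python
from collections import Counter, defaultdict

# Toy records: (layer category, column category, row category).
records = (
    [("male", "Liberal", "yes")] * 3 + [("male", "Liberal", "no")] * 2
    + [("male", "Conservative", "yes")] * 1 + [("male", "Conservative", "no")] * 4
    + [("female", "Liberal", "yes")] * 6 + [("female", "Liberal", "no")] * 4
    + [("female", "Conservative", "yes")] * 2 + [("female", "Conservative", "no")] * 8
)


def layered_column_percentages(rows):
    """Build one column-percentaged table per layer category."""
    cell_counts = Counter(rows)
    column_totals = Counter((layer, column) for layer, column, _ in rows)
    tables = defaultdict(dict)
    for (layer, column, row), n in cell_counts.items():
        tables[layer][(column, row)] = 100 * n / column_totals[(layer, column)]
    return dict(tables)


layers = layered_column_percentages(records)
for layer, cells in layers.items():
    print(layer)
    for (column, row), pct in sorted(cells.items()):
        print(f"  {column:>12} {row:>3} {pct:5.1f}%")
```

In this toy data the relationship between political views and opinion is identical within each layer, which is what it means to say that a relationship is replicated across the categories of the third variable.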
SPSS allows you to go beyond trivariate tables, though they grow increasingly difficult to read and analyze. To experiment with this possibility, click the button near the bottom of the dialog box to add new of variables to the table.
In the previous example, I casually remarked that the percentage differences were not very large. This was a subjective assessment of the substantive significance of the differences.
As you know, tests of statistical significance can determine the likelihood that relationships observed in a sample are merely an artifact of sampling error rather than a reflection of a real difference in the population from which the sample was drawn. Let's take a look at how SPSS offers us the use of those tests.
Return to the dialog box, via . At the bottom of the box, click a button marked . Figure 42 illustrates the results.
Figure # 42
As you can see, SPSS offers several summary statistics, three of which you'll recognize from this textbook: chi square, lambda, and gamma. Recall that chi square is appropriate to nominal variables such as and , so let's use them to see how we can use SPSS to work with chi square.
Click the box. Then click and enter and in the appropriate places in the dialog box. In addition to the regular percentage table, SPSS now provides an additional table, shown in Figure 43.
Figure # 43
If you've had a statistics course, you'll recognize many of the tests presented in this table. For our purposes, let's focus on the first row of the results, the "Pearson Chi-Square". The third column tells us the probability that sampling error alone could have generated a relationship at least as strong as the one we've observed, if men and women in the whole population were exactly the same in their attitudes toward abortion. Specifically, it tells us that the probability is .972, or about 97 chances in 100. This probability is extremely high. Thus, the chi square test is consistent with what we had concluded subjectively from our crosstabulations: we have no evidence that men and women differ in their support for a woman's right to an abortion.
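For readers who want to see where a Pearson chi square and its probability come from, here is a minimal Python sketch for a table with two rows and two columns. It is an illustration under stated assumptions, not a reproduction of SPSS's output: the counts are toy data, the function name is our own, and the shortcut used for the probability is valid only for tables with one degree of freedom.

```python
import math


def pearson_chi_square_2x2(table):
    """Return (chi_square, p_value) for a 2x2 table of counts.

    Expected counts are row total * column total / grand total.
    The p-value shortcut below holds only for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Toy counts, invented for illustration: rows are yes/no, columns are two groups.
no_difference = [[40, 48], [60, 72]]   # both groups are 40 percent "yes"
clear_difference = [[10, 20], [20, 10]]

print(pearson_chi_square_2x2(no_difference))
print(pearson_chi_square_2x2(clear_difference))
```

When the two groups have identical percentages, chi square is zero and the probability is 1; the further the observed counts move from the expected ones, the larger chi square becomes and the smaller the probability.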
The relationship between and was much stronger. Let's see how chi square evaluates that relationship. Repeat the above procedure, changing to . Notice that you don't have to select again, any more than you have to select . SPSS maintains those specifications until you shut down the program. When you start it up again, you'll have to specify such preferences again. And, of course, you can turn them off any time you no longer desire them.
Figure # 44
See Figure 44 above for the chi square evaluation of and . Notice that the significance in this case is reported as ".000". SPSS presents only the first three decimal places of this calculation. Hence, the likelihood of the observed relationship being simply a product of sampling error isn't exactly zero (it could happen), but the chances are not very high that it did. Specifically, the probability is less than .001, or less than one chance in a thousand, which is a commonly used standard for statistical significance. Thus, we conclude that the relationship we've observed in this carefully selected sample very likely represents something that exists in the larger population.
Thus far we've been examining nominal and ordinal data, which constitute the bulk of social science data. SPSS can also help you work with interval and ratio data.
For example, you may have heard that highly educated people tend to have fewer children than do those with less education. Let's use SPSS to see if it's so. In the GSS, these variables are and . Under , select and, when asked, . (SPSS can undertake more complex correlational analyses, but we'll keep it simple for this introduction.) In the dialog box, select and and click . That will produce the result shown in Figure 45.
Figure # 45
As you can see, there is a correlation of -.210 (or a "negative correlation of .210") between the two variables. The negative correlation means that as the years of education increase, the number of children decreases.
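To show what lies behind a correlation coefficient, the following Python sketch computes Pearson's r directly from its definition. The two lists are toy data invented for the example (they will not reproduce the -.210 reported above), and the variable names are descriptive stand-ins rather than GSS variable names.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariation divided by the product of spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy data, invented for illustration.
years_of_education = [8, 10, 12, 14, 16]
number_of_children = [4, 3, 3, 2, 1]

r = pearson_r(years_of_education, number_of_children)
print(f"r = {r:.3f}")
```

The sign of r is negative here because higher values on one list tend to go with lower values on the other, the same pattern the SPSS result describes.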
Of course, this analysis cannot determine the causal direction, so we could also say that as the number of children increases, the amount of education completed decreases. Both interpretations make sense and probably apply in some cases. Some young people have to cut their education short to accommodate the demands of parenthood, and those who keep going to school may have to delay parenthood and have fewer children once they get started.
Whatever the explanation for the relationship, SPSS informs us that the correlation is significant at the 0.01 level. In other words, sampling error could account for a correlation like this one less than once in a hundred times.
Although we entered only two variables in this analysis, SPSS will accept as many at a time as you want and will create a correlation matrix in which every variable is correlated with every other variable. Experiment with this possibility.
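If you want to see what a correlation matrix contains, the following Python sketch computes Pearson correlations by hand for three small, made-up variables. The variable names and values are assumptions for illustration only; they are not drawn from the GSS and will not reproduce the -.210 reported above.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical respondents (illustrative values only).
data = {
    "education": [8, 10, 12, 12, 14, 16, 16, 18],
    "children": [4, 3, 3, 2, 2, 1, 2, 0],
    "age": [60, 45, 50, 38, 41, 30, 35, 27],
}

# Every variable is correlated with every other variable.
matrix = {a: {b: pearson_r(data[a], data[b]) for b in data} for a in data}

for name, row in matrix.items():
    print(name, [round(value, 3) for value in row.values()])
```

Each variable correlates perfectly with itself, so the diagonal is 1, and the matrix is symmetric: the education-by-children entry equals the children-by-education entry.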
Regression analysis builds on the logic of correlation and creates equations that predict values of one variable based on values of others. Here's how we could represent the relationship between and as a regression equation. Under , select , choose from among the alternatives offered. This will present you with the dialog box presented in Figure 46.
Figure # 46
Let's use the logic of accounting for the number of children people have; thus, is our dependent variable, and is the independent variable we'll use to account for differences in the number of children. Enter the two variables into the dialog box as shown in Figure 46 above. Click , to get the result shown in Figure 47.
Figure # 47
SPSS will present you with three tables of calculations, but here we are only interested in the third one, . In fact, we're only interested in the first column of this table, . The first of these, the , represents the value of the dependent variable (number of children) when the value of the independent variable (years of education) is zero. Statisticians sometimes refer to this as the y-intercept or the point where the line crosses the y-axis, when the regression line is plotted on a graph.
The B value associated with the independent variable (-.121) indicates how much the dependent variable changes with each added unit of the independent variable. In our example, this is the change in the number of children we should expect for each added year of education.
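Stated as an equation, the regression looks like this:
Number of children = 3.407 + (-.121 × years of education)
Suppose a person has 10 years of education. We would predict she or he has 3.407 + (-.121 × 10) = 2.197 children.
For college graduates with 16 years of education we'd predict they have 3.407 + (-.121 × 16) = 1.471 children.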
Clearly, these estimates represent statistical averages, because no one can have 2.197 or 1.471 children. Still, if you were to bet on the number of children people had and knew only their education, this equation would be your best guide for betting. If you could make a lot of bets on this basis, you'd be a winner overall.
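The arithmetic behind these predictions can be sketched in a few lines of Python. The slope (-.121) comes from the discussion above; the intercept is an assumption inferred by working backward from the predicted value of 2.197 children at 10 years of education, so check it against your own SPSS output.

```python
SLOPE = -0.121  # B for years of education, as reported in the text

# Assumption: the intercept is inferred from the prediction quoted for
# 10 years of education (2.197 children), not read from the SPSS table.
INTERCEPT = round(2.197 - SLOPE * 10, 3)

def predicted_children(years_of_education):
    """Predicted number of children for a given number of years of education."""
    return INTERCEPT + SLOPE * years_of_education

for years in (10, 16):
    print(years, "years of education:", round(predicted_children(years), 3))
```

Run with 10 and 16 years, the sketch reproduces the two estimates discussed above, which is a useful check that the inferred intercept is consistent with both.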
To explore regression further, try adding another independent variable. SPSS will provide you with a new y-intercept and coefficients for each of the independent variables. Be sure to interpret positive and negative signs correctly.
Chapter 6 discussed the creation of composite measures, such as indexes and scales. This section looks briefly at how to use SPSS to create a simple index.
Without reviewing the logic of index construction, let's create an index of sexual
permissiveness including the following GSS variables:
: sex before marriage
: sex with person other than spouse
: homosexual relations
In each of these items, respondents were asked whether the action was
Given the format of these three items, we can create a composite index quite simply. Although the values 1-4 used to represent the answers to these questions are merely labels (just as we used "1" for male and "2" for female), we can in this case take advantage of their numerical quality. In each of these items, the higher the numerical code, the higher the level of sexual permissiveness. If we add the values respondents received on the three items, the possible totals range from 3 to 12, with 12 representing the highest and 3 the lowest degree of sexual permissiveness.
We can now generate the index by using the menu option, as illustrated in Figure 48. Enter the information by typing or by selecting the variable names from the list and clicking the plus sign in the keypad provided in the dialog box (see Figure 48).
Figure # 48
When you're through, click in the dialog box. SPSS will create a new variable, , in your data set and will assign the appropriate values to each of the respondents. In the window, scroll to the far right and find the new variable in the last column. Scroll up and down to see the values assigned to respondents. Those with no values in the new column were missing data on the three items used to construct it.
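The logic of this additive index, including the way missing answers carry through to the new variable, can be sketched in Python as follows. The function and argument names are hypothetical stand-ins for the three GSS items, and the respondents are invented.

```python
def permissiveness_index(item_a, item_b, item_c):
    """Sum three items coded 1-4; return None if any answer is missing."""
    items = (item_a, item_b, item_c)
    if any(value is None for value in items):
        # A respondent missing any one item gets no index score.
        return None
    return sum(items)

# Hypothetical respondents; None stands for a missing answer.
respondents = [
    (4, 4, 4),     # most permissive answer on every item
    (1, 1, 1),     # least permissive answer on every item
    (3, None, 2),  # one item unanswered
    (2, 1, 4),
]

scores = [permissiveness_index(*answers) for answers in respondents]
print(scores)
```

Scores run from 3 to 12, and the third respondent has no score, which corresponds to the empty cells you see when you scroll through the new column.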
For a more comprehensive view of the new variable, run the frequency distribution for (Figure 49).
Figure # 49
Having created a composite measure such as this one, it's always good to validate the scores if possible. That is, if the index scores truly distinguish levels of sexual permissiveness, then those scores should predict the answers people gave to other questions. For example, we might wonder if attitudes toward abortion are related to sexual permissiveness. We can find out by running a cross-tabulation of the index and, say, .
The result of this validation effort is presented in Figure 50. Notice that this table uses a somewhat different format than those we've created earlier. Given the large number of categories comprising , it's difficult to fit the table on the computer screen (and in this book). Thus, I have made the row variable, made the column variable, and requested that the table be percentaged by row rather than column. Thus we read this table "down", whereas we've been reading earlier ones "across".
Figure # 50
The relationship between these two variables is extremely strong and consistent. Of those with a score of 3 on the index (representing the lowest level of sexual permissiveness), only 19 percent support a woman's right to an abortion for any reason. This percentage increases steadily as index scores increase, reaching 67 percent among those with a score of 12 on the index.
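Percentaging by row simply means dividing each cell by its row total. The Python sketch below illustrates this with hypothetical counts chosen so that the lowest and highest index scores reproduce the 19 and 67 percent figures cited above; the middle row and all of the counts themselves are invented.

```python
def row_percentages(table):
    """Convert a table of counts into percentages of each row total."""
    result = {}
    for row_label, counts in table.items():
        total = sum(counts.values())
        result[row_label] = {
            column: round(100 * count / total, 1)
            for column, count in counts.items()
        }
    return result

# Hypothetical counts: rows are index scores, columns are abortion attitudes.
counts = {
    3: {"yes": 38, "no": 162},
    8: {"yes": 90, "no": 110},
    12: {"yes": 134, "no": 66},
}

percentages = row_percentages(counts)
for score, row in percentages.items():
    print(score, row)
```

Because each row sums to 100 percent, you compare the "yes" column down the rows to see how support changes as index scores rise.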
Creating an index from variables that do not permit such a simple addition of code values is
a little more involved. To illustrate, let's create an index of where respondents stand on the
issue of guns. Two items in the GSS are relevant:
gunlaw: favor or oppose gun permits
It makes sense that those who have a gun and oppose requiring permits for owning guns are the most pro-gun, while those without a gun of their own and who favor gun permits would be the most anti-gun. Notice, however, that the pro-gun position is represented by a "2" on and a "1" on . Thus, we can't simply add the values. Here's how to generate a simple index from these two items.
Let's create a new variable, , for which higher scores indicate greater support for guns. To start this process, return to . Type in the and give everyone a starting score of "0", as in Figure 51. Click to create the variable. Then return to and change the "0" to "progun + 1" as indicated in Figure 52.
Figure # 51
Figure # 52
We're not going to add a point to everyone's index score, however. Click the button near the bottom of the dialog box, so we can specify the conditions under which we want to add a point. Next, click the button beside the phrase "include if case satisfies condition". Then create the condition shown in Figure 53. By doing this, we're telling SPSS to give an additional point to anyone who said they oppose gun permits (i.e., a "2" on ).
Figure # 53
Click to return to the earlier dialog box. Then click to instruct SPSS to take the action specified. When SPSS tells you that you're about to change an existing variable, say "yes."
Select again. Notice that the earlier instruction to add a point is still there. Leave it, but click in order to modify the condition. Change it to specify those who said they owned a gun ("1" on ) by indicating " = 1" as the condition. Click , then , then "yes", as before. Those who had a score of 0 for favoring gun permits will now get 1 point (for a total of "1") if they own a gun; they still have 0 points if they don't have a gun. Those who scored a point for opposing gun permits will get another point if they own a gun (a total of "2") but will stay at 1 point if they don't have a gun. The resulting index is made up of the scores "0", "1", and "2."
There's only one problem with the index as it stands. Since everyone started with 0 points, those who didn't answer one or both of these questions will end up with a score of zero, thus seeming to oppose guns. The final step in creating this index involves culling out those with missing data.
First, let's create a "missing data" code. We'll use "99". Return to . In the first dialog box, type " = 99". Click to specify the condition: "" as shown in Figure 54. You need to first select the "" function and select gunlaw and click on the arrow. Click and , then repeat the procedure for "."
Figure # 54
If you examine the response possibilities for , however, you'll find that 23 people refused to answer and were coded "3". Return to and assign an index value of 99 for anyone with " = 3."
As a final step, we're going to recode the 99. Select , but this time choose . Once you reach the dialog box, convert the 99 to a as illustrated in Figure 55. Enter 99 as the old value, click '' then click the button.
To complete the action, click and . The index is now complete. You can check it out by running . To reassure yourself further, run a cross-tabulation between the two items to verify that the correct number of people received each of the scores.
Figure # 55
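The whole sequence of steps for the gun index can be summarized in a short Python sketch. It assumes the coding described above for gunlaw (1 = favor permits, 2 = oppose). The name owngun and its codes (1 = owns a gun, 2 = does not, 3 = refused) are assumptions made for illustration. Instead of assigning 99 and then recoding it, the sketch returns None directly for anyone with missing data.

```python
def progun_index(gunlaw, owngun):
    """Score 0-2, higher meaning more pro-gun; None when data are missing.

    gunlaw: 1 = favor gun permits, 2 = oppose (as described in the text).
    owngun: hypothetical name; assumed 1 = owns a gun, 2 = does not, 3 = refused.
    """
    if gunlaw is None or owngun is None or owngun == 3:
        return None  # plays the role of the recoded 99
    score = 0                # everyone starts at zero
    if gunlaw == 2:
        score += 1           # one point for opposing gun permits
    if owngun == 1:
        score += 1           # one point for owning a gun
    return score

# Hypothetical respondents as (gunlaw, owngun) pairs.
examples = [(2, 1), (1, 2), (2, 2), (1, 1), (None, 1), (2, 3)]
for gunlaw, owngun in examples:
    print(gunlaw, owngun, progun_index(gunlaw, owngun))
```

Checking a handful of combinations this way mirrors the cross-tabulation check suggested above: each pair of answers should map to exactly one index score.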
With the improvement of computer graphics, SPSS now offers many options for presenting data in nontabular formats. Let's explore a few of these, beginning with simple frequency distributions.
Figure 56 presents the distribution of GSS data on religious affiliation () as a pie chart. You can create this by (1) selecting under , (2) choosing , then (3) specifying as the variable to portray. Select "". Before clicking , click on the button and type the title of your graph in the dialog box. Here I typed "Pie Chart of Religious Affiliation."
Figure # 56
Figure 57 shows the results of this operation. As you can see, the pie chart is small and shows every religious category. In this form it is not very useful and requires some simple formatting. We need to collapse the slices that are too small to read into one larger category. Double-click on the graph in your output window and a graph dialog box will appear. Select from the menu. The dialog box shown in Figure 58 appears. Select " " (note that you can change this percentage if you want to include more categories under the new collapsed one). In this same dialog box, select so that the percentage of each slice is indicated on the graph. Click and close the window.
Figure # 57
Figure # 58
Figure 59 shows the result of this formatting procedure. Only the "Catholic", "None" and "Protestant" slices remain unchanged. All other categories are collapsed under the "Other" pie slice. There are many options to explore and I invite you to experiment on your own in order to polish statistical visual representations in your papers. When you are ready to import an SPSS graph to your paper, simply select this graph from the outline menu on the left side of the output window. A frame around the graph will indicate that you have selected it. Then choose "" from the menu. Open your Word document and paste the graph wherever you want.
Figure # 59
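The collapsing step used to tidy the pie chart amounts to pooling every category that falls below a chosen percentage of the total. Here is a minimal Python sketch of that idea; the counts, the category labels, and the 5 percent threshold are all hypothetical.

```python
def collapse_small_slices(counts, threshold_pct):
    """Pool categories below threshold_pct of the total into 'Other'."""
    total = sum(counts.values())
    collapsed = {}
    other = 0
    for label, count in counts.items():
        if 100 * count / total < threshold_pct:
            other += count          # too small to show as its own slice
        else:
            collapsed[label] = count
    if other:
        collapsed["Other"] = other
    return collapsed

# Hypothetical frequencies, not GSS results.
religion_counts = {
    "Protestant": 540,
    "Catholic": 240,
    "None": 140,
    "Jewish": 20,
    "Buddhist": 10,
    "Hindu": 10,
    "Muslim": 10,
    "Orthodox": 30,
}

slices = collapse_small_slices(religion_counts, threshold_pct=5)
print(slices)
```

Raising the threshold pools more categories into "Other", which is the trade-off the dialog box lets you adjust.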
If you're on a diet that rules out pies, see Figure 60 for a bar graph of .
Figure # 60
Ratio variables, such as the number of years of education, might be presented as line graphs. See Figure 61.
Figure # 61
These are just a few of the graphic options available to you in SPSS. Experiment with them to find the form of presentation most appropriate to your purposes.
Often, you will use SPSS to undertake quantitative analyses for a term paper, thesis, or other project. Although you can retype the results of SPSS into your paper, you can also take advantage of some energy-saving options. Depending on your word-processing system, you may have to experiment a bit.
As shown earlier, it is very easy to copy and paste from SPSS outputs to Word documents. Simply make sure you have selected the objects you want to copy (which should then be framed) and use the command from SPSS and then in the Word document of your choice.
Though the easiest strategy is to copy and paste from SPSS to Word, you can also export your statistical results. To try making a hard copy of a graph, create the pie chart for . Click the resulting graph. As explained above, you'll see a box appear around it, indicating that it has been selected by the computer. Then in the menu select . Figure 62 illustrates the resulting dialog box.
Figure # 62
You have several options here. You can export your output document with charts, without charts, or exclusively the charts of your output window. For our purposes, export the . You can then either change the name of the export file you are going to create or accept (and remember) the name and location SPSS has proposed. Again, for present purposes, choose to export the and choose as the export format. Once you've done all this, click .
Run your word-processing system and open the document that desperately needs this graph. Click where you want the graph and select and then from the menu (this procedure may be different if you are using a word processor other than MS Word). Browse your computer until you find your JPEG file. Remember to change the option to to view all documents and not exclusively Word documents.
To try making a hard copy of a table, create a frequency distribution for . You can export tables with the same procedure you used to export graphs. However, you will lose the formatting of the tables you export. I suggest that you choose the Export Format option. This format best preserves the layout of your SPSS tables. Then open your Word document, select File from the Insert menu, and browse until you find your output file. You should be rewarded with something like the table I've put in Figure 63.
Figure # 63
In all this, you may also want to take advantage of SPSS's multitude of table formats. To explore these, choose in SPSS and click on the tab. Once there, click the various options under and SPSS will give you a sample layout in the field to the right of the list. If you find a format that interests you, leave it highlighted when you close the dialog box and then create a new table. It will be done in the format you've specified.
As much as you may come to love SPSS, you'll have to quit the program eventually. Go to the menu and select . SPSS will respond with a question that asks whether you want to save the file you've created. If you give it a name and a disk location for saving it, you'll be able to open it later on and retrieve any data created in your analyses. If you've just been practicing SPSS, you'll probably want to say "".
If you've changed the data set, for example by creating a recoded variable, SPSS will ask if you want to save the changes. Unless you want to get rid of the changes, say "". However, you should only alter the data file if you have permission to do that. If, for example, you're sharing a file with others in your class, it may not be appropriate for you to save your changes. Discuss this with your instructor if in doubt.
SPSS will now close. Have fun.