Learner's Guide to SPSS 11.0

Getting Started
Opening a Data File
Saving Changes
Getting Around with SPSS Windows
Frequency Distributions
Cross-Tabulations
Recoding Variables
Multivariate Tables
Tests of Statistical Significance
Correlation and Regression
Creating Indexes
Graphics
Making Copies of Results for a Paper
Shutting Down



This appendix will provide you with a brief overview of the Statistical Package for the Social Sciences™ (SPSS). For many years, SPSS has been the most commonly used program for quantitative data analysis in the social sciences. It has gone through many versions for both the Windows and Macintosh platforms. This appendix will use SPSS 11.0 along with data from the 2000 General Social Survey. If you're using a different version of SPSS or a different data set, you'll need to make some adjustments. This appendix introduces you to the overall logic and application of SPSS. Whatever version you have, consult the user manual for whatever additional assistance you need.

In addition, several books have recently been written to introduce social researchers to SPSS. One is Earl Babbie, Fred Halley, and Jeanne Zaino, Adventures in Social Research, Thousand Oaks, CA: Pine Forge Press, 2000.

Getting Started

When you first open SPSS 11.0 for Windows on your computer, the dialog box shown in figure 1 below will appear. You have several options to start with. You can click on "Run tutorial" and click OK. This is a good way to get familiar with some of the basic features of SPSS. You also have the option to create your own data. You would choose this option if you had conducted your own survey and were ready to transcribe your respondents' answers into an SPSS data spreadsheet. If you are using the GSS data, you will need to ask SPSS to open a data file already formatted and ready for analysis. GSS files will often come as SPSS files (recognizable by their .sav extension) or as SPSS portable files (saved as .por files).

Figure # 1
Opening SPSS

The next step is to load some data into the program.

Opening a Data File

When "open an existing data source" is selected, click OK. You will have to browse your computer and select the location in which your GSS file is saved. Notice the option at the bottom of this window: it lets you skip this dialog window in the future. If you do, then when you start SPSS, the window in the background will appear and you can simply click on "open" and select the GSS data file you want to analyze. The advantage of keeping this dialogue box is that after you first open the GSS file, SPSS will list your file in the place where "More files…" is highlighted. In other words, if you need to work on this file again you can simply select it with your mouse and click OK. SPSS will open your file in one simple step.

Let me guide you through the process of opening the GSS file included in your textbook package. Simply close the window in figure 1. Look at the top row in the figure below. It contains several menus: File, Edit, View, Data, Transform, Analyze, Graphs, Utilities, Window, and Help. We'll use these menus throughout this appendix.

Notice that the first letter of each menu name is underlined (e.g., File). This means you can activate that menu by holding down the ALT key and typing the underlined letter. Thus if you press ALT+F, you activate the File menu. You can accomplish the same thing by simply clicking the menu name. In either fashion, you would end up with the screen shown in figure 2.

Figure # 2
Opening the File Menu

As you can see, the File menu offers several possible actions, but right now we're only interested in opening a file. Click Open to indicate that you want to load a data file into the program. As you may recognize from the notation to the right of the command, we could have accomplished the same thing by striking CTRL+O without even going into the File menu.

Next you'll see a dialogue box asking you to select among several options. Open the "data" menu. Your steps in browsing your computer to open the data file are illustrated in figures 3 and 4 below.

Figure # 3
Picking a Data Set

Select the data set you want by double-clicking it or by single-clicking it and then clicking Open. We've selected the 2000 General Social Survey (GSS) data set (located on the zip drive under the folder "2000"), containing hundreds of variables collected from 2,817 respondents. The research was conducted by the National Opinion Research Center at the University of Chicago to establish a representative sample of U.S. residents 18 and older. You, of course, may be working with a different data set, such as the one that came with this book. In your case, open the CD drive to open your data file.

Figure # 4
Changing the File Type

It is important to note that SPSS is set to open .sav files by default. In my case, the GSS 2000 data file was created as a .por file. I had to change the file type to "SPSS Portable (*.por)" as illustrated in figure 4 above. When my GSS2000.por file appeared, I made sure it was highlighted and clicked "open". See figure 5 below.

Figure # 5
Choosing a SPSS Portable Data Set



Saving Changes

When you modify your data set, creating recoded variables for example, you'll probably want to save those changes for later use. Realize that any unsaved changes will stay in effect throughout this SPSS session, but when you exit the program (see the last section of these SPSS Guidelines) you can lose all your changes. It's wise to save changes as soon as you're sure you want to.

Saving an altered file is hardly rocket science. First, select the Data View window or the Variable View window (not the Output window which contains all statistical jobs you asked SPSS to perform on your data). Then, under the File menu, select Save. Alternatively, you can simply press CTRL+S. From now on, when you open this file, it will contain your alterations.

Realize that when you save the file in this fashion, the changed file replaces the original one. So if you madly deleted data or altered variables using their original names, you'll have put the original file forever out of reach.

If you wish to save the original file as well as your changes, choose Save As under the File menu (see Figure 6.1). This time, SPSS will ask you to supply a name for the data set about to be saved. Use some name other than that of the original data set. Also, pay attention to where on your disk it is saved so you can find it later on.

Figure 6.1
Saving a SPSS portable Data Set

You may also want to save your data file as an SPSS file if it is a portable file or other type of data spreadsheet. In figure 6.2 I saved my GSS2000.por as a GSS2000.sav which will save me some time in processing this file in the future. There won't be a need to wait for conversion time.

Figure 6.2
Saving as a .sav file

Getting Around with SPSS Windows

By default, the window that appears is the "Data Editor" window. No matter what data file you open, the setting of this spreadsheet is always the same. Each of the columns represents a variable, such as the respondent's gender, age, or attitude about abortion. Each row represents a particular respondent. Thus each cell of the matrix stores an item of information about a person. In Figure 2, all the cells are empty.

Once you've instructed SPSS to open your data set, the original matrix will be filled with data, the way it is in Figure 6 below. Notice that the row just above the matrix now contains the names of the variables comprising the data set: hrs1, wrkgovt, and so forth. SPSS uses abbreviated labels, each no more than eight characters long.

Figure 6
A Full Data Matrix

Notice in Figure 6 the cell in the first row that links case number one (respondent number 1) to the variable labeled wrkgovt. The first respondent has a value of 2 on the variable wrkgovt. But what on earth does that mean?

This is where SPSS 11.0 is different from earlier versions. In addition to the "Data View" sub-window, the Data Editor window has a Variable View sub-window. You can simply click on "Variable View" at the bottom of the Data Editor window. In figure 7 below, you can see how this window is organized. This time, the rows represent the variables and the columns are the various attributes associated with each variable. The columns are as follows: Name, Type, Width, Decimals, Label, Values, Missing, Columns, Align, and Measure. Name is the abbreviated name of the variable (it is never longer than eight characters). The Type of the variable is often "numeric" but could be "string" if you wanted to input words as data instead of numbers. Label is the description of the variable and indicates more clearly what the question on the questionnaire was about. In the "Values" column you can find the values associated with each possible answer for each variable. Notice that you can increase the width of any of these columns on the variable view of the data editor. This feature is particularly useful if you want to read the variable label in its entirety.

Figure # 7
A Full Variable Matrix

You can find the meaning of a particular variable label in several ways. The first, and easiest, is by finding the variable name on the variable view. On the data view, you can double-click the variable name in the column heading and the variable view will open automatically and highlight the row of the variable you just double-clicked. Here's another way to learn about variables in the data set. Go to the Utilities menu above the data matrix and select the first option: Variables. See figure 8 below for an illustration. A list of all the variables and the way they were formatted will appear and you can simply select the variable of your choice from this window by clicking on it. Here I selected the divorce variable.

Figure # 8
Decoding the Variable Divorce

Variables are listed in the order they were imported from the GSS site. However, if you want to see them listed alphabetically, open the Edit menu and select the Options submenu. Then as in figure 9 below, select Alphabetical from the Variable List option in the General sub-window. Then click OK twice. Note: The list in the left column may consist of the variable names instead of the abbreviated labels, but you can change this easily. In the Edit menu, select Options. In the General tab, find the section on Variable Lists in the right-hand column. Click Labels. You'll have to reload the data set, but it will be worth the effort, because you'll be able to track down the abbreviated name you're looking for.

Figure # 9
Sorting the Variable List

Now reopen the variable information window from the Utilities menu. All variables are listed alphabetically. Notice the words next to Variable Label: "EVER BEEN DIVORCED OR SEPARATED". While this is still abbreviated, you can figure out that it represents whether this person has been divorced or separated. You can also view the value labels instead of the numeric values into which they have been coded. See figure 10 below. Simply make sure you are on the data view, select View from the menu and click on Value Labels. A check mark will appear next to this menu selection and you will be able to read directly what your respondents' answers were for each variable. For instance, we now can see that respondent number 1623 is female, 50 years old, and married to a man who is 51 years old. To turn it off, by the way, simply open View and click Value Labels again. Notice that the check mark indicates whether the feature is on or off.

Figure # 10
Viewing Variable Value Labels

In figure 11 you can see the numeric values "1" and "2" for divorce replaced by "Yes" and "No" respectively.

Figure # 11
Viewing Variable Value Labels on Data View

You can obtain the full wording most easily from the GSS Web site. The codebook index (by variable name) is located at http://www.icpsr.umich.edu/GSS/. Click on "Mnemonic". That will take you to a list of variables beginning with "A". You can see any other variable by simply clicking on the first letter of the variable you wish to find.

Let's take a close look at the variable abany, which we will use in our analysis further on. Now you can see more clearly what this variable represents. Respondents were asked a battery of questions concerning their attitudes toward abortion: specifically, the conditions (e.g., rape, danger of birth defects) under which they felt a woman should be able to obtain one legally. In this case, they were asked if they would support a woman's right to a legal abortion as a purely personal choice: "for any reason."

Besides presenting the actual wording of the question, this Web page also reports the answer categories and the results of several surveys that asked the question over the years. Notice that a 1 "punch" stands for saying "yes". Now we know that the first person in the data set feels that a woman should be able to choose an abortion for any reason.

Let's go back to variable abany. To find out what "0" means, let's learn how to examine variable codes within SPSS. Double-click abany in the column heading. The variable view window opens and the variable abany is automatically highlighted. Now click on the little square in the value label cell for abany as shown in figure 12 below.

Figure # 12
Viewing Decoding Value Labels on Variable View

As you can see in Figure 13, "0" stands for "NAP", which means "not applicable". In other words, this particular question was not asked of some respondents.

Figure # 13
Code Labels for Abany

Another method to learn about the value labels of a particular variable is to select Variables from the Utilities menu. A list of variables appears alphabetically, as you can see in figure 14 below. First you'll see that we have pretty much the same information we obtained before. Notice the column to the left, however. It's the beginning of a list of all the variables in the data set. (You can use the scroll bar to see the rest of the list.) Find the name of a variable you're interested in and click it. You'll instantly get the variable and value labels.

Again you can view all value labels for abany by simply selecting this variable. The advantage of this subcommand is that you also have the option to find the variable abany quickly on your SPSS data editor by clicking on Go To. The abany column will be selected and highlighted instantly on your data view window.

Figure # 14
Viewing Decoding Value Labels for Abany

Person 3, for example, was not asked this question. By asking different sets of questions of different people in the sample, the researchers can collect data for hundreds of variables without driving any of the respondents to suicide or homicide.

Frequency Distributions

Now that we've seen what the abbreviated variable labels and numerical code categories stand for, we're ready to examine some public opinion. Think about the question we've looked at so far. How do you suppose people in the United States feel about a woman's right to an abortion? That is to say, what percentage do you suppose said "yes" and what percentage said "no"? To start finding out, select Frequencies... from the Descriptive Statistics menu in the Analyze general menu (see Figure 15).

Figure # 15
Getting Frequency Distributions

This command will get you a list of variables to choose from, as illustrated in Figure 16.

Figure # 16
Choosing a Variable for Frequencies

Now you can double-click a variable label or else single-click it and then click the right-pointing triangular arrow. Either of these actions will move labels from the left-hand to the right-hand column. Figure 17 shows the results of three variables being selected this way.

Figure # 17
Selecting Frequency Variables

Next we click the OK button. SPSS will click and whirr as it determines the distributions of responses to each of the three variables in our example. It will then produce the frequency tables shown in Figure 18.

Figure # 18
Frequency Distribution Tables

Notice that SPSS has now opened a new window. The first was a data window and the new one is labeled Output. As you continue with SPSS, you'll often work back and forth between these two windows; often the program alternates them automatically.

The left-hand frame in Figure 18 presents an outline of the results. Click on any item in the outline to fill the right-hand frame with the data you've requested. Here is where you can easily copy and paste from SPSS to an MS Word document. Click on Frequency Table in the outline on the left-hand side. All the frequency tables are now selected. Select Copy objects from the Edit menu. Now open your Word document and select Paste from the Edit menu. Figures 19 and 20 illustrate this process. In this case I was only interested in copying and pasting the frequency table for abany.

Figure # 19
Selecting Frequency Distribution Tables

Figure # 20
Copying Frequency Distribution Tables

In Figure 18 the right-hand side presents two tables. The first one summarizes the three variables we chose originally. All we are told here, however, is the number of respondents with valid responses and those without. The second table gives the distribution of data for the question of whether a woman should have the right to an abortion for any reason. In addition to "yes" and "no", the table reports three other possibilities:

NAP: "Not applicable" (the question was not asked)

DK: Respondents who said they "Don't know"

NA: Respondents who were asked but gave "No answer"

In the second table's Frequency column you can learn how many respondents fall under each of the categories. The Percent column puts the information into a more useful form by showing the percentage represented by each category. The most useful column is Valid Percent. This column tells us that of the 1,768 respondents who gave a valid response, 39.9 percent said "yes" and 60.1 percent said "no". We might interpret these results by saying that opinions on this issue are almost evenly divided.
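The difference between the Percent and Valid Percent columns can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not what SPSS runs internally; the response list is a hypothetical toy sample (not GSS data), and the function name frequency_table is made up for this example.

```python
from collections import Counter

# Hypothetical toy responses, not GSS data. "NAP", "DK" and "NA" are the
# missing-data codes described above.
responses = ["yes", "no", "no", "NAP", "yes", "DK", "no", "NAP", "no", "NA"]
MISSING = {"NAP", "DK", "NA"}


def frequency_table(values, missing):
    """Return {category: (count, percent, valid_percent)}.

    Percent uses all cases as its base. Valid percent uses only the cases
    whose answer is not a missing code, and is None for missing categories.
    """
    counts = Counter(values)
    total = len(values)
    valid_total = sum(n for cat, n in counts.items() if cat not in missing)
    table = {}
    for cat, n in counts.items():
        percent = 100.0 * n / total
        valid = None if cat in missing else 100.0 * n / valid_total
        table[cat] = (n, percent, valid)
    return table


table = frequency_table(responses, MISSING)
for cat, (n, pct, valid) in table.items():
    valid_text = "" if valid is None else f"{valid:.1f}"
    print(f"{cat:<4} {n:>3} {pct:>6.1f} {valid_text:>6}")
```

In this toy sample, "yes" accounts for 20 percent of all ten cases but 33.3 percent of the six valid ones, which is why the Valid Percent column is usually the one to report.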

By scrolling down the window or using the outline in the left-hand frame, you can check the results for the other variables. For now let's move along to more complex analyses.

Cross-Tabulations

The frequency distributions we've just undertaken are called univariate analyses (analyzing one variable at a time). Now we'll turn to bivariate analyses (two variables at a time).

Let's stay with the issue of "abortion for any reason". We've seen that U.S. residents are about evenly divided on the issue. What do you suppose accounts for this difference? People often guess that women would be more likely than men to support abortion as a woman's right. Let's see how to determine the accuracy of that guess.

Return to the Descriptive Statistics menu, but choose Crosstabs this time. This brings you a somewhat different dialog box, as indicated in Figure 21.

Figure # 21
Crosstabs Dialog Box

We are now going to set up a percentage table involving two variables: abany and sex. The table will have both columns and rows. While there are many ways to construct such a table, we're going to assign the categories of sex (male and female) to the columns. Then we'll look at the opinions on abany within each of those categories. In the logic and language of SPSS, that makes abany the "row variable" and sex the "column variable". To assign categories, select variable labels from the list and drag them to the appropriate windows on the right. Figure 22 shows this step.

Figure # 22
Selecting Variables for the Crosstab

To find a variable label in the list, you can either scroll through the list or click any label in the list and then type the variable you want. It may take a little experimentation to discover how quickly you must type to have it work.

Thus far, we've told SPSS to organize the table like this:

              Men    Women
Approve
Disapprove

To complete our request, we have to tell SPSS how to percentage the data. In this case, we'll ask for the percentage of men who approve of abortion and the percentage who disapprove, with the two percentages totaling 100 percent. Then we'll ask for the corresponding percentages of women. In other words, ask SPSS to "percentage down" the columns. The Crosstab option provides a means for us to indicate that preference. Click the Cells button in the dialog box (See Figure 23).

Figure # 23
Specifying the Percentaging Method

When the dialog box opens, the Observed box will already be checked. Leave it that way. In the section on Percentages, click the Column box. That instructs SPSS to percentage down the columns. Click Continue to complete this dialog, and then click OK to launch the request for a crosstab. Once SPSS has completed the table, we'll be returned to the Output window, as in Figure 24.

Figure # 24
Crosstab of abany and sex
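The idea of "percentaging down" the columns can be sketched in Python as follows. This is an illustration under stated assumptions: the (sex, abany) pairs are a hypothetical toy sample rather than GSS data, and column_percentages is a name invented for this sketch.

```python
from collections import Counter

# Hypothetical toy cases as (sex, abany) pairs, not GSS data.
cases = [
    ("male", "yes"), ("male", "no"), ("male", "no"), ("male", "yes"),
    ("female", "yes"), ("female", "no"), ("female", "no"),
    ("female", "no"), ("female", "yes"), ("female", "no"),
]


def column_percentages(pairs):
    """Percentage down the columns: each column category sums to 100."""
    cell_counts = Counter(pairs)                       # (column, row) -> n
    column_totals = Counter(col for col, _ in pairs)   # column -> n
    return {
        (col, row): 100.0 * n / column_totals[col]
        for (col, row), n in cell_counts.items()
    }


pct = column_percentages(cases)
for col in ("male", "female"):
    for row in ("yes", "no"):
        print(f"{col:<6} {row:<3} {pct[(col, row)]:5.1f}%")
```

Each cell is divided by the total of its own column (the sex category), so the percentages for men add up to 100 and so do the percentages for women. That is what makes the two columns directly comparable.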

Let's see what the table tells us. We wanted to find out if men and women differed in their attitudes about whether a woman should be able to choose an abortion just because she wanted one. The table suggests that there's no appreciable difference. Virtually the same proportion of men (39.9%) and women (39.8%) say a woman should have the right to an abortion for any reason.

Let's try another variable that could affect people's attitudes toward abortion: political orientation. In the GSS, polviews represents a standard item that asks respondents to characterize their political views as something between "Extremely Liberal" and "Extremely Conservative". Figure 25 shows the impact of this variable on attitudes toward abortion.

Figure # 25
Crosstab of abany and polviews

Because there are so many categories for political views, you may have to use the scroll bar at the bottom of the window to move back and forth across the table. Notice that we've scrolled all the way to the right in Figure 25.

The impact of political views on abortion attitudes is pretty clear. Overall, liberals support abortion more than do conservatives. The only exception to the pattern is that people who are "Extremely Liberal" are less supportive than those who are "Liberal". This result appears a lot, perhaps because of the different ways people interpret the two political terms.

Recoding Variables

It's often useful to recode variables with many categories, reducing the number to something more manageable. In the present case, we might want to combine the categories in polviews to make three: "Liberal", "Moderate", and "Conservative."

We can combine categories by hand from the kind of table presented in Figure 25. For example, we can easily calculate that 447 of the respondents in the table considered themselves liberals (62 + 203 + 182). Of those, 247 supported a woman's right to an abortion for any reason (42 + 110 + 95). Dividing these two numbers tells us that 55 percent of the liberals supported abortion. A similar calculation tells us that 152 of the 553 conservatives (27 percent) were supportive. The 42-percent support among moderates fits neatly between the liberals and conservatives.
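The hand calculation above can be checked with a few lines of Python. The counts are the ones quoted in the paragraph; the variable names are just labels chosen for this sketch.

```python
# Counts quoted in the paragraph above (read from Figure 25).
liberal_totals = [62, 203, 182]   # respondents in the three liberal categories
liberal_yes = [42, 110, 95]       # of those, supporters of abortion "for any reason"

liberals = sum(liberal_totals)            # 447
liberal_supporters = sum(liberal_yes)     # 247
liberal_pct = 100 * liberal_supporters / liberals

# Conservatives: 152 supporters out of 553 respondents.
conservative_pct = 100 * 152 / 553

print(liberals, liberal_supporters)                   # prints: 447 247
print(round(liberal_pct), round(conservative_pct))    # prints: 55 27
```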

Combining categories like this makes it easier to use the variable in further analyses. However, we should have SPSS create a new, recoded variable so that we don't have to undertake the job by hand each time. To do this, we must first return to the Data window. If you're in the Output window, you can simply click the Data window icon in the task bar at the bottom of your screen or select the SPSS Data Editor item from the Window menu (see Figure 26 below).

Figure # 26
Switching to Data View

Once you've returned to the Data window, click the Transform menu and move your pointer to the Recode option. When you do that, you'll be presented with another choice, as Figure 27 shows.

Figure # 27
Requesting a Recode

SPSS offers two options for recoding: Either it will modify the data contained under the existing variable label (Same Variables) or it will create a new variable for the modified results (Different Variables). Choose Different Variables, because the first option will destroy the original data.

Next you'll see a large dialog box like the one in Figure 28.

Figure # 28
The Recode Dialog Box

Initially the right-hand frame will have nothing in it. To create the situation shown in Figure 29:

  1. Select polviews in the variable list and move it to the center frame by double-clicking it or using the triangular arrow.
  2. Type polviewr in the space under Output Variable Name and click Change.
  3. Type in a descriptive label to identify what polviewr stands for.

Figure # 29
The Completed Recode Dialog Box

To continue the process, click Old and New Values. This will bring you the dialog box shown in Figure 30.

Figure # 30
Specifying How to Recode Categories

To tell SPSS how to create polviewr from polviews, we identify values of polviews and indicate what values they should get in polviewr. Let's start by creating a "Liberals" category that includes everyone with a "1", "2", or "3" on polviews. We'll give the new category the value "1."

In Figure 31, we've chosen the "Range" option and indicated that anyone with a value of "1" through "3" on polviews should be assigned a "1" on polviewr. Make sure you see where those instructions are entered in the dialog box.

Figure # 31
Creating

When you click the Add button, the transformation instruction is transferred to the field on the right-hand side of the dialog box, as you can see in Figure 32.

Figure # 32
Renumbering the

We'll use a different option to create a new "Moderate" category. As you recall, they were scored "4" on polviews. We'll give them a "2" in polviewr by entering the old and new values in the Old Value and New Value fields. When we click Add again, the new instruction is added to the field. Now take a moment to figure out how you would create a "Conservative" category, transforming scores of 5, 6, and 7 on polviews into a score of 3 on polviewr. Once you've done that, you should have the dialog box shown in Figure 33. All that remains now is to click Continue, which will return you to the earlier dialog box, and then click OK.

Figure # 33
The Recoding Instructions Completed
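The recoding rules entered in the dialog box amount to a simple mapping, sketched here in Python. This only illustrates the logic and is not SPSS syntax; the function name recode_polviews and the toy input values are hypothetical, and returning None for codes outside 1 through 7 is an assumption made for this example.

```python
def recode_polviews(value):
    """Collapse the seven polviews codes into three polviewr codes."""
    if 1 <= value <= 3:
        return 1      # Liberal
    if value == 4:
        return 2      # Moderate
    if 5 <= value <= 7:
        return 3      # Conservative
    return None       # assumption: any other code gets no recoded value


# Hypothetical toy values, not GSS data.
polviews = [1, 4, 7, 3, 5, 2, 6, 8]

# Build a new list instead of overwriting the original, which mirrors the
# "Different Variables" choice made earlier.
polviewr = [recode_polviews(v) for v in polviews]
print(polviewr)   # prints: [1, 2, 3, 1, 3, 1, 3, None]
```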

Let's tidy up our new variable. First return to the Data View window. Next, scroll across the list of variables to the far-right end. SPSS places each new variable at the end of all the other variables. Since you just created a new variable called polviewr, SPSS created a new column located last on your spreadsheet. When you find polviewr, double-click the variable label at the head of the column. This will open up the Variable View window and polviewr will be automatically selected at the bottom of your variable list. See Figure 34 below.

Figure # 34
Finding Polviewr in Variable View

Click on the right-hand side of the cell located in the Decimals column and polviewr row. A little square located in the cell will appear as shown in figure 35. You can then reduce to 0 the number of decimals for each value of polviewr. In other words, you can convert each 1.00 score to simply 1.

Figure # 35
Changing Decimals Format for Polviewr in Variable View

Now click on the right-hand side button in the Values column and the polviewr row. A Value Labels dialog box will appear as shown in Figure 36. Now give names to the new category values:

  1. Type "1" in the Value field
  2. Type "Liberal" in the Value Label field
  3. Click the Add button

Repeat the process to assign "Moderate" to the value of "2" and "Conservative" to "3", being sure to click Add each time. When you're done, the dialog box should look like Figure 36. Click Continue, then OK.

Figure # 36
Assigning Value Labels to polviewr

Then click on the right-hand side button in the Missing column and the polviewr row. The dialog box shown in Figure 37.1 will appear. Type 9 in the first space available under Discrete Missing Values. You have just indicated to SPSS that any value 9 in the data set for polviewr should be considered a "missing answer" and removed from statistical computation.

Figure 37.1
Assigning Missing Values to polviewr
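What value labels and missing values do to the data can be sketched in Python. This is an illustration only: the dictionary and set mirror the entries typed into the two dialog boxes, while the function name describe and the toy data are hypothetical.

```python
# The labels and the missing code entered in the dialog boxes above.
VALUE_LABELS = {1: "Liberal", 2: "Moderate", 3: "Conservative"}
MISSING_VALUES = {9}


def describe(value):
    """Return the label for a code, or None if the code is declared missing."""
    if value in MISSING_VALUES:
        return None
    # Fall back to the raw code if no label was assigned.
    return VALUE_LABELS.get(value, str(value))


# Hypothetical toy values, not GSS data.
polviewr = [1, 3, 9, 2, 1]
labels = [describe(v) for v in polviewr]

# Cases with a missing code are left out of any computation.
valid = [label for label in labels if label is not None]

print(labels)      # prints: ['Liberal', 'Conservative', None, 'Moderate', 'Liberal']
print(len(valid))  # prints: 4
```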

Finally you can change the measurement type for polviewr as shown in figure 37.2. I selected Ordinal for polviewr since the values can be ranked from low to high level of conservatism.

Figure 37.2
Assigning Measurement Type to polviewr

Now when you select Analyze/Descriptive Statistics/Frequencies and scroll through the list of variables, you'll find a new entry in the list: polviewr. Choose it to see the frequency distribution generated by our new categories (see Figure 38).

Figure # 38
Frequency Distribution for polviewr

Since we have gone to all this trouble to make our analysis simpler, let's see if it worked. Let's use polviewr to reexamine the relationship between political orientations and attitudes toward abortion. Use Analyze/Descriptive Statistics/Crosstabs to create a table with abany and polviewr. Figure 39 illustrates what you should get.

Figure # 39
Crosstab of abany and polviewr

Notice how much easier it is to read this table, compared with the one presented in Figure 25. We see that 55 percent of the liberals, 42 percent of the moderates and 28 percent of the conservatives support a woman's right to an abortion for any reason. (It's good to round off the decimal points in percentages like these, since they're based on samples, which only provide estimates of populations in the first place.)

Multivariate Tables

Bivariate tables are typically only the beginning of quantitative data analysis. For example, you might want to see if the observed relationship between politics and abortion holds equally for men and women. SPSS makes it a simple matter to satisfy your curiosity about such matters.

Return to Analyze/Descriptive Statistics/Crosstabs and specify a third variable as shown in Figure 40.

Figure # 40
Trivariate Table Request

Notice that we've simply transferred the third variable, sex, into the bottom field in the dialog box. Press OK to see the result, illustrated in Figure 41.

Figure # 41
Table of abany by polviewr by sex
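Adding a layer variable simply repeats the column percentaging separately within each category of that variable. The Python sketch below illustrates the idea under stated assumptions: the (sex, polviewr, abany) records are a hypothetical toy sample, not GSS data, and layered_column_percentages is a name made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy cases as (sex, polviewr, abany) records, not GSS data.
cases = [
    ("male", "Liberal", "yes"), ("male", "Liberal", "no"),
    ("male", "Conservative", "no"), ("male", "Conservative", "no"),
    ("female", "Liberal", "yes"), ("female", "Liberal", "yes"),
    ("female", "Conservative", "yes"), ("female", "Conservative", "no"),
]


def layered_column_percentages(records):
    """Build one column-percentaged table per layer category."""
    cells = Counter(records)  # (layer, column, row) -> n
    column_totals = Counter((layer, col) for layer, col, _ in records)
    tables = defaultdict(dict)
    for (layer, col, row), n in cells.items():
        tables[layer][(col, row)] = 100.0 * n / column_totals[(layer, col)]
    return dict(tables)


tables = layered_column_percentages(cases)
for layer in ("male", "female"):
    print(layer)
    for col in ("Liberal", "Conservative"):
        # Empty cells are absent from the dictionary, so default to 0.0.
        yes_pct = tables[layer].get((col, "yes"), 0.0)
        print(f"  {col:<12} yes: {yes_pct:5.1f}%")
```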

In a sense, this new table splits the one shown in Figure 39 into two parts. The top half shows the relationship between polviewr and abany for men; the bottom half shows the same relationship for women. We can see immediately that the original relationship is replicated for each of the gender categories.

In the far-right column, the summary statistics show you the relationship between gender and support for abortion. Overall (i.e., forgetting about political orientations), an equal 41 percent of men and women support a woman's right to an abortion for any reason, an interesting similarity. (Notice that I've rounded off the figures, 40.5% and 40.8%, presented in the table.) It seems that there is no sex effect on abany. The sex of the respondent does not matter for this GSS question.

Comparing men and women in the other columns of the table tells us that sex has little impact no matter what a person's political orientation is. Women are more supportive among liberals, while men are slightly more supportive among moderates and among conservatives. None of the differences are very large, however.

SPSS allows you to go beyond trivariate tables, though they grow increasingly difficult to read and analyze. To experiment with this possibility, click the Next button near the bottom of the Crosstabs dialog box to add new Layers of variables to the table.

Tests of Statistical Significance

In the previous example, I casually remarked that the percentage differences were not very large. This was a subjective assessment of the substantive significance of the differences.

As you know, tests of statistical significance can determine the likelihood that relationships observed in a sample are merely an artifact of sampling error rather than a reflection of a real difference in the population from which the sample was drawn. Let's take a look at how SPSS offers us the use of those tests.

Return to the Crosstabs dialog box, via Analyze/Descriptive Statistics. At the bottom of the box, click a button marked Statistics. Figure 42 illustrates the results.

Figure # 42
Choice of Statistics in a Crosstabs Dialog Box

As you can see, SPSS offers several summary statistics, three of which you'll recognize from this textbook: chi square, lambda, and gamma. Recall that chi square is appropriate to nominal variables such as abany and sex, so let's use them to see how we can use SPSS to work with chi square.

Click the Chi-square box. Then click Continue and enter abany and sex in the appropriate places in the Crosstabs dialog box. In addition to the regular percentage table, SPSS now provides an additional table, shown in Figure 43.

Figure # 43
Chi Square for abany and sex

If you've had a statistics course, you'll recognize many of the tests presented in this table. For our purposes, let's focus on the first row of the results, the "Pearson Chi-Square". The third column tells us the probability that sampling error alone could have generated a relationship as strong as the one we've observed, if men and women in the whole population were exactly the same in their attitudes toward abortion. Specifically, it tells us that the probability is .972, or 97 chances in 100. This probability is extremely high, so sampling error could easily account for the tiny difference we observed. Thus, the chi square test is consistent with what we had concluded subjectively from our crosstabulations: we have no evidence that men and women differ in their support for a woman's right to an abortion.
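To make the Pearson chi square a little less of a black box, here is a minimal Python sketch of the calculation for a table of counts. The function names and the two small tables are my own inventions for illustration (they are not the GSS figures), and the probability function handles only the one-degree-of-freedom case, which is what a two-by-two table such as abany by sex has.

```python
import math

def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

def p_value_df1(stat):
    """Upper-tail probability for chi square with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical counts: rows are answers (yes, no), columns are two groups.
no_difference = [[40, 40], [60, 60]]
clear_difference = [[10, 20], [20, 10]]

for name, counts in (("no difference", no_difference),
                     ("clear difference", clear_difference)):
    stat, df = pearson_chi_square(counts)
    print(name, round(stat, 3), df, round(p_value_df1(stat), 3))
```

When the two groups answer in identical proportions, the observed and expected counts match, the statistic is zero, and the probability is 1. The further the groups diverge, the larger the statistic and the smaller the probability.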

The relationship between abany and polviewr was much stronger. Let's see how chi square evaluates that relationship. Repeat the above procedure, changing sex to polviewr. Notice that you don't have to select Chi-square again, any more than you have to select Columns. SPSS maintains those specifications until you shut down the program. When you start it up again, you'll have to specify such preferences again. And, of course, you can turn them off any time you no longer desire them.

Figure # 44
Chi square for abany and polviewr

See Figure 44 above for the chi square evaluation of abany and polviewr. Notice that the significance in this case is reported as ".000". SPSS presents only the first three decimal places in this calculation. Hence, the likelihood of the observed relationship being simply a product of sampling error isn't exactly zero (it could happen), but the chances are not very high that it did. Specifically, the probability is less than .001, or less than one chance in a thousand, which is a commonly used standard for statistical significance. Thus, we conclude that the relationship we've observed in this carefully selected sample very likely represents something that exists in the larger population.
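The point about ".000" is purely one of display, as this small Python sketch suggests. The function name three_decimal_display is hypothetical, and the value 0.0004 is an arbitrary example of a small probability, not the figure SPSS computed for this table.

```python
def three_decimal_display(p):
    """Show a probability to three decimal places, without the leading zero."""
    return f"{p:.3f}".lstrip("0")

print(three_decimal_display(0.972))   # prints .972
print(three_decimal_display(0.0004))  # prints .000
```

A probability that prints as ".000" is therefore best reported as "less than .001" rather than as zero.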

Correlation and Regression

Thus far we've been examining nominal and ordinal data, which constitute the bulk of social science data. SPSS can also help you work with interval and ratio data.

For example, you may have heard that highly educated people tend to have fewer children than do those with less education. Let's use SPSS to see if it's so. In the GSS, these variables are educ and childs. Under Analyze, select Correlate and, when asked, Bivariate. (SPSS can undertake more complex correlational analyses, but we'll keep it simple for this introduction.) In the Correlations dialog box, select educ and childs and click OK. That will produce the result shown in Figure 45.

Figure # 45
Pearson's Product Moment Correlation

As you can see, there is a correlation of -.210 (or a "negative correlation of .210") between the two variables. The negative correlation means that as the years of education increase, the number of children decreases.

Of course, this analysis cannot determine the causal direction, so we could also say that as the number of children increases, the amount of education completed decreases. Both interpretations make sense and probably apply in some cases. Some young people have to cut their education short to accommodate the demands of parenthood, and those who keep going to school may have to delay parenthood and have fewer children once they get started.

Whatever the explanation for the relationship, SPSS informs us that the correlation is significant at the 0.01 level. In other words, sampling error could account for a correlation like this one less than once in a hundred times.

Although we entered only two variables in this analysis, SPSS will accept as many at a time as you want and will create a correlation matrix in which every variable is correlated with every other variable. Experiment with this possibility.
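For readers curious about what lies behind the Correlations table, here is a minimal Python sketch of Pearson's product-moment correlation and of a correlation matrix built from it. The function names and the five pairs of values are invented for illustration and are not GSS data. The sketch skips any pair in which either value is missing (None), which is an assumption of this example rather than a statement about SPSS.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists, skipping missing pairs."""
    pairs = [(x, y) for x, y in zip(xs, ys)
             if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Correlate every variable with every other variable."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}

# Invented toy values, NOT real GSS responses.
data = {
    "educ": [8, 12, 12, 16, 20],
    "childs": [4, 3, 2, 1, 0],
}

matrix = correlation_matrix(data)
print(round(matrix["educ"]["childs"], 3))
```

As in the SPSS output, the matrix is symmetric, and each variable correlates perfectly with itself along the diagonal.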

Regression analysis builds on the logic of correlation and creates equations that predict values of one variable based on values of others. Here's how we could represent the relationship between childs and educ as a regression equation. Under Analyze, select Regression, then choose Linear from among the alternatives offered. This will present you with the dialog box shown in Figure 46.

Figure # 46
Regression Dialog Box

Let's use the logic of accounting for the number of children people have; thus, childs is our dependent variable, and educ is the independent variable we'll use to account for differences in the number of children. Enter the two variables into the dialog box as shown in Figure 46 above. Click OK to get the result shown in Figure 47.

Figure # 47
Linear Regression Predicting childs with educ

SPSS will present you with three tables of calculations, but here we are only interested in the third one, Coefficients. In fact, we're only interested in the first column of this table, Unstandardized Coefficients. The first of these, the Constant, represents the value of the dependent variable (number of children) when the value of the independent variable (years of education) is zero. Statisticians sometimes refer to this as the y-intercept or the point where the line crosses the y-axis, when the regression line is plotted on a graph.

The B value associated with the independent variable (-.121) indicates how much the dependent variable changes with each added unit of the independent variable. In our example, it tells us what change in the number of children we should expect for each added year of education. Stated as an equation, the regression looks like this:

childs = 3.407 - (.121 × educ)

Suppose a person has 10 years of education. We would predict she or he has

3.407 - 1.21 = 2.197 children

For college graduates with 16 years of education we'd predict they have

3.407 - 1.936 = 1.471 children

Clearly, these estimates represent statistical averages, because no one can have 2.197 or 1.471 children. Still, if you were to bet on the number of children people had and knew only their education, this equation would be your best guide for betting. If you could make a lot of bets on this basis, you'd be a winner overall.
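The two predictions above can be reproduced with a few lines of Python. This is only a sketch: the constant and the coefficient are the values reported in the Coefficients table, while the names CONSTANT, B_EDUC, and predict_childs are my own.

```python
CONSTANT = 3.407   # y-intercept from the Coefficients table
B_EDUC = -0.121    # unstandardized coefficient (B) for educ

def predict_childs(educ_years):
    """Predicted number of children for a given number of years of education."""
    return CONSTANT + B_EDUC * educ_years

for years in (10, 16):
    print(years, round(predict_childs(years), 3))
# prints:
# 10 2.197
# 16 1.471
```

Changing the value passed to predict_childs lets you see the prediction for any level of education, always keeping in mind that these are averages rather than forecasts for individuals.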

To explore regression further, try adding another independent variable. SPSS will provide you with a new y-intercept and coefficients for each of the independent variables. Be sure to interpret positive and negative signs correctly.
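Before you add predictors, it may help to see where a y-intercept and a coefficient come from in the simplest case. The following Python sketch fits a least-squares line with a single independent variable. The function name fit_line and the four pairs of values are invented for illustration (they are not GSS data), and the sketch does not cover the multiple-predictor case that SPSS handles for you.

```python
def fit_line(xs, ys):
    """Least-squares (intercept, slope) for one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Invented toy values, NOT real GSS responses.
educ = [8, 12, 16, 20]
childs = [4, 3, 2, 1]

intercept, slope = fit_line(educ, childs)
print(intercept, slope)
```

A negative slope here means the same thing as the negative B in the SPSS table: each added year of the independent variable lowers the predicted value of the dependent variable.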

Creating Indexes

Chapter 6 discussed the creation of composite measures, such as indexes and scales. This section looks briefly at how to use SPSS to create a simple index.

Without reviewing the logic of index construction, let's create an index of sexual permissiveness including the following GSS variables:

premarsx: sex before marriage
xmarsex: sex with person other than spouse
homosex: homosexual relations

In each of these items, respondents were asked whether the action was

  1. Always wrong
  2. Almost always wrong
  3. Sometimes wrong
  4. Not wrong at all

Given the format of these three items, we can create a composite index quite simply. Although the values 1-4 used to represent the answers to these questions are merely labels (just as we used "1" for male and "2" for female), we can in this case take advantage of their numerical quality. In each of these items, the higher the numerical code, the higher the level of sexual permissiveness. If we add the values respondents received on the three items, the possible totals range from 3 to 12, with 12 representing the highest and 3 the lowest degree of sexual permissiveness.

We can now generate the index by using the Transform/Compute menu option, as illustrated in Figure 48. Enter the information by typing or by selecting the variable names from the list and clicking the plus sign in the keypad provided in the dialog box (see Figure 48).

Figure # 48
Adding the values of premarsx, xmarsex, and homosex

When you're through, click OK in the dialog box. SPSS will create a new variable, sexperm, in your data set and will assign the appropriate values to each of the respondents. In the Data window, scroll to the far right and find the new variable in the last column. Scroll up and down to see the values assigned to respondents. Those with no values in the new column were missing data on one or more of the three items used to construct it.
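The logic of this additive index can be sketched in a few lines of Python. The function name additive_index and the three respondents are invented for illustration. The sketch leaves the index empty (None) whenever any of the items is missing, which is an assumption of this example.

```python
def additive_index(*item_scores):
    """Sum the item codes; return None if any item is missing."""
    if any(score is None for score in item_scores):
        return None
    return sum(item_scores)

# Invented toy respondents, NOT real GSS responses.
respondents = [
    {"premarsx": 4, "xmarsex": 4, "homosex": 4},     # most permissive
    {"premarsx": 1, "xmarsex": 1, "homosex": 1},     # least permissive
    {"premarsx": 3, "xmarsex": 1, "homosex": None},  # one item missing
]

for r in respondents:
    r["sexperm"] = additive_index(r["premarsx"], r["xmarsex"], r["homosex"])
    print(r["sexperm"])
# prints:
# 12
# 3
# None
```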

For a more comprehensive view of the new variable, run the frequency distribution for sexperm (Figure 49).

Figure # 49
The Frequency Distribution for sexperm

Having created a composite measure such as this one, it's always good to validate the scores if possible. That is, if the index scores truly distinguish levels of sexual permissiveness, then those scores should predict the answers people gave to other questions. For example, we might wonder if attitudes toward abortion are related to sexual permissiveness. We can find out by running a cross-tabulation of the index and, say, abany.

The result of this validation effort is presented in Figure 50. Notice that this table uses a somewhat different format than those we've created earlier. Given the large number of categories comprising sexperm, it's difficult to fit the table on the computer screen (and in this book). Thus, I have made sexperm the row variable, made abany the column variable, and requested that the table be percentaged by row rather than column. Thus we read this table "down", whereas we've been reading earlier ones "across".

Figure # 50
Validating the sexperm Index

The relationship between these two variables is extremely strong and consistent. Of those with a score of 3 on the index (representing the lowest level of sexual permissiveness), only 19 percent support a woman's right to an abortion for any reason. This percentage increases steadily as index scores increase, reaching 67 percent among those with a score of 12 on the index.
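Percentaging by row, as described above, can be sketched in Python as follows. The function name row_percentages and the seven records are invented for illustration; they are not the GSS figures reported in Figure 50.

```python
from collections import Counter

def row_percentages(records, row, col):
    """Percentage distribution of `col` within each category of `row`."""
    counts = Counter((r[row], r[col]) for r in records
                     if r[row] is not None and r[col] is not None)
    row_totals = Counter()
    for (rw, _), n in counts.items():
        row_totals[rw] += n
    table = {}
    for (rw, c), n in counts.items():
        table.setdefault(rw, {})[c] = round(100.0 * n / row_totals[rw], 1)
    return table

# Invented toy records, NOT real GSS responses.
records = [
    {"sexperm": 3, "abany": "yes"},
    {"sexperm": 3, "abany": "no"},
    {"sexperm": 3, "abany": "no"},
    {"sexperm": 3, "abany": "no"},
    {"sexperm": 12, "abany": "yes"},
    {"sexperm": 12, "abany": "yes"},
    {"sexperm": 12, "abany": "no"},
]

table = row_percentages(records, row="sexperm", col="abany")
for score in sorted(table):
    print(score, table[score])
```

Because each row sums to 100 percent, validating the index amounts to comparing the "yes" percentage from one index score to the next.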

Creating an index from variables that do not permit such a simple addition of code values is a little more involved. To illustrate, let's create an index of where respondents stand on the issue of guns. Two items in the GSS are relevant:
gunlaw: favor or oppose gun permits

  1. Favor
  2. Oppose

owngun: have gun in home

  1. Yes
  2. No

It makes sense that those who have a gun and oppose requiring permits for owning guns are the most pro-gun, while those without a gun of their own and who favor gun permits would be the most anti-gun. Notice, however, that the pro-gun position is represented by a "2" on gunlaw and a "1" on owngun. Thus, we can't simply add the values. Here's how to generate a simple index from these two items.

Let's create a new variable, progun, for which higher scores indicate greater support for guns. To start this process, return to Transform/Compute. Type in the Target Variable and give everyone a starting score of "0", as in Figure 51. Click OK to create the variable. Then return to Transform/Compute and change the "0" to "progun + 1" as indicated in Figure 52.

Figure # 51
Initializing progun

Figure # 52
Adding a Point to progun

We're not going to add a point to everyone's index score, however. Click the If button near the bottom of the dialog box, so we can specify the conditions under which we want to add a point. Next, click the button beside the phrase "include if case satisfies condition". Then create the condition shown in Figure 53. By doing this, we're telling SPSS to give an additional point to anyone who said they oppose gun permits (i.e., a "2" on gunlaw).

Figure # 53
Adding a Point for Opposing Gun Permits

Click Continue to return to the earlier dialog box. Then click OK to instruct SPSS to take the action specified. When SPSS tells you that you're about to change an existing variable, say "yes."

Select Transform/Compute again. Notice that the earlier instruction to add a point is still there. Leave it, but click If in order to modify the condition. Change it to specify those who said they owned a gun ("1" on owngun) by indicating "owngun = 1" as the condition. Click Continue, then OK, then "yes", as before. Those who had a score of 0 for favoring gun permits will now get 1 point (for a total of "1") if they own a gun; they still have 0 points if they don't have a gun. Those who scored a point for opposing gun permits will get another point if they own a gun (a total of "2") but will stay at 1 point if they don't have a gun. The resulting index is made up of the scores "0", "1", and "2".

There's only one problem with the index as it stands. Since everyone started with 0 points, those who didn't answer one or both of these questions will end up with a score of zero, thus seeming to oppose guns. The final step in creating this index involves culling out those with missing data.

First, let's create a "missing data" code. We'll use "99". Return to Transform/Compute. In the first dialog box, type "progun = 99". Click If to specify the condition "MISSING(gunlaw)", as shown in Figure 54. To build it, first select the "MISSING(variable)" function, then select gunlaw and click on the arrow. Click Continue and OK, then repeat the procedure for "MISSING(owngun)".

Figure # 54
Missing Data as a Condition

If you examine the response possibilities for owngun, however, you'll find that 23 people refused to answer and were coded "3". Return to Transform/Compute and assign an index value of 99 to anyone with "owngun = 3".

As a final step, we're going to recode the 99. Select Transform/Recode, but this time choose Same Variables. Once you reach the dialog box, convert the 99 to a SYSMIS as illustrated in Figure 55. Enter 99 as the old value, click 'system-missing' then click the Add button.

To complete the action, click Continue and OK. The index is now complete. You can check it out by running Analyze/Descriptive Statistics/Frequencies. To reassure yourself further, run a cross-tabulation between the two items to verify that the correct number of people received each of the scores.

Figure # 55
Converting 99 to SYSMIS
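Taken together, the steps for building progun amount to the following logic, sketched here in Python. The function name progun_score is my own, and None stands in for the system-missing value. Instead of assigning 99 and recoding it afterward, the sketch checks for missing answers first, which produces the same final scores.

```python
def progun_score(gunlaw, owngun):
    """Score 0-2 for support of guns; None when the answers are unusable."""
    # Missing answers, and the "refused" code 3 on owngun, get no score.
    if gunlaw is None or owngun is None or owngun == 3:
        return None
    score = 0
    if gunlaw == 2:   # opposes gun permits
        score += 1
    if owngun == 1:   # has a gun in the home
        score += 1
    return score

# Every combination of valid answers, plus two unusable cases.
for gunlaw, owngun in [(1, 2), (1, 1), (2, 2), (2, 1), (None, 1), (2, 3)]:
    print(gunlaw, owngun, progun_score(gunlaw, owngun))
```

Listing the combinations this way is the same check the cross-tabulation of gunlaw by owngun gives you: each cell of that table should correspond to exactly one index score.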

Graphics

With the improvement of computer graphics, SPSS now offers many options for presenting data in nontabular formats. Let's explore a few of these, beginning with simple frequency distributions.

Figure 56 presents the distribution of GSS data on religious affiliation (relig) as a pie chart. You can create this by (1) selecting Pie under Graphs, (2) choosing Summaries for Groups of Cases, then (3) specifying relig as the variable to portray. Select "% of cases". Before clicking OK, click on the Titles button and type the title of your graph in the dialog box. Here I typed "Pie Chart of Religious Affiliation".

Figure # 56
Pie Chart of Religious Affiliation

Figure 57 shows the results of this operation. As you can see, the pie chart is small and displays every religious category. In this form it is not very useful and requires some simple formatting: we need to collapse the slices that are too small to read into one larger slice. Double-click on the graph in your output window and a graph dialog box will appear. Select Options from the Chart menu. A Pie Options dialog box appears, as shown in Figure 58. Select "Collapse (sum) slices less than 5%" (note that you can change this percentage if you want to include more categories under this new collapsed one). In this same dialog box, select Percents so that the percentage of each slice is indicated on the graph. Click OK and close the SPSS Chart Editor window.

Figure # 57
Pie Chart of Religious Affiliation in Output Window

Figure # 58
Formatting Pie Chart of Religious Affiliation
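The collapsing rule itself is simple, as this minimal Python sketch suggests. The function names and the counts are invented for illustration (they are not the GSS distribution of relig), and the sketch only mimics the idea of summing slices below a threshold; it does not draw a chart.

```python
def collapse_small_slices(counts, threshold_pct=5.0, label="Other"):
    """Sum every category below `threshold_pct` percent into one slice."""
    total = sum(counts.values())
    collapsed = {}
    other = 0
    for category, n in counts.items():
        if 100.0 * n / total < threshold_pct:
            other += n
        else:
            collapsed[category] = n
    if other:
        collapsed[label] = collapsed.get(label, 0) + other
    return collapsed

def percents(counts):
    """Percentage of cases in each slice, rounded to one decimal place."""
    total = sum(counts.values())
    return {k: round(100.0 * v / total, 1) for k, v in counts.items()}

# Invented toy counts, NOT the real GSS distribution.
relig_counts = {"Protestant": 50, "Catholic": 25, "None": 15,
                "Category D": 4, "Category E": 3, "Category F": 3}

slices = collapse_small_slices(relig_counts)
print(percents(slices))
```

Raising or lowering threshold_pct plays the same role as changing the percentage in the Pie Options dialog box.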

Figure 59 shows the result of this formatting procedure. Only the "Catholic", "None", and "Protestant" slices remain unchanged. All other categories are collapsed under the "Other" pie slice. There are many options to explore, and I invite you to experiment on your own in order to polish the visual presentation of statistics in your papers. When you are ready to import an SPSS graph into your paper, simply select the graph from the outline menu on the left side of the output window. A frame around the graph will indicate that you have selected it. Then choose "Copy Objects" from the Edit menu. Open your Word document and paste the graph wherever you want.

Figure # 59
Copying Pie Chart of Religious Affiliation

If you're on a diet that rules out pies, see Figure 60 for a bar graph of relig.

Figure # 60
Bar Graph of relig

Ratio variables, such as the number of years of education, might be presented as line graphs. See Figure 61.

Figure # 61
Line Graph of educ

These are just a few of the graphic options available to you in SPSS. Experiment with them to find the form of presentation most appropriate to your purposes.

Making Copies of Results for a Paper

Often, you will use SPSS to undertake quantitative analyses for a term paper, thesis, or other project. Although you can retype the results of SPSS into your paper, you can also take advantage of some energy-saving options. Depending on your word-processing system, you may have to experiment a bit.

As shown earlier, it is very easy to copy and paste from SPSS output to Word documents. Simply make sure you have selected the objects you want to copy (which should then be framed), use the Edit/Copy Objects command in SPSS, and then Paste into the Word document of your choice.

Though the easiest strategy is to copy and paste from SPSS to Word, you can also export your statistical results. To try making a hard copy of a graph, create the pie chart for polviewr. Click the resulting graph. As explained above, you'll see a box appear around it, indicating that it has been selected by the computer. Then in the File menu select Export. Figure 62 illustrates the resulting dialog box.

Figure # 62
Export Dialog Box

You have several options here. You can export your output document with charts, without charts, or exclusively the charts of your output window. For our purposes, export the Chart Only. You can then either change the name of the export file you are going to create or accept (and remember) the name and location SPSS has proposed. Again, for present purposes, choose to export the Selected Objects and choose JPEG File (*.JPG) as the export format. Once you've done all this, click OK.

Run your word-processing system and open the document that desperately needs this graph. Click where you want the graph and select Picture and then From File from the Insert menu (this procedure may be different if you are using a word processor other than MS Word). Browse your computer until you find your JPEG file. Remember to change the Files of Type option to All Files so that you can view all documents and not exclusively Word documents.

To try making a hard copy of a table, create a frequency distribution for gunlaw. You can export tables with the same procedure you used to export graphs. However, you will lose some of the formatting of the tables you export. I suggest that you choose the HTML file (*.htm) option under Export Format, since this format best preserves the layout of your SPSS tables. Then open your Word document, select File from the Insert menu, and browse until you find your output file. You should be rewarded with something like the table I've put in Figure 63.

Figure # 63
Text Version of gunlaw Frequency Distribution

In all this, you may also want to take advantage of SPSS's multitude of table formats. To explore these, choose Edit/Options in SPSS and click on the Pivot Tables tab. Once there, click the various options under TableLook and SPSS will give you a sample layout in the field to the right of the list. If you find a format that interests you, leave it highlighted when you close the dialog box and then create a new table. It will be done in the format you've specified.

Shutting Down

As much as you may come to love SPSS, you'll have to quit the program eventually. Go to the File menu and select Exit. SPSS will respond with a question that asks whether you want to save the Output file you've created. If you give it a name and a disk location for saving it, you'll be able to open it later on and retrieve any results created in your analyses. If you've just been practicing SPSS, you'll probably want to say "No".

If you've changed the data set (for example, by creating a recoded variable), SPSS will ask if you want to save the changes. Unless you want to get rid of the changes, say "Yes". However, you should only alter the data file if you have permission to do that. If, for example, you're sharing a file with others in your class, it may not be appropriate for you to save your changes. Discuss this with your instructor if in doubt.

SPSS will now close. Have fun.
