Regression and Factor Analysis
Please use the attached data to tackle the case "Customer Satisfaction at Haver & Boecker" and answer the following questions:
- Using regression analysis, locate those variables that best explain the customers' overall satisfaction. Evaluate the model fit and assess the impact of each variable on the criterion variable. Remember to use collinearity diagnostics.
- Determine the factors that characterize the respondents by means of a factor analysis. Consider the following issues:
- Use the factor scores and regress the customers' overall satisfaction (overall) on these.
Solution
/* Generic import example: SHEET and GETNAMES are separate statements */
proc import datafile="c:\myfiles\Accounts.xls"
out=sasuser.accounts;
sheet="Prices";
getnames=no;
run;
proc print data=sasuser.accounts(obs=10);
run;
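/* First attempt at importing the case data into the WORK library */
proc import datafile="C:/Users/Mohamed/Desktop/data.xls"
DBMS=excel
out=Work.data ;
run;
/* Print the first 10 rows of the data set that was just imported */
proc print data=Work.data(obs=10);
run;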
/* An .xlsx workbook needs DBMS=XLSX (DBMS=XLS is for .xls files) */
PROC IMPORT DATAFILE= "C:\Users\Mohamed\Desktop\data.xlsx"
OUT= WORK.data
DBMS=XLSX
REPLACE;
SHEET="Sheet1";
GETNAMES=YES;
RUN;
/* Permanent library used by the analyses below */
libname assign 'C:\Users\Mohamed\Desktop\sas';
PROC IMPORT DATAFILE= "C:\Users\Mohamed\Desktop\data.xls"
OUT= assign.data
dbms=xls
REPLACE;
GETNAMES=YES;
Sheet="Sheet1";
RUN;
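ods graphics on;
/* No DATA= option: PROC REG uses the most recently created data set */
proc reg;
model overall = s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12;
run;
quit;
ods graphics off;
ods graphics on;
/* Same model with tolerance, VIF and collinearity diagnostics */
proc reg;
model overall = s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 / tol vif collin;
run;
quit;
ods graphics off;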
ods graphics on;
proc factor data=Assign.Data
priors=smc msa residual
rotate=promax reorder
outstat=fact_all
plots=(scree initloadings preloadings loadings);
var s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 ;
run;
ods graphics off;
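/* Keep the unrotated pattern from the OUTSTAT= data set as the new factor pattern */
data fact2(type=factor);
set fact_all;
if _TYPE_ in('PATTERN' 'FCORR') then delete;
if _TYPE_='UNROTATE' then _TYPE_='PATTERN';
run;
/* Quartimax rotation of the stored pattern (produces the rotated tables shown further below) */
proc factor data=fact2 rotate=quartimax reorder;
run;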
ods graphics on;
proc factor data=Assign.Data
priors=smc msa residual
rotate=promax reorder
outstat=fact_all
score
plots=(scree initloadings preloadings loadings);
var s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 ;
run;
ods graphics off;
proc factor data=Assign.Data score outstat=fact;
var s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 ;
run;
proc score data=Assign.Data score=fact out=scores;
var s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 ;
run;
ods graphics on;
proc reg data=Scores;
model overall = Factor1 Factor2 Factor3;
run;
ods graphics off;
Haver & Boecker is one of the world’s leading providers of filling and screening systems. The company operates a number of facilities in Germany, as well as production plants in the UK, Belgium, USA, Canada, and Brazil. It is a recognized specialist in the fields of weighing, filling, and material handling technology.
Haver & Boecker designs, produces, and markets systems and plants for filling and processing loose bulk materials of every type and thus operates solely in industrial markets. The company’s relationships with its customers are usually long-term and complex.
Since the company’s philosophy is to assist customers and business partners in solving technical problems and innovating new solutions, their products are often customized to the buyers’ needs. Therefore, the customer is no longer a passive buyer, but an active partner. Given this background, the customer’s satisfaction plays an important role in establishing, developing, and maintaining successful customer relationships.
Very early on, the company’s management realized the importance of customer satisfaction and decided to commission a market research project to identify marketing activities that can positively contribute to the business’s overall success. Based on a thorough literature review as well as interviews with experts, the company developed a short survey to explore their customers’ satisfaction with specific performance features and their overall satisfaction. All items were measured on 7-point scales with higher scores denoting higher levels of satisfaction. A standardized survey was mailed to customers in 12 countries worldwide, which yielded 281 fully completed questionnaires. The following items (names in parentheses) were listed in the survey:
- Reliability of the machines and systems (s1)
- Life-time of the machines and systems (s2)
- Functionality and user-friendly operation of the machines and systems (s3)
- Appearance of the machines and systems (s4)
- Accuracy of the machines and systems (s5)
- Timely availability of the after-sales service (s6)
- Local availability of the after-sales service (s7)
- Fast processing of complaints (s8)
- Composition of quotations (s9)
- Transparency of quotations (s10)
- Fixed product price for the machines and systems (s11)
- Cost/performance ratio of the machines and systems (s12)
- Overall, how satisfied are you with the supplier (overall)?
- Using regression analysis, let us locate those variables that best explain the customers' overall satisfaction. The following tables represent the regression outputs from SAS:
ods graphics on; proc reg; model overall = s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12; run; ods graphics off;
The SAS System
Number of Observations Read | 281 |
Number of Observations Used | 281 |
Analysis of Variance
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 12 | 198.91066 | 16.57589 | 15.69 | <.0001 |
Error | 268 | 283.07510 | 1.05625 | | |
Corrected Total | 280 | 481.98577 | | | |
Root MSE | 1.02774 | R-Square | 0.4127 |
Dependent Mean | 5.00712 | Adj R-Sq | 0.3864 |
Coeff Var | 20.52559 | | |
Parameter Estimates
Variable | Label | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| |
Intercept | Intercept | 1 | 2.68730 | 0.22756 | 11.81 | <.0001 |
S1 | s1 | 1 | 0.17898 | 0.05198 | 3.44 | 0.0007 |
S2 | s2 | 1 | 0.03091 | 0.04280 | 0.72 | 0.4709 |
S3 | s3 | 1 | 0.05274 | 0.05177 | 1.02 | 0.3092 |
S4 | s4 | 1 | 0.06009 | 0.05042 | 1.19 | 0.2344 |
S5 | s5 | 1 | 0.02594 | 0.04602 | 0.56 | 0.5735 |
S6 | s6 | 1 | -0.00967 | 0.04832 | -0.20 | 0.8416 |
S7 | s7 | 1 | -0.02486 | 0.04157 | -0.60 | 0.5504 |
S8 | s8 | 1 | 0.06262 | 0.04669 | 1.34 | 0.1810 |
S9 | s9 | 1 | 0.06358 | 0.04729 | 1.34 | 0.1799 |
S10 | s10 | 1 | 0.01909 | 0.04504 | 0.42 | 0.6720 |
S11 | s11 | 1 | -0.11662 | 0.04563 | -2.56 | 0.0112 |
S12 | s12 | 1 | 0.16684 | 0.04634 | 3.60 | 0.0004 |
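The fit statistics in the output above follow directly from the analysis-of-variance table. As a quick check, assuming the usual OLS definitions with n = 281 observations and p = 12 predictors:

```latex
\begin{align*}
R^2 &= \frac{SS_{\text{Model}}}{SS_{\text{Total}}} = \frac{198.91066}{481.98577} \approx 0.4127 \\
R^2_{\text{adj}} &= 1 - (1 - R^2)\,\frac{n-1}{n-p-1} = 1 - 0.5873 \cdot \frac{280}{268} \approx 0.3864 \\
F &= \frac{MS_{\text{Model}}}{MS_{\text{Error}}} = \frac{16.57589}{1.05625} \approx 15.69
\end{align*}
```

The model therefore explains about 41% of the variance in overall satisfaction. At the 5% level only s1 (p = 0.0007), s11 (p = 0.0112) and s12 (p = 0.0004) have significant coefficients in the table above.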
ods graphics on; proc reg; model overall = s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 / tol vif collin; run; ods graphics off;
Parameter Estimates
Variable | Label | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| | Tolerance | Variance Inflation |
Intercept | Intercept | 1 | 2.68730 | 0.22756 | 11.81 | <.0001 | . | 0 |
S1 | s1 | 1 | 0.17898 | 0.05198 | 3.44 | 0.0007 | 0.36902 | 2.70989 |
S2 | s2 | 1 | 0.03091 | 0.04280 | 0.72 | 0.4709 | 0.41655 | 2.40068 |
S3 | s3 | 1 | 0.05274 | 0.05177 | 1.02 | 0.3092 | 0.44269 | 2.25890 |
S4 | s4 | 1 | 0.06009 | 0.05042 | 1.19 | 0.2344 | 0.43494 | 2.29917 |
S5 | s5 | 1 | 0.02594 | 0.04602 | 0.56 | 0.5735 | 0.51240 | 1.95160 |
S6 | s6 | 1 | -0.00967 | 0.04832 | -0.20 | 0.8416 | 0.39505 | 2.53133 |
S7 | s7 | 1 | -0.02486 | 0.04157 | -0.60 | 0.5504 | 0.52572 | 1.90214 |
S8 | s8 | 1 | 0.06262 | 0.04669 | 1.34 | 0.1810 | 0.38187 | 2.61872 |
S9 | s9 | 1 | 0.06358 | 0.04729 | 1.34 | 0.1799 | 0.36659 | 2.72783 |
S10 | s10 | 1 | 0.01909 | 0.04504 | 0.42 | 0.6720 | 0.38487 | 2.59826 |
S11 | s11 | 1 | -0.11662 | 0.04563 | -2.56 | 0.0112 | 0.50294 | 1.98829 |
S12 | s12 | 1 | 0.16684 | 0.04634 | 3.60 | 0.0004 | 0.41761 | 2.39456 |
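The Tolerance and Variance Inflation columns are two views of the same quantity. A short worked check for s9, the predictor with the largest VIF in the table, where R_j^2 denotes the R-square from regressing predictor j on the other eleven predictors:

```latex
\begin{align*}
\text{Tol}_j &= 1 - R_j^2, \qquad \text{VIF}_j = \frac{1}{\text{Tol}_j} \\
\text{VIF}_{s9} &= \frac{1}{0.36659} \approx 2.728, \qquad R_{s9}^2 = 1 - 0.36659 = 0.63341
\end{align*}
```

All VIF values in the table are below 3. Commonly cited rules of thumb flag collinearity at a VIF of about 10 (or, more strictly, 5), so by those conventions the predictors do not show a serious collinearity problem.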
- Let us determine the factors that characterize the respondents by performing a factor analysis:
- Before performing a factor analysis, it is recommended to check its assumptions, for example sampling adequacy (Kaiser's MSA, reported under "Goodness of fit" below).
- The following results were obtained using SAS:
ods graphics on; proc factor data=Assign.Data priors=smc msa residual rotate=promax reorder outstat=fact_all plots=(scree initloadings preloadings loadings); var s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 ; run; ods graphics off;
Eigenvalues of the Reduced Correlation Matrix: Total = 6.84974007 Average = 0.57081167
 | Eigenvalue | Difference | Proportion | Cumulative |
1 | 4.99825280 | 3.87778870 | 0.7297 | 0.7297 |
2 | 1.12046410 | 0.32833702 | 0.1636 | 0.8933 |
3 | 0.79212708 | 0.29569712 | 0.1156 | 1.0089 |
4 | 0.49642996 | 0.21084773 | 0.0725 | 1.0814 |
5 | 0.28558224 | 0.32084982 | 0.0417 | 1.1231 |
6 | -.03526758 | 0.02317729 | -0.0051 | 1.1179 |
7 | -.05844487 | 0.03761719 | -0.0085 | 1.1094 |
8 | -.09606207 | 0.01664101 | -0.0140 | 1.0954 |
9 | -.11270307 | 0.03596681 | -0.0165 | 1.0789 |
10 | -.14866989 | 0.03274139 | -0.0217 | 1.0572 |
11 | -.18141127 | 0.02914608 | -0.0265 | 1.0307 |
12 | -.21055735 | | -0.0307 | 1.0000 |
The three largest eigenvalues of the reduced correlation matrix account for 100.89% of the common variance. The cumulative proportion can exceed 100% because the reduced correlation matrix also has negative eigenvalues, which bring the cumulative total back to 1. The scree and variance-explained plots support the conclusion that three common factors are present.
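The Proportion and Cumulative columns can be reproduced from the eigenvalues themselves. A brief check, where each proportion is the eigenvalue divided by the total of 6.84974007 reported in the table header:

```latex
\begin{align*}
\text{Proportion}_1 &= \frac{4.99825280}{6.84974007} \approx 0.7297 \\
\text{Cumulative}_3 &= \frac{4.99825280 + 1.12046410 + 0.79212708}{6.84974007}
  = \frac{6.91084398}{6.84974007} \approx 1.0089
\end{align*}
```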
- The following tables show the quartimax rotation obtained from the TYPE=FACTOR data set:
Orthogonal Transformation Matrix
 | 1 | 2 | 3 |
1 | 0.95280 | 0.23208 | 0.19575 |
2 | -0.29862 | 0.83273 | 0.46626 |
3 | -0.05480 | -0.50270 | 0.86272 |
Rotated Factor Pattern
 | | Factor1 | Factor2 | Factor3 |
S3 | s3 | 0.76878 | 0.01891 | 0.01975 |
S1 | s1 | 0.76676 | 0.08647 | -0.17922 |
S6 | s6 | 0.74527 | -0.01233 | 0.20478 |
S2 | s2 | 0.73721 | -0.02689 | -0.20578 |
S4 | s4 | 0.72969 | 0.16241 | -0.06462 |
S8 | s8 | 0.69994 | 0.06638 | 0.26004 |
S5 | s5 | 0.64419 | 0.15131 | 0.00818 |
S7 | s7 | 0.59822 | -0.14136 | 0.31048 |
S9 | s9 | 0.31265 | 0.75647 | 0.13789 |
S10 | s10 | 0.35498 | 0.73627 | 0.10118 |
S11 | s11 | 0.27322 | 0.17054 | 0.65092 |
S12 | s12 | 0.51801 | 0.14334 | 0.53542 |
Variance Explained by Each Factor
Factor1 | Factor2 | Factor3 |
4.6398200 | 1.2463468 | 1.0246772 |
Final Communality Estimates: Total = 6.910844
S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9 | S10 | S11 | S12 |
0.62751549 | 0.58655527 | 0.59176624 | 0.56300275 | 0.43794391 | 0.59751792 | 0.47425167 | 0.56194856 | 0.68900934 | 0.67834519 | 0.52742980 | 0.57555786 |
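Each final communality is the sum of the squared loadings of that variable across the three factors, and an orthogonal rotation such as quartimax leaves it unchanged. A worked check for s1 using the rotated loadings above (small differences are due to rounding of the printed loadings):

```latex
\begin{align*}
h_{s1}^2 &= 0.76676^2 + 0.08647^2 + (-0.17922)^2 \\
         &\approx 0.58792 + 0.00748 + 0.03212 \approx 0.6275 \\
\sum_j h_j^2 &= 4.6398200 + 1.2463468 + 1.0246772 = 6.910844
\end{align*}
```

The second line shows that the variance explained by the three factors adds up to the reported total communality.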
- Factor1: characteristics of the machines and systems and of the after-sales service (variables s1 to s8)
- Factor2: composition and transparency of quotations (variables s9 and s10)
- Factor3: price and cost/performance ratio of the machines and systems (variables s11 and s12; note that s12 also loads 0.518 on Factor1)
- Goodness of fit
Partial Correlations Controlling all other Variables
 | | S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9 | S10 | S11 | S12 |
S1 | s1 | 1.00000 | 0.53364 | 0.08440 | 0.21083 | -0.06904 | 0.02151 | -0.05023 | 0.12656 | 0.04780 | 0.03591 | -0.17162 | 0.19072 |
S2 | s2 | 0.53364 | 1.00000 | 0.21205 | 0.02173 | 0.03700 | 0.08820 | 0.01627 | 0.00369 | -0.12243 | 0.06437 | -0.00185 | -0.05719 |
S3 | s3 | 0.08440 | 0.21205 | 1.00000 | 0.16996 | 0.21912 | 0.12463 | 0.04091 | 0.06178 | 0.01031 | -0.04926 | 0.01954 | 0.09331 |
S4 | s4 | 0.21083 | 0.02173 | 0.16996 | 1.00000 | 0.39920 | 0.02519 | 0.00325 | 0.02253 | -0.03544 | 0.11584 | -0.05966 | 0.09087 |
S5 | s5 | -0.06904 | 0.03700 | 0.21912 | 0.39920 | 1.00000 | 0.09408 | 0.08136 | -0.07459 | 0.13282 | -0.03575 | -0.01364 | 0.03805 |
S6 | s6 | 0.02151 | 0.08820 | 0.12463 | 0.02519 | 0.09408 | 1.00000 | 0.25105 | 0.40322 | -0.04941 | 0.07276 | 0.00267 | 0.01977 |
S7 | s7 | -0.05023 | 0.01627 | 0.04091 | 0.00325 | 0.08136 | 0.25105 | 1.00000 | 0.32910 | -0.14730 | 0.01204 | 0.07300 | 0.06058 |
S8 | s8 | 0.12656 | 0.00369 | 0.06178 | 0.02253 | -0.07459 | 0.40322 | 0.32910 | 1.00000 | 0.28431 | -0.12448 | 0.03698 | 0.01487 |
S9 | s9 | 0.04780 | -0.12243 | 0.01031 | -0.03544 | 0.13282 | -0.04941 | -0.14730 | 0.28431 | 1.00000 | 0.70788 | 0.13231 | -0.09990 |
S10 | s10 | 0.03591 | 0.06437 | -0.04926 | 0.11584 | -0.03575 | 0.07276 | 0.01204 | -0.12448 | 0.70788 | 1.00000 | -0.01613 | 0.12426 |
S11 | s11 | -0.17162 | -0.00185 | 0.01954 | -0.05966 | -0.01364 | 0.00267 | 0.07300 | 0.03698 | 0.13231 | -0.01613 | 1.00000 | 0.62577 |
S12 | s12 | 0.19072 | -0.05719 | 0.09331 | 0.09087 | 0.03805 | 0.01977 | 0.06058 | 0.01487 | -0.09990 | 0.12426 | 0.62577 | 1.00000 |
Kaiser's Measure of Sampling Adequacy: Overall MSA = 0.83846511
S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9 | S10 | S11 | S12 |
0.84308697 | 0.84677623 | 0.93908778 | 0.90005427 | 0.88364757 | 0.90375303 | 0.88854400 | 0.85856369 | 0.66694994 | 0.72117052 | 0.69422859 | 0.79622332 |
- Let us use the factor scores and regress the customers' overall satisfaction (overall) on them. First we compute the scores for the three factors, then we run the regression analysis.
proc score data=Assign.Data score=fact out=scores; var s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 ; run; ods graphics on; proc reg data=Scores; model overall = Factor1 Factor2 Factor3; run; ods graphics off;
Number of Observations Read | 281 |
Number of Observations Used | 281 |
Analysis of Variance
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 3 | 175.32101 | 58.44034 | 52.79 | <.0001 |
Error | 277 | 306.66476 | 1.10709 | | |
Corrected Total | 280 | 481.98577 | | | |
Root MSE | 1.05218 | R-Square | 0.3637 |
Dependent Mean | 5.00712 | Adj R-Sq | 0.3569 |
Coeff Var | 21.01378 | | |
Parameter Estimates
Variable | Label | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| |
Intercept | Intercept | 1 | 5.00712 | 0.06277 | 79.77 | <.0001 |
Factor1 | | 1 | 0.76425 | 0.06288 | 12.15 | <.0001 |
Factor2 | | 1 | -0.05840 | 0.06288 | -0.93 | 0.3539 |
Factor3 | | 1 | -0.19663 | 0.06288 | -3.13 | 0.0020 |
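Written out, the estimates above give the following fitted equation, and the fit statistics can again be checked against the ANOVA table (n = 281, p = 3). This is only a restatement of the reported output, not an additional analysis:

```latex
\begin{align*}
\widehat{\text{overall}} &= 5.00712 + 0.76425\,\text{Factor1} - 0.05840\,\text{Factor2} - 0.19663\,\text{Factor3} \\
R^2 &= \frac{175.32101}{481.98577} \approx 0.3637, \qquad
R^2_{\text{adj}} = 1 - 0.6363 \cdot \frac{280}{277} \approx 0.3569 \\
F &= \frac{58.44034}{1.10709} \approx 52.79
\end{align*}
```

Factor1 and Factor3 are significant at the 5% level, while Factor2 is not (p = 0.3539). The three factor scores explain about 36% of the variance in overall satisfaction, compared with about 41% for the regression on all twelve items. The intercept equals the dependent mean of 5.00712, which is consistent with factor scores that have mean zero.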