Examlex
A research team wanted to assess the relationship between age, systolic blood pressure, smoking, and risk of stroke. A sample of 150 patients who had a stroke was selected, and the data collected are given below. For the variable Smoker, 1 represents a smoker and 0 represents a nonsmoker.
Partition the data into training (50 percent), validation (30 percent), and test (20 percent) sets. Predict the Risk of stroke using a regression tree. Use Risk as the output variable and all the other variables as input variables. In Step 2 of XLMiner's Regression Tree procedure, be sure to Normalize input data, to set the Maximum #splits for input variables to 74, to set the Minimum #records in a terminal node to 1, and specify Using Best prune tree as the scoring option. In Step 3 of XLMiner's Regression Tree procedure, set the maximum number of levels to 7. Generate the Full tree, Best pruned tree, and Minimum error tree. Generate a Detailed Scoring report for all three sets of data.
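The 50/30/20 partition described above can be sketched outside XLMiner as follows. This is only an illustration in Python: the function name `partition_indices` and the seed value are assumptions, and XLMiner's own random partitioning will not assign the same records to each set.

```python
import random

def partition_indices(n_records, fractions=(0.5, 0.3, 0.2), seed=12345):
    """Shuffle record indices and split them into training, validation, and test sets.

    The seed is an arbitrary, hypothetical choice used only to make the split repeatable.
    """
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)

    n_train = round(fractions[0] * n_records)
    n_valid = round(fractions[1] * n_records)

    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # whatever remains goes to the test set
    return train, valid, test

# 150 patients, as in the exercise
train_idx, valid_idx, test_idx = partition_indices(150)
print(len(train_idx), len(valid_idx), len(test_idx))  # 75 45 30
```

With 150 records, the split gives 75 training, 45 validation, and 30 test records.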
a. In terms of number of decision nodes, compare the size of the full tree to the size of the best pruned tree.
b. What is the root mean squared error (RMSE) of the best pruned tree on the validation data and on the test data?
c. What is the average error on the validation data and test data? What does this suggest?
d. By examining the best pruned tree, what are the critical variables in predicting the risk?
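For parts b and c, the two error measures can be computed from actual and predicted values as sketched below. The numbers are made up for illustration and are not the exercise data, and the sketch assumes the residual is defined as actual minus predicted, so a positive average error would indicate that the tree tends to under-predict.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def average_error(actual, predicted):
    """Mean residual (actual minus predicted); values near 0 suggest little systematic bias."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return sum(residuals) / len(residuals)

# Hypothetical values, NOT taken from the stroke data set
actual_risk = [20, 30, 40, 50]
predicted_risk = [18, 33, 41, 44]

print(round(rmse(actual_risk, predicted_risk), 3))  # 3.536
print(average_error(actual_risk, predicted_risk))   # 1.0
```

Comparing these values between the validation set and the test set is one way to judge whether the pruned tree generalizes: similar RMSE on both sets suggests it does, while a much larger test RMSE suggests overfitting.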