Examlex
A research team wanted to assess the relationship between age, systolic blood pressure, smoking, and risk of stroke. A sample of 150 patients who had a stroke was selected, and the data collected are given below. For the variable Smoker, 1 represents smokers and 0 represents nonsmokers.
Partition the data into training (50 percent), validation (30 percent), and test (20 percent) sets. Predict the Risk of stroke using a regression tree. Use Risk as the output variable and all the other variables as input variables. In Step 2 of XLMiner's Regression Tree procedure, be sure to Normalize input data, to set the Maximum #splits for input variables to 74, to set the Minimum #records in a terminal node to 1, and specify Using Best prune tree as the scoring option. In Step 3 of XLMiner's Regression Tree procedure, set the maximum number of levels to 7. Generate the Full tree, Best pruned tree, and Minimum error tree. Generate a Detailed Scoring report for all three sets of data.
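For readers working outside XLMiner, the 50/30/20 partition described above can be sketched in plain Python. This is a minimal illustration, not XLMiner's own partitioning routine: the function name `partition`, the seed value, and the use of row indices 0 to 149 in place of the patient records are all assumptions made for the example.

```python
import random


def partition(records, fractions=(0.5, 0.3, 0.2), seed=12345):
    """Shuffle records and split them into training, validation, and test lists.

    The seed is an arbitrary, hypothetical choice; XLMiner's random
    partition will generally assign rows differently.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * fractions[0])
    n_valid = round(n * fractions[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever remains goes to the test set
    return train, valid, test


# Row indices stand in for the 150 patient records.
train, valid, test = partition(range(150))
print(len(train), len(valid), len(test))  # 75 45 30
```

With 150 records, this yields 75 training, 45 validation, and 30 test records.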
a. In terms of number of decision nodes, compare the size of the full tree to the size of the best pruned tree.
b. What is the root mean squared error (RMSE) of the best pruned tree on the validation data and on the test data?
c. What is the average error on the validation data and test data? What does this suggest?
d. By examining the best pruned tree, what are the critical variables in predicting the risk of stroke?
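Parts b and c rely on two summary statistics from the scoring report. The sketch below shows how each could be computed by hand, assuming the residual is defined as actual minus predicted. The function names and the four actual/predicted pairs are invented purely for illustration; they are not values from the stroke data set.

```python
import math


def average_error(actual, predicted):
    """Mean of the residuals (actual - predicted); its sign indicates bias."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return sum(residuals) / len(residuals)


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical values, not taken from the exercise's data.
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 41]

print(average_error(actual, predicted))  # -1.0
print(rmse(actual, predicted))
```

Under this sign convention, a negative average error means the tree over-predicts on average, and a positive one means it under-predicts. An average error near zero combined with a sizable RMSE suggests the predictions are roughly unbiased but individually imprecise.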
Pluralism
A societal condition or system in which multiple distinct ethnic, religious, or cultural groups coexist within the same society, maintaining their individual traditions.
Racial Diversity
The inclusion and representation of individuals from a variety of racial and ethnic backgrounds in a group or organization.
Ethnic Diversity
Ethnic diversity refers to the presence of multiple ethnic groups within a given society, each with its distinct culture, traditions, and sometimes language.
Genocide
The deliberate and systematic extermination of a racial, ethnic, national, or cultural group.