variable
(noun)
a quantity that may assume any one of a set of values
Examples of variable in the following topics:
Explanatory and response variables
- If we suspect poverty might affect spending in a county, then poverty is the explanatory variable and federal spending is the response variable in the relationship.
- Sometimes the explanatory variable is called the independent variable and the response variable is called the dependent variable.
- If there are many variables, it may be possible to consider a number of them as explanatory variables.
- The explanatory variable might affect the response variable, as sketched in the example after this list.
- In some cases, there is no explanatory or response variable.
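As a minimal illustration of these roles, the sketch below fits a line with poverty rate as the explanatory variable and federal spending as the response variable. The numbers are made up for illustration; only the assignment of $x$ and $y$ reflects the discussion above.

```python
import numpy as np

# Made-up county-level figures, for illustration only
poverty_rate = np.array([8.5, 10.2, 12.1, 15.3, 18.0, 21.4])   # explanatory variable (x)
federal_spending = np.array([6.1, 6.8, 7.5, 8.9, 9.7, 11.2])   # response variable (y)

# Fit a line predicting the response from the explanatory variable
slope, intercept = np.polyfit(poverty_rate, federal_spending, 1)
print(f"fitted line: spending = {slope:.2f} * poverty + {intercept:.2f}")
```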
Variables
- In this case, the variable is "type of antidepressant." When a variable is manipulated by an experimenter, it is called an independent variable.
- An important distinction between variables is between qualitative variables and quantitative variables.
- Qualitative variables are sometimes referred to as categorical variables.
- Quantitative variables are those variables that are measured in terms of numbers.
- The variable "type of supplement" is a qualitative variable; there is nothing quantitative about it (see the sketch after this list).
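A brief sketch of the qualitative/quantitative distinction, using a hypothetical data frame; the column names and values are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: "supplement" is qualitative, "dose_mg" is quantitative
df = pd.DataFrame({
    "supplement": ["fish oil", "placebo", "fish oil", "placebo"],
    "dose_mg": [500, 0, 1000, 0],
})

# A numerical summary such as the mean makes sense for the quantitative variable
print(df["dose_mg"].mean())

# For the qualitative variable we can only tabulate the categories
print(df["supplement"].value_counts())
```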
Types of Variables
- Numeric variables have values that describe a measurable quantity as a number, like "how many" or "how much." Therefore, numeric variables are quantitative variables.
- A continuous variable is a numeric variable whose observations can take any value within a range.
- A discrete variable is a numeric variable whose observations can take only distinct, countable values.
- An ordinal variable is a categorical variable whose categories have a natural ordering.
- A nominal variable is a categorical variable whose categories have no inherent ordering (the four types are illustrated in the sketch after this list).
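The following sketch shows how the four types might be represented in pandas; the variable names and values are hypothetical and chosen only to illustrate each type.

```python
import pandas as pd

df = pd.DataFrame({
    "temperature_c": [21.4, 19.8, 23.1],         # numeric, continuous
    "num_visits":    [0, 3, 1],                   # numeric, discrete
    "satisfaction":  ["low", "high", "medium"],   # categorical, ordinal
    "blood_type":    ["A", "O", "B"],             # categorical, nominal
})

# An ordinal variable carries a natural ordering of its levels
df["satisfaction"] = pd.Categorical(
    df["satisfaction"], categories=["low", "medium", "high"], ordered=True
)
# A nominal variable has categories with no inherent order
df["blood_type"] = pd.Categorical(df["blood_type"])

print(df.dtypes)
print(df["satisfaction"].min())   # the ordering makes "min" meaningful here
```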
Qualitative Variable Models
- Dummy (qualitative) variables often act as independent variables in regression and affect the value of the dependent variable (see the sketch after this list).
- Dummy variables are "proxy" variables, or numeric stand-ins for qualitative facts in a regression model.
- In regression analysis, the dependent variables may be influenced not only by quantitative variables (income, output, prices, etc.), but also by qualitative variables (gender, religion, geographic region, etc.).
- One type of ANOVA model, applicable when dealing with qualitative variables, is a regression model in which the dependent variable is quantitative in nature but all the explanatory variables are dummies (qualitative in nature).
- Break down the method of inserting a dummy variable into a regression analysis in order to compensate for the effects of a qualitative variable.
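A minimal sketch of inserting a dummy variable into a regression, assuming a hypothetical data set with one quantitative predictor (income) and one qualitative predictor (region); the variable names and values are illustrative assumptions, not data from the text.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one quantitative and one qualitative predictor
df = pd.DataFrame({
    "spending": [3.1, 4.0, 2.7, 5.2, 4.8, 3.9],
    "income":   [40, 55, 38, 70, 62, 50],
    "region":   ["north", "south", "north", "south", "south", "north"],
})

# Encode the qualitative variable as a 0/1 dummy ("proxy") variable
dummies = pd.get_dummies(df["region"], drop_first=True)   # keeps column "south"

X = np.column_stack([
    np.ones(len(df)),                         # intercept
    df["income"].to_numpy(dtype=float),       # quantitative predictor
    dummies["south"].to_numpy(dtype=float),   # dummy for the qualitative predictor
])
y = df["spending"].to_numpy()

# Ordinary least squares fit
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["intercept", "income", "region_south"], coef.round(3))))
```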
Slope and Intercept
- The general purpose is to explain how one variable, the dependent variable, is systematically related to the values of one or more independent variables.
- The coefficients are numeric constants by which variable values in the equation are multiplied or which are added to a variable value to determine the unknown.
- Here, by convention, $x$ and $y$ are the variables of interest in our data, with $y$ the unknown or dependent variable and $x$ the known or independent variable.
- Linear regression is an approach to modeling the relationship between a scalar dependent variable $y$ and one or more explanatory (independent) variables denoted $X$.
- In the equation $y = mx + b$, $y$ is the dependent variable, $x$ is the independent variable, $m$ is the slope, and $b$ is the intercept (computed directly in the sketch after this list).
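A short sketch computing the least-squares slope and intercept from the standard formulas $m = \text{cov}(x, y)/\text{var}(x)$ and $b = \bar{y} - m\bar{x}$; the data points are made up for illustration.

```python
import numpy as np

# Made-up data for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # independent variable
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # dependent variable

# Slope and intercept from the textbook formulas
m = np.cov(x, y, bias=True)[0, 1] / np.var(x)   # slope
b = y.mean() - m * x.mean()                     # intercept

print(f"y = {m:.2f} x + {b:.2f}")

# Cross-check against numpy's built-in line fit: returns [slope, intercept]
print(np.polyfit(x, y, 1))
```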
Types of variables
- This variable seems to be a hybrid: it is a categorical variable but the levels have a natural ordering.
- A variable with these properties is called an ordinal variable.
- To simplify analyses, any ordinal variables in this book will be treated as categorical variables.
- Are these numerical or categorical variables?
- Thus, each is a categorical variable (see the sketch after this list).
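A small sketch of treating an ordinal variable as an ordinary categorical variable, as this book does; the variable and its levels are hypothetical.

```python
import pandas as pd

# Hypothetical ordinal variable: the levels have a natural ordering
ease = pd.Series(
    pd.Categorical(
        ["easy", "medium", "hard", "easy", "hard", "easy"],
        categories=["easy", "medium", "hard"],
        ordered=True,
    ),
    name="ease_of_access",
)

# Treat it as a plain categorical variable: drop the ordering and
# simply tabulate the counts of each level
plain = ease.cat.as_unordered()
print(plain.dtype)
print(plain.value_counts())
```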
Controlling for a Variable
- Controlling for a variable is a method to reduce the effect of extraneous variations that may also affect the value of the dependent variable.
- For instance, temperature is a continuous variable, while the number of legs of an animal is a discrete variable.
- There are also quasi-independent variables, which are used by researchers to group things without affecting the variable itself.
- In a scientific experiment measuring the effect of one or more independent variables on a dependent variable, controlling for a variable is a method of reducing the confounding effect of variations in a third variable that may also affect the value of the dependent variable.
- The failure to do so results in omitted-variable bias; the sketch after this list illustrates the idea.
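A minimal simulation of the idea, using made-up data in which a third variable z drives both the independent variable x and the dependent variable y. Regressing y on x alone overstates the effect of x; adding z as a control (an extra regression column) removes most of that bias.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# z is a confounding third variable that affects both x and y
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(scale=0.5, size=n)
y = 1.0 * x + 2.0 * z + rng.normal(scale=0.5, size=n)   # true effect of x is 1.0

ones = np.ones(n)

# Naive regression of y on x only (omitted-variable bias)
naive, *_ = np.linalg.lstsq(np.column_stack([ones, x]), y, rcond=None)

# Regression controlling for z
controlled, *_ = np.linalg.lstsq(np.column_stack([ones, x, z]), y, rcond=None)

print("coefficient on x, ignoring z:    ", round(naive[1], 2))       # biased upward
print("coefficient on x, controlling z: ", round(controlled[1], 2))  # near 1.0
```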
An alternative test statistic
- Recall that $R^2$ described the proportion of variability in the response variable ($y$) explained by the explanatory variable ($x$).
- If this proportion is large, then this suggests a linear relationship exists between the variables.
- This concept – considering the amount of variability in the response variable explained by the explanatory variable – is a key component in some statistical techniques; the sketch after this list computes $R^2$ directly for a small example.
- The method states that if enough variability is explained away by the categories, then we would conclude the mean varied between the categories.
- On the other hand, we might not be convinced if only a little variability is explained.
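A short sketch computing $R^2 = 1 - SS_{res}/SS_{tot}$ as the proportion of variability explained, where the sums of squares are taken around the fitted line and around the mean of $y$, respectively; the data are made up for illustration.

```python
import numpy as np

# Made-up data for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.3, 2.9, 4.1, 4.9, 6.2, 6.8])

# Fit the least-squares line and compute fitted values
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

ss_res = np.sum((y - y_hat) ** 2)       # variability left over after the fit
ss_tot = np.sum((y - y.mean()) ** 2)    # total variability in the response

r_squared = 1 - ss_res / ss_tot
print(f"R^2 = {r_squared:.3f}")   # close to 1: x explains most of the variability
```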
Email data
- The email data set was first presented in Chapter 1 with a relatively small number of variables. In fact, there are many more variables available that might be useful for classifying spam. Descriptions of these variables are presented in Table 8.13. The spam variable will be the outcome, and the other 10 variables will be the model predictors. While we have limited the predictors used in this section to be categorical variables (where many are represented as indicator variables), numerical predictors may also be used in logistic regression (a hedged sketch of such a model follows this list).
- Recall from Chapter 7 that if outliers are present in predictor variables, the corresponding observations may be especially influential on the resulting model.
- This is the motivation for omitting the numerical variables, such as the number of characters and line breaks in emails, that we saw in Chapter 1.
- These variables exhibited extreme skew.
- We could resolve this issue by transforming these variables (e.g. using a log-transformation), but we will omit this further investigation for brevity.
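A hedged sketch of a logistic regression along these lines. The actual email data set and its Table 8.13 variables are not reproduced here, so the sketch uses a hypothetical data frame `emails` with a 0/1 outcome column `spam` and a few assumed indicator predictors (`to_multiple`, `winner`, `format`); only the overall modeling pattern reflects the text.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-in for the email data: column names and values are
# assumptions for illustration, not the actual Table 8.13 variables
emails = pd.DataFrame({
    "spam":        [0, 1, 0, 0, 1, 1, 0, 1],
    "to_multiple": [1, 0, 1, 1, 0, 0, 1, 0],   # indicator (0/1) predictors
    "winner":      [0, 1, 0, 0, 1, 0, 0, 1],
    "format":      [1, 0, 1, 1, 0, 1, 1, 0],
})

X = emails[["to_multiple", "winner", "format"]]
y = emails["spam"]

# Logistic regression with categorical (indicator) predictors
model = LogisticRegression().fit(X, y)
print(dict(zip(X.columns, model.coef_[0].round(2))))

# Estimated spam probability for a new email with the assumed indicators
new_email = pd.DataFrame([[0, 1, 0]], columns=X.columns)
print("estimated P(spam):", model.predict_proba(new_email)[0, 1].round(2))
```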
Interaction Models
- In statistics, an interaction may arise when considering the relationship among three or more variables, and describes a situation in which the simultaneous influence of two variables on a third is not additive.
- If two variables of interest interact, the relationship between each of the interacting variables and a third "dependent variable" depends on the value of the other interacting variable.
- In practice, this makes it more difficult to predict the consequences of changing the value of a variable, particularly if the variables it interacts with are hard to measure or difficult to control.
- The notion of "interaction" is closely related to that of "moderation" that is common in social and health science research: the interaction between an explanatory variable and an environmental variable suggests that the effect of the explanatory variable has been moderated or modified by the environmental variable.
- An interaction variable is a variable constructed from an original set of variables in order to represent either all of the interaction present or some part of it; the sketch below constructs one explicitly.
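A minimal sketch of an interaction variable in a regression, using simulated data in which the effect of $x_1$ on $y$ depends on the level of $x_2$; the interaction variable is simply the product $x_1 \cdot x_2$ added as an extra column in the design matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400

x1 = rng.normal(size=n)
x2 = rng.integers(0, 2, size=n).astype(float)   # e.g. a 0/1 moderating variable

# Simulate a non-additive relationship: the slope of x1 changes with x2
y = 1.0 + 0.5 * x1 + 1.0 * x2 + 2.0 * (x1 * x2) + rng.normal(scale=0.3, size=n)

# Build the design matrix with an explicit interaction variable x1 * x2
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

names = ["intercept", "x1", "x2", "x1:x2"]
print(dict(zip(names, coef.round(2))))   # the x1:x2 term recovers roughly 2.0
```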