Variables

Published

September 21, 2024

1 Variables

A variable can be defined as a characteristic or attribute that can be measured or observed.
The values of a variable change from one observation to another (in contrast to constants, which do not change).

2 Random variables

When the values of the variable are the result of chance factors (i.e., the values are subject to randomness or uncertainty), the variable is referred to as a random variable.
For example, when a coin is flipped, the outcome (head or tail) is random and cannot be predicted with certainty until the experiment is performed (i.e., the coin is flipped).
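
The coin example can be sketched in a few lines of Python. This is only an illustrative simulation: the function name `flip_coin`, the seed, and the number of flips are assumptions made for the sketch, not part of the original text.

```python
import random


def flip_coin(rng):
    """Return 'head' or 'tail', each with probability 0.5."""
    return rng.choice(["head", "tail"])


# The seed only makes the sketch reproducible; its value is arbitrary.
rng = random.Random(42)

# A single outcome is unknown until the flip is actually performed.
first_flip = flip_coin(rng)

# Over many flips, the share of heads settles near 0.5.
flips = [flip_coin(rng) for _ in range(1000)]
head_share = flips.count("head") / len(flips)
print(first_flip, head_share)
```

Each individual flip is unpredictable, yet the long-run proportion is stable, which is what makes random variables amenable to statistical analysis.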

2.1 Types of variables

2.1.1 Independent and Dependent variables

Independent variable:
- It is a variable that is manipulated or controlled by the researcher during an experiment.
- It has other names such as predictor, explanatory, or treatment variable.
- It may have different levels or categories.
Dependent variable:
- It is a variable that is measured or observed during an experiment in response to changes in the independent variable(s).
- It has other names such as outcome or response variable.
Example:
- In a study evaluating the effect of three medications on lowering blood pressure in hypertensive patients:
  - The medication type is the independent variable that has three levels (medication A, medication B, and medication C).
  - The blood pressure is the dependent variable.

2.1.2 Qualitative and Quantitative variables

Qualitative (categorical) variables:
- They cannot be measured in the usual sense but can only be classified into categories (i.e., they express a quality or attribute).
- Examples: hair color, socioeconomic status, etc.
Quantitative (numerical) variables:
- They represent measurable quantities that can be expressed in terms of numbers.
- Examples: age, weight, blood pressure, etc.

Note

Qualitative variables can be assigned numbers for coding purposes; however, these numbers do not have any numerical significance.
For example:
- The categories of the variable gender {“male” and “female”} can be coded as {0 and 1}, {1 and 2}, or {2 and 1}, etc.
- The selection of codes is arbitrary and does not imply any numerical order.
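
A minimal sketch of why the codes carry no numerical meaning, using made-up observations: the category counts are identical under two different codings, while a "mean of the codes" changes with the coding and is therefore not interpretable. The variable names and the sample data are assumptions for illustration only.

```python
from collections import Counter
from statistics import mean

# Made-up observations, for illustration only.
genders = ["male", "female", "female", "male", "female"]

# Two equally valid, arbitrary codings of the same categories.
coding_a = {"male": 0, "female": 1}
coding_b = {"male": 2, "female": 1}

codes_a = [coding_a[g] for g in genders]
codes_b = [coding_b[g] for g in genders]

# Decoding back to labels gives the same category counts either way.
decode_a = {code: label for label, code in coding_a.items()}
decode_b = {code: label for label, code in coding_b.items()}
counts_a = Counter(decode_a[c] for c in codes_a)
counts_b = Counter(decode_b[c] for c in codes_b)

# The "mean of the codes" depends on the arbitrary coding (0.6 vs 1.4),
# so it says nothing about the variable itself.
mean_a = mean(codes_a)
mean_b = mean(codes_b)
print(counts_a == counts_b, mean_a, mean_b)
```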

2.1.3 Discrete and Continuous variables

Discrete variables:
- They can take only a countable set of distinct values, typically whole numbers (i.e., integers).
- They can be described as having no intermediate values between two adjacent values (i.e., they have gaps or interruptions in the values that they can assume).
- Discrete variables can be either qualitative or quantitative (all qualitative variables are discrete by definition).
- Examples:
  - Number of migraine attacks, number of children in a family, number of hospital admissions, etc.
  - The above variables can take whole numbers such as \(0, 1, 2, 3, \cdots\), but they cannot take fractional values such as \(1.34, 2.76\), etc.
Continuous variables:
- Unlike discrete variables, continuous variables can take any value within a given range.
- They do not have gaps or interruptions in the values that they can assume (i.e., between any two values, there are infinitely many other values).
- Examples:
  - Age, weight, height, blood pressure, etc.
  - The above variables can take any value within a given range (e.g., weight can be \(25.3, 30.6, 45.8\) kg, etc.).

Note

Continuous variables are often recorded rounded to a certain number of decimal places, as if they were discrete variables.
This is due to the limited precision of the measuring instruments used to collect the data (the scale in the case of weight, the thermometer in the case of temperature, etc.).
Therefore, the weight of an individual can take infinitely many values between, say, \(70\) and \(71\) kg, but in practice it is recorded rounded to a fixed number of decimal places (e.g., \(70.3, 70.6, 70.8\) kg when rounded to one decimal place) because measurement precision is limited.
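
The following sketch illustrates both points under stated assumptions: the "true" weights are hypothetical values chosen for the example, and one decimal place stands in for the precision of the scale. Between any two values another value can always be found, yet distinct true values collapse to the same recorded value once rounded.

```python
def midpoint(a, b):
    """Return a value lying strictly between a and b."""
    return (a + b) / 2


# Between 70 and 71 kg there is always another value, and another...
m1 = midpoint(70.0, 71.0)  # 70.5
m2 = midpoint(70.0, m1)    # 70.25

# Hypothetical "true" weights in kg (illustrative, not real data).
true_weights_kg = [70.3124, 70.3399, 70.2871]

# A scale that reports one decimal place records them all as 70.3.
recorded_kg = [round(w, 1) for w in true_weights_kg]
print(m1, m2, recorded_kg)
```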

3 Scales of measurement

Measurement can be defined as the assignment of numbers to objects or events according to certain rules.
Different scales of measurement arise from the fact that numbers can be assigned to objects or events under different sets of rules.
Each scale of measurement has its own properties that affect the type of statistical analysis that can be performed on the data.
These properties include the order of the categories, the distance between the categories, and the presence of a true zero point.
According to Stevens (1946), there are four scales of measurement:
1. Nominal scale:
  - It involves naming or classifying observations into categories that are mutually exclusive (i.e., categories that do not overlap) and collectively exhaustive (i.e., every observation must fall into one of the categories).
  - The categories have no inherent order or ranking.
  - Examples of nominal scale variables:
    1. Blood type (A, B, AB, and O).
    2. Hair color (black, brown, blonde, and red).
    3. Gender (male and female).
    Note
    A nominal variable that possesses only two categories is referred to as a dichotomous or binary variable (e.g., gender, yes/no response).
2. Ordinal scale:
  - Like the nominal scale, the ordinal scale involves naming or classifying observations into categories that are mutually exclusive and collectively exhaustive.
  - However, the categories have a natural order or ranking.
  - The distance or difference between the categories is not necessarily equal (i.e., the difference between the categories is not meaningful in a numerical sense).
  - Examples of ordinal scale variables:
    1. Educational level (elementary, high school, college, and postgraduate).
    2. Socioeconomic status (low, middle, and high).
    3. Pain severity (mild, moderate, and severe).
    4. Cancer stage (stage I, stage II, stage III, and stage IV).
    5. Likert scale (strongly disagree, disagree, neutral, agree, and strongly agree).
    Note
    Consider, for example, the variable pain severity: the difference between mild and moderate is not necessarily equal to the difference between moderate and severe.
    
    Even if we code the categories as 1, 2, and 3, the codes convey only order. Arithmetic on them (such as differences or averages) is not meaningful, and we still cannot say that the difference between 1 and 2 is equal to the difference between 2 and 3.
3. Interval scale:
  - It is a quantitative (numerical) scale.
  - The measurements are ordered (ranked).
  - The difference between measurements is meaningful (i.e., interpretable).
  - It does not have a true zero point (i.e., the value of zero does not indicate the absence of the quantity being measured). This zero point is arbitrary.
  - The ratio of two measurements is not meaningful because of the absence of a true zero point.
  - Examples of interval scale variables:
    1. Temperature measured in Celsius or Fahrenheit.
    2. Intelligence Quotient (IQ) scores.
    Note
    Consider temperature measured in Celsius: the difference between 10°C and 15°C is equal to the difference between 20°C and 25°C.
    
    However, the ratio of 20°C to 10°C is not meaningful (i.e., we cannot say that 20°C is twice as hot as 10°C).
    
    The value of zero Celsius does not imply the absence of temperature (i.e., the absence of molecular kinetic energy).
4. Ratio scale:
  - It has the properties of the interval scale with the addition of:
    1. A true zero point.
    2. The ratio of two measurements is meaningful.
  - Examples of ratio scale variables:
    1. Age.
    2. Weight.
    3. Temperature measured in Kelvin (this scale has a true zero point).
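
The contrast between the interval and ratio scales can be checked numerically. The sketch below uses the standard temperature conversions; the function names and the kilogram-to-pound factor (rounded to 2.20462) are choices made for this illustration. Differences survive a change of temperature unit (up to a constant factor), whereas the ratio of two Celsius readings changes with the unit, because the Celsius zero is arbitrary. For a ratio-scale variable such as weight, the ratio is the same in any unit.

```python
import math


def celsius_to_kelvin(c):
    return c + 273.15


def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32


low_c, high_c = 10.0, 20.0

# Interval property: the 10-degree difference is the same in Kelvin.
diff_c = high_c - low_c
diff_k = celsius_to_kelvin(high_c) - celsius_to_kelvin(low_c)

# The "ratio" of the two temperatures depends on the unit chosen.
ratio_c = high_c / low_c                                                # 2.0
ratio_f = celsius_to_fahrenheit(high_c) / celsius_to_fahrenheit(low_c)  # 68 / 50
ratio_k = celsius_to_kelvin(high_c) / celsius_to_kelvin(low_c)          # 293.15 / 283.15

# Ratio scale: weight has a true zero, so ratios are unit-independent.
KG_TO_LB = 2.20462  # approximate conversion factor
ratio_kg = 80.0 / 40.0
ratio_lb = (80.0 * KG_TO_LB) / (40.0 * KG_TO_LB)

print(diff_c, diff_k, ratio_c, ratio_f, ratio_k, ratio_kg, ratio_lb)
```

Only the Kelvin ratio (about 1.035) reflects a physical statement, since Kelvin has a true zero point; 20°C is therefore about 3.5% hotter than 10°C in absolute terms, not twice as hot.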

Note

Each scale has all the properties of the scales that precede it.
Any statistical analysis that can be performed on a lower-scale variable can also be performed on a higher-scale variable, but the reverse is not true.
The scales can be ordered according to the level of information they provide in a descending order as follows: ratio > interval > ordinal > nominal.
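
The cumulative nature of the four scales can be written down as a small lookup. This is only a sketch: the names `SCALES`, `NEW_PROPERTY`, `properties_of`, and `analysis_applies` are hypothetical helpers introduced here, and the property labels paraphrase the ones listed in this section.

```python
# Scales in ascending order of the information they provide.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# The property each scale adds on top of the scales before it.
NEW_PROPERTY = {
    "nominal": "classification into categories",
    "ordinal": "order of categories",
    "interval": "meaningful differences",
    "ratio": "true zero point",
}


def properties_of(scale):
    """Return all properties of a scale, including inherited ones."""
    position = SCALES.index(scale)
    return {NEW_PROPERTY[s] for s in SCALES[: position + 1]}


def analysis_applies(designed_for, variable_scale):
    """An analysis designed for one scale also applies to any higher scale."""
    return SCALES.index(variable_scale) >= SCALES.index(designed_for)


print(sorted(properties_of("ordinal")))
print(analysis_applies("ordinal", "ratio"))     # lower-scale analysis, higher-scale variable
print(analysis_applies("interval", "ordinal"))  # the reverse direction
```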

Exercise B.1.1

Indicate the measurement scale for each of the following variables:

Car speed

Cancer diagnosis

Pain level on a 5-point scale

Serum sodium level

4 References

Daniel, W. W. and Cross, C. L. (2013). Biostatistics: A Foundation for Analysis in the Health Sciences, Tenth edition. Wiley.
Heumann, C., Schomaker, M., and Shalabh (2022). Introduction to Statistics and Data Analysis: With Exercises, Solutions and Applications in R. Springer.
Lane, D. M. et al. (2019). Introduction to Statistics. Online Edition. Retrieved September 14, 2024, from https://openstax.org/details/introduction-statistics
