In statistics, stratified randomization is a method of sampling which first stratifies the whole study population into subgroups with the same attributes or characteristics, known as strata, and then draws a simple random sample from each stratum, so that every element within the same subgroup is selected without bias at every stage of the sampling process, randomly and entirely by chance.[1] Stratified randomization is considered a subdivision of stratified sampling, and should be adopted when attributes are shared only within parts of the investigated population and vary widely between subgroups, so that the subgroups require special consideration or clear distinction during sampling. This sampling method should be distinguished from cluster sampling, where a simple random sample of several entire clusters is selected to represent the whole population, and from stratified systematic sampling, where systematic sampling is carried out after the stratification process.
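The two stages described above (stratify, then draw a simple random sample within each stratum) can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name `stratified_sample`, the `region` attribute, the 10% sampling fraction and the rule of taking at least one unit per stratum are all assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_sample(population, key, fraction, seed=0):
    """Draw a simple random sample of the given fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[key(unit)].append(unit)  # group units by the stratifying attribute
    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Assumed rule: proportional allocation, with at least one unit per stratum.
        n = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, n))  # simple random sampling without replacement
    return sample

# Hypothetical population: 80 urban and 20 rural respondents.
population = [{"id": i, "region": "urban" if i < 80 else "rural"} for i in range(100)]
sample = stratified_sample(population, key=lambda u: u["region"], fraction=0.1)
```

With these inputs the sample contains 8 urban and 2 rural units, so each stratum is represented in proportion to its size, which simple random sampling of 10 units from the whole population would not guarantee.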
Stratified randomization is extremely useful when the target population is heterogeneous, and it effectively displays how the trends or characteristics under study differ between strata. When performing a stratified randomization, the following 8 steps should be taken:[2]
Stratified randomization may also refer to the random assignment of treatments to subjects, in addition to the random sampling of subjects from a population described above. In this context, stratified randomization uses one or more prognostic factors to form subgroups that have, on average, similar entry characteristics. The patient factors can be chosen accurately by examining the outcomes of previous studies.[3]
The number of subgroups can be calculated by multiplying together the number of levels of each factor. Factors are measured before or at the time of randomization, and experimental subjects are divided into several subgroups, or strata, according to the results of these measurements.[4]
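The multiplication rule can be checked with a few lines of Python. The factor names and level labels below are illustrative assumptions (two prognostic factors with two levels each), not values taken from a particular trial.

```python
from itertools import product
from math import prod

# Hypothetical prognostic factors and their levels.
factors = {
    "age": ["<50", ">=50"],
    "nodal_status": ["negative", "positive"],
}

# Number of strata = product of the number of levels of each factor.
n_strata = prod(len(levels) for levels in factors.values())

# Enumerate every combination of levels; each combination is one stratum.
strata = list(product(*factors.values()))
```

Here `n_strata` is 2 × 2 = 4. Adding a third factor with three levels would raise it to 12, which shows how quickly the number of strata grows as factors are added.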
Within each stratum, several randomization strategies can be applied, including simple randomization, blocked randomization, and minimization.
Simple randomization is considered the easiest method for allocating subjects in each stratum. Each subject is assigned to a group purely at random, independently of every other assignment. Although it is easy to conduct, simple randomization is commonly applied only in strata that contain more than 100 samples, since with a small sample size the groups are likely to end up unequal in size.
Block randomization, sometimes called permuted block randomization, uses blocks to allocate subjects from the same stratum equally to each group in the study. In block randomization, the allocation ratio (the ratio of the number of subjects in one specific group to the number in the other groups) and the group sizes are specified. The block size must be a multiple of the number of treatments so that samples in each stratum can be assigned to treatment groups in the intended ratio. For instance, a clinical trial concerning breast cancer in which age and nodal status are the two prognostic factors, each split into two levels, would have 2 × 2 = 4 strata. The blocks can be assigned to samples in several ways, including a random list or a computer program.[5]
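A minimal Python sketch of permuted block randomization within strata follows. It assumes two treatments labelled "A" and "B", a 1:1 allocation ratio and a block size of 4; the function name `permuted_block_schedule` and the stratum labels are hypothetical.

```python
import random

def permuted_block_schedule(n_subjects, treatments=("A", "B"), block_size=4, seed=0):
    """Build an allocation list from randomly permuted, balanced blocks (1:1 ratio assumed)."""
    if block_size % len(treatments) != 0:
        raise ValueError("block size must be a multiple of the number of treatments")
    rng = random.Random(seed)
    per_block = block_size // len(treatments)
    schedule = []
    while len(schedule) < n_subjects:
        block = list(treatments) * per_block  # equal number of each treatment in the block
        rng.shuffle(block)                    # random permutation within the block
        schedule.extend(block)
    return schedule[:n_subjects]

# One independent schedule per stratum (illustrative age / nodal status strata).
strata = ["<50/node-negative", "<50/node-positive", ">=50/node-negative", ">=50/node-positive"]
schedules = {s: permuted_block_schedule(8, seed=i) for i, s in enumerate(strata)}
```

Because every completed block contains each treatment equally often, the groups within a stratum can never differ by more than half a block. The same property is what makes the next assignment guessable when block sizes are small and assignments are not concealed.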
Block randomization is commonly used in experiments with a relatively large sample size to avoid imbalanced allocation of samples with important characteristics. In fields with strict randomization requirements, such as clinical trials, the allocation becomes predictable when the investigators are not blinded and the block size is small. Permuted block randomization within strata can also cause imbalance among strata as the number of strata increases and the sample size is limited. For instance, it is possible that no sample meets the characteristics of a certain stratum.[6]
To keep the treatment groups similar, the "minimization" method can be used, which is more direct than random permuted blocks within strata. In the minimization method, each sample is assigned to a treatment group based on the number of samples already in each treatment group that share its prognostic factor levels, summed over the factors, which keeps the number of subjects balanced among the groups. If the sums for several treatment groups are the same, simple randomization is used to assign the treatment. In practice, the minimization method requires keeping a daily record of treatment assignments by prognostic factor, which can be done effectively with a set of index cards. The minimization method effectively avoids imbalance among groups but involves less randomness than block randomization, because a random choice is made only when the treatment sums are equal. A feasible solution is to apply an additional random list that gives the treatment group with the smaller sum of marginal totals a higher chance of assignment (e.g. ¾) and the other treatments a lower chance (e.g. ¼).[7]
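The minimization rule with a weighted random element can be sketched as follows. This is one possible reading of the method, under stated assumptions: two treatments, the marginal total of a treatment taken as the sum over the new subject's factor levels of the subjects already assigned to it, and a ¾ probability for the favoured treatment. The function name `minimization_assign` and the factor labels are hypothetical.

```python
import random
from collections import defaultdict

def minimization_assign(subject, counts, treatments=("A", "B"), p_favoured=0.75, rng=random):
    """Assign one subject and update the running counts (the 'index cards')."""
    # Marginal total per treatment: earlier subjects who share each of this
    # subject's factor levels and received that treatment, summed over factors.
    totals = {t: sum(counts[(factor, level, t)] for factor, level in subject.items())
              for t in treatments}
    lowest = min(totals.values())
    favoured = [t for t in treatments if totals[t] == lowest]
    if len(favoured) == len(treatments):
        choice = rng.choice(treatments)          # all sums equal: simple randomization
    elif rng.random() < p_favoured:
        choice = rng.choice(favoured)            # favour the smaller marginal total
    else:
        choice = rng.choice([t for t in treatments if t not in favoured])
    for factor, level in subject.items():
        counts[(factor, level, choice)] += 1     # record the assignment
    return choice

rng = random.Random(1)
counts = defaultdict(int)
# Hypothetical subjects with two prognostic factors.
subjects = [{"age": rng.choice(["<50", ">=50"]), "nodes": rng.choice(["neg", "pos"])}
            for _ in range(40)]
assignments = [minimization_assign(s, counts, rng=rng) for s in subjects]
```

Setting `p_favoured=1.0` gives the purely deterministic variant, in which chance enters only through ties; values below 1 trade a little balance for less predictable assignments.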
Stratified random sampling is useful and productive in situations that require different weightings for specific strata. In this way, researchers can adjust the selection mechanism for each stratum to amplify or minimize the desired characteristics in the survey result.[8]
Stratified randomization is helpful when researchers intend to look for associations between two or more strata, as simple random sampling carries a larger chance of unequal representation of the target groups. It is also useful when researchers wish to eliminate confounders in observational studies, as stratified random sampling allows adjustment of covariances and p-values for more accurate results.[9]
Stratified random sampling also offers a higher level of statistical accuracy than simple random sampling, because the elements chosen are highly relevant to the population they represent. The differences within strata are much smaller than those between strata. Hence, as the within-stratum differences are minimized, the standard deviation is reduced, resulting in a higher degree of accuracy and a smaller error in the final results. This effectively reduces the sample size needed and increases the cost-effectiveness of sampling when research funding is tight.
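The gain in accuracy can be made explicit with a standard variance decomposition. The notation is introduced here for illustration and does not come from the cited sources: W_h is the population share of stratum h, μ_h and σ_h² are its mean and variance, μ and σ² are the overall mean and variance, and n is the total sample size. The sketch assumes proportional allocation (n_h = n W_h) and ignores the finite population correction.

```latex
\begin{align}
\operatorname{Var}\left(\bar{y}_{\mathrm{SRS}}\right)
  &= \frac{\sigma^2}{n}
   = \frac{1}{n}\left[\sum_h W_h \sigma_h^2 + \sum_h W_h \left(\mu_h - \mu\right)^2\right], \\
\operatorname{Var}\left(\bar{y}_{\mathrm{st}}\right)
  &= \sum_h W_h^2 \frac{\sigma_h^2}{n_h}
   = \frac{1}{n}\sum_h W_h \sigma_h^2 .
\end{align}
```

Under these assumptions the stratified estimator drops the between-strata term, so its variance is never larger than that of simple random sampling, and the gain is greatest when the stratum means differ widely while the strata themselves are homogeneous.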
In real life, stratified random sampling can be applied to election polling, investigations into income disparities among social groups, or measurements of educational opportunities across nations.
In clinical trials, patients are stratified according to their social and individual backgrounds, or any other factors that are relevant to the study, so that each of these groups is matched within the entire patient population. The aim is to balance clinical and prognostic factors, as the trial would not produce valid results if the study design were not balanced.[10] Stratified randomization is an extremely important step in ensuring that no bias, deliberate or accidental, affects the representative nature of the patient sample under study.[11] It increases study power, especially in small clinical trials (n < 400), as the known clinical traits used for stratification are thought to affect the outcomes of the interventions.[12] It helps prevent type I errors, which is valued highly in clinical studies.[13] It also has an important effect on sample size for active control equivalence trials and, in theory, facilitates subgroup analysis and interim analysis.
The advantages of stratified randomization include:
The limits of stratified randomization include: