Skip to Main Content

During the design phase of a study, the investigator must decide which type of data will be collected and analyzed. Large amounts of data will be collected, and not all of the data can or will be reported in the published account. The investigator must decide how to most appropriately summarize and present the large volume of data collected to give readers an accurate understanding of the study results.

It is important to be able to differentiate between various types of data. The type of data collected will determine how the data are presented, described, and analyzed.1,2 Just as there are multiple types of data, there are various methods of describing and summarizing data. These methods are referred to as descriptive methods. After the data are described and summarized, the data are analyzed statistically using inferential methods. Whereas descriptive methods simply describe or summarize data, inferential methods are used to make inferences about populations. This chapter discusses the various types of data, explains the most common means with which data are described and summarized, and introduces inferential methods.

There are two types of data: discrete and continuous. It is important to be able to differentiate between the two because the type of data that is collected determines how the data are reported and analyzed. Discrete data fit into a limited number of categories and are also referred to as count data. Each subject fits into only one discrete category.

The simplest form of discrete data is a dichotomous variable. Dichotomous variables have only two categories per variable. An example of a dichotomous variable is gender. Each subject in a study is either male or female. There are only two options or categories possible for this dichotomous variable. Other examples of dichotomous variables include questions that would be answered with a yes or a no. For example, an investigator may wish to solicit information from subjects about whether they have any significant past medical history. Subjects would be given a list of several medical conditions (e.g., hypertension, hypercholesterolemia, myocardial infarction) and asked to answer yes or no to having been diagnosed with each condition.

If the discrete data have more than two possible values or categories, then they are referred to as multichotomous. Race is an example of a multichotomous discrete variable. Race can be divided into many possible categories: black, white, Asian, Hispanic, and so forth.

Another type of discrete variable is an ordinal variable. Ordinal variables have a limited number of values or categories, but the categories are in a logical order or progression. An example of an ordinal variable is degree of inflammation. When data on degree of inflammation are collected, it is typically measured in +1, +2, or +3 categories. In this case, +1 inflammation is the mildest form, and +3 is the most severe form of inflammation. Similar scales are often used to measure pain. ...

Pop-up div Successfully Displayed

This div only appears when the trigger link is hovered over. Otherwise it is hidden from view.