Robert Johnston

**Biostatistics** is a branch of statistics that is applied to other disciplines, mainly within biology and medicine.

Biology is an extensive field that studies the enormous variety of life forms that exist on Earth (viruses, animals, plants, and so on) from different points of view.

Biostatistics is a very useful tool in the study of these organisms. It covers the experimental design, the collection of the data needed to carry out the study, and the summary of the results obtained.

Thus, the data can be analyzed systematically, which leads to relevant and objective conclusions. Biostatistics also provides tools for representing the results graphically.

Biostatistics has a wide range of subspecialties, including molecular biology, genetics, agricultural studies, animal research (both in the field and in the laboratory), and clinical treatments in humans, among others.

Article index

- 1 History
- 1.1 James Bernoulli
- 1.2 Johann Carl Friedrich Gauss
- 1.3 Pierre Charles-Alexandre Louis
- 1.4 Francis Galton
- 1.5 Ronald Fisher

- 2 What does biostatistics study? (Field of study)
- 3 Applications
- 3.1 Health sciences
- 3.2 Biological sciences

- 4 Basic tests
- 4.1 Tests for one variable
- 4.2 Multivariate tests

- 5 Most Used Programs
- 5.1 SPSS
- 5.2 S-plus and Statistica
- 5.3 R

- 6 References

In the middle of the seventeenth century, modern statistical theory emerged with the introduction of probability theory and the theory of games of chance, developed by thinkers from France, Germany and England. Probability theory is considered the "backbone" of modern statistics.

Some of the most notable contributors to the field of biostatistics, and statistics in general, are listed below:

Bernoulli was an important Swiss scientist and mathematician of his time. He is credited with one of the first treatises on probability theory and with early work on the binomial distribution. His masterpiece, *Ars Conjectandi*, was published posthumously by his nephew in 1713.

Gauss is one of the most outstanding scientists in the history of statistics. He was a child prodigy and was already known in scientific circles while still a young student.

One of his most important contributions to science was *Disquisitiones Arithmeticae*, which Gauss completed at the age of 21 and published in 1801.

In this book, the German scientist sets out number theory and brings together the results of mathematicians such as Fermat, Euler, Lagrange and Legendre.

The first study of medicine that involved the use of statistical methods is attributed to the physician Pierre Charles-Alexandre Louis, a native of France. He applied the numerical method to studies related to tuberculosis, having a significant impact on medical students of the time.

The study motivated other doctors to use statistical methods in their research, which greatly enriched these disciplines, particularly those related to epidemiology.

Francis Galton made multiple contributions to science and is considered the founder of statistical biometrics. He was a cousin of the British naturalist Charles Darwin, and his studies applied his cousin's theories to society, in what came to be called social Darwinism.

Darwin's theories had a great impact on Galton, who felt the need to develop a statistical model that would explain the stability of the population.

Out of this concern, Galton developed the correlation and regression models, which are widely used today, as we will see later.

Ronald Fisher is often described as the father of modern statistics. The development and modernization of biostatistical techniques is credited to Fisher and his collaborators.

When Charles Darwin published *On the Origin of Species*, biology did not yet have a precise account of how characters are inherited.

Years later, with the rediscovery of the work of Gregor Mendel, a group of scientists developed the modern synthesis of evolution by merging both bodies of knowledge: the theory of evolution through natural selection and the laws of heredity.

Together with Fisher, Sewall G. Wright and J. B. S. Haldane developed the synthesis and established the principles of population genetics.

The synthesis brought with it a new legacy in biostatistics, and the techniques developed have been key in biology. Among them, sampling distributions, variance, analysis of variance and experimental design stand out. These techniques have a wide range of uses, from agriculture to genetics.

Biostatistics is a branch of statistics that focuses on the design and execution of scientific experiments carried out on living beings, on the acquisition and analysis of the data obtained from those experiments, and on the subsequent interpretation and presentation of the results of the analyses.

Since the biological sciences cover an extensive range of objects of study, biostatistics must be equally diverse, and it engages with the variety of topics through which biology studies, characterizes and analyzes life forms.

The applications of biostatistics are extremely varied. Applying statistical methods is an intrinsic step of the scientific method, so any researcher must use statistics to test their working hypotheses.

Biostatistics is used in the health area to produce results related to epidemics and nutritional studies, among others.

It is also used directly in medical studies and in the development of new treatments. Statistics make it possible to objectively discern whether a drug had positive, negative or neutral effects on the development of a specific disease.

For any biologist, statistics is an indispensable tool in research. With the exception of a few purely descriptive works, research in the biological sciences requires an interpretation of the results, for which it is necessary to apply statistical tests.

Statistics allows us to judge whether the differences we observe in biological systems could plausibly be due to chance, or whether they reflect significant differences that must be taken into account.
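One way to make the idea of "due to chance" concrete is a permutation test. The following is a minimal sketch, not a method prescribed by this article: the function name `permutation_test`, the number of resamples and the leaf-length data are all assumptions made for illustration. It shuffles the group labels many times and counts how often a difference in means at least as large as the observed one appears by chance alone.

```python
import random
from statistics import mean


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in means (illustrative)."""
    rng = random.Random(seed)          # fixed seed so the result is repeatable
    observed = abs(mean(a) - mean(b))  # difference actually seen in the data
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)            # reassign group labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (extreme + 1) / (n_resamples + 1)


# Hypothetical leaf lengths (cm) from two plant populations.
site_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4]
site_b = [4.2, 4.4, 4.8, 4.1, 4.6, 4.5]
p_value = permutation_test(site_a, site_b)
print(f"estimated p-value: {p_value:.4f}")
```

A small estimated p-value suggests that a difference this large would rarely arise from random labelling alone. The threshold used to call it "significant" is a convention chosen by the researcher.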

In the same way, it makes it possible to build models that predict the behavior of a variable, for example by applying correlations.
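As a small illustration of such a predictive model, the sketch below fits a straight line by ordinary least squares. It assumes that a linear relationship is a reasonable description of the data; the function name `fit_line` and the measurements are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: body mass (g) and metabolic rate (arbitrary units).
mass = [10, 20, 30, 40, 50]
rate = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(mass, rate)
predicted = intercept + slope * 35  # prediction for an unobserved mass of 35 g
print(f"slope={slope:.3f}, intercept={intercept:.3f}, predicted={predicted:.3f}")
```

Predictions of this kind are only as trustworthy as the assumed model, and they are most reliable within the range of the observed data.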

In biology, a series of tests that are frequently used in research can be identified. The choice of the appropriate test depends on the biological question to be answered and on certain characteristics of the data, such as their distribution or the homogeneity of the variances.

A simple test is the two-sample comparison, or Student's t test. It is widely used in medical publications and in health matters. It compares the means of two samples and is typically used when the sample sizes are small (fewer than 30). It assumes equal variances and a normal distribution. There are variants for paired and unpaired samples.
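The unpaired, equal-variance version of the test can be sketched as follows. This is an illustrative calculation of the t statistic only: the function name `pooled_t` and the data are assumptions, and the p-value would still have to be read from a t table or computed with a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Student's t statistic for two unpaired samples with equal variances."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical measurements from a treated and a control group.
treated = [2, 4, 6, 8, 10]
control = [1, 3, 5, 7, 9]

t_stat, df = pooled_t(treated, control)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The larger the absolute value of t for a given number of degrees of freedom, the less compatible the data are with equal population means.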

If the sample does not meet the assumption of normality, nonparametric tests are used instead. For the t test, the nonparametric alternative is a Wilcoxon rank test: the rank-sum test (also called the Mann-Whitney U test) for unpaired samples, and the signed-rank test for paired samples.
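Rank-based tests replace the measurements with their ranks, which is why they do not depend on normality. The sketch below computes the U statistic for two unpaired samples; the helper names `average_ranks` and `mann_whitney_u` and the data are hypothetical, and the significance of U would still have to be looked up or computed separately.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Smaller of the two U statistics for unpaired samples a and b."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


# Hypothetical enzyme activities in a control and a treated group.
control = [1.2, 1.9, 2.3]
treated = [2.8, 3.1, 3.5, 4.0]
u = mann_whitney_u(control, treated)
print(f"U = {u}")
```

A value of U close to zero means the two samples barely overlap once ranked.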

Analysis of variance (abbreviated as ANOVA) is also widely used and allows one to discern whether the means of several samples differ significantly from each other. Like Student's t test, it assumes equal variances and a normal distribution. The nonparametric alternative is the Kruskal-Wallis test.
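The logic of a one-way ANOVA is to compare the variation between group means with the variation within groups. A minimal sketch of the F statistic follows; the function name `one_way_anova_f` and the data are assumptions for illustration, and the p-value would come from an F table or a statistics package.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """F statistic and degrees of freedom for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of each observation around its own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical plant heights (cm) under three fertilizer treatments.
low = [1, 2, 3]
medium = [2, 3, 4]
high = [3, 4, 5]

f_stat, df_between, df_within = one_way_anova_f(low, medium, high)
print(f"F = {f_stat:.2f} on {df_between} and {df_within} degrees of freedom")
```

An F value near 1 indicates that the group means differ no more than would be expected from the variation within groups.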

If you want to establish the relationship between two variables, a correlation is applied. The parametric test is the Pearson correlation, and the nonparametric one is the Spearman rank correlation.
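The difference between the two coefficients can be seen in a short sketch: Spearman's coefficient is simply Pearson's coefficient computed on ranks. The function names and data below are hypothetical, chosen so that the relationship is perfectly monotonic but not linear.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient (strength of linear association)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical dose and response values: monotonic but curved.
dose = [1, 2, 3, 4, 5]
response = [1, 4, 9, 16, 25]

r = pearson_r(dose, response)
rho = spearman_rho(dose, response)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Because the response always increases with the dose, the rank correlation is perfect, while Pearson's coefficient is slightly lower because the points do not fall on a straight line.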

It is common to want to study more than two variables, so multivariate tests are very useful. These include regression studies, canonical correlation analysis, discriminant analysis, multivariate analysis of variance (MANOVA), logistic regression, principal component analysis, etc.

Biostatistics is an essential tool in the biological sciences. These analyses are carried out with specialized programs for the statistical analysis of data.

One of the most used worldwide, in the academic environment, is SPSS. Among its advantages is the handling of large amounts of data and the ability to recode variables.

S-plus is another widely used program that, like SPSS, can perform basic statistical tests on large amounts of data. Statistica is also widely used, and is characterized by its intuitive handling and the variety of graphics it offers.

Today, many biologists choose to perform their statistical analyses in R. This software is characterized by its versatility, as new packages with multiple functions are continually created. Unlike the previous programs, in R you often need to find the package that performs the test you want and install it, although many common tests are included in the base installation.

Although R may not seem very user-friendly at first, it provides a wide variety of useful tests and functions for biologists. In addition, there are packages (such as ggplot2) that allow the data to be visualized in a very professional way.

