Ray D’Alonzo, Ph.D., is Manager of Doctoral Recruiting & University
Relations and a former Associate Director of Research and Development at
Procter & Gamble Pharmaceuticals where he has worked for 30 years. He has
led research programs in bone metabolism, infectious disease, respiratory
disease, arthritis, and nutrition and has published scientific papers on a
wide variety of topics from the chemical composition of fats and oils to the
pharmacoeconomics of osteoporosis. Dr. D’Alonzo is the recipient of the
Chancellor’s Medal from the University of Massachusetts, Amherst, in part,
for his contributions to the development of new pharmaceutical agents. As
both a patient and scientist, he has made a personal effort to increase the
awareness of Chiari in the health care sector and to assist others afflicted
with the syndrome. He has published the story of his personal struggle with
Chiari in a book,
Contents Under Pressure, with 100% of royalties going towards Chiari
education, awareness, and research programs.
July 31, 2008 --
In the last newsletter, I asked readers to let me know if you wanted me to
write about data and what to look for in data when trying to determine if a
treatment is effective. Well, you asked for it so here it is.
When we become ill we search for a treatment that works. But, how do we know
that a treatment really works? This is not a trivial question. First, it is
important to recognize that even when a treatment is proven to work in most
people it may not work for a particular individual because each of us is
different. However, ignoring this fact, we generally take comfort in knowing
that a treatment is proven to be effective in most people. So, how can we be
reasonably confident that a treatment is effective in most people? The
answer lies in what is called evidence-based medicine. That is, there
should be solid evidence or data that a treatment works. Centuries ago,
medicine was more of an art than a science. Consequently, there remains much
“folklore” in the practice of medicine today. Many treatments accepted or
practiced over a long period of time but never scientifically studied are
taken for gospel truth. For example, chicken soup has never been proven an
effective treatment for the common cold. The same is true for causes of
disease. For decades doctors were convinced that stomach ulcers were caused
by stress. We now know that they are caused by the infectious organism
Helicobacter pylori. Closer to home, because of this folklore effect, many
neurologists still to this day believe that Chiari malformations do not
cause symptoms.
The best type of evidence is data produced by well-designed clinical
studies, and preferably by multiple well-designed clinical studies that reach
the same or similar conclusion, because reproducing results is very important.
Let’s begin by examining the definition of data. Turning to a popular
on-line dictionary, I found the following definitions of data.
1. factual information (as measurements or statistics) used as a basis for
reasoning, discussion, or calculation
2. information output by a sensing device or organ that includes both useful
and irrelevant or redundant information and must be processed to be
meaningful
3. information in numerical form that can be digitally transmitted or
processed
These definitions are fine but when it comes to proving something
scientifically or medically, there are other very important
characterizations of data that must be considered such as the source of the
data, the robustness of the experimental design that produced the data, the
integrity of the data, the quality of the analysis of the data and the
reproducibility of the data.
The first characteristic is “source”. Do the data have a legitimate source
or are they mythical? Does a valid reference exist for the data being used?
Is that reference available for review? Is the reference source legitimate?
The source of data is extremely important. In some instances, I have found
that no data exist to support commonly heralded cures and treatments. Most
often, however, data cited as evidence of effective treatments come from
medical, scientific, or technical journals. It is critical to understand
that not all journals are equal. Journals have standards for accepting
publications and some journals have very low standards. Quoting data from a
journal can actually be a liability if the journal has low or no standards
and/or is considered biased. A while back I wrote about Reiki as a treatment
for Chiari. When I researched Reiki, I found that most references were from
journals considered favorably biased towards alternative medical treatments.
I may have been able to make a stronger case for Reiki had I found even a
single reference in its favor in a mainstream peer-reviewed medical journal.
It is also important to locate and review the original reference source. One
should never depend on lay magazines, particularly if they do not reference
journals. The same can be true of information found on-line.
The second characteristic has to do with elements pertaining to the
“robustness of the experimental design” that produced the data. The main
question here is whether the design of the experiment that generated the
data was robust and valid. What must be considered in judging an
experimental design to be robust and valid? There are many elements. First,
before the experiment or study is conducted, a reasonable hypothesis must
exist. In other words, the investigator must state up front what the
objective of the study is and exactly what is being tested. This is called a
prospective study. When one collects data from a study and attempts to
analyze those data to answer a different question than the one prospectively
stated, it is called a retrospective study. Conclusions drawn from
prospective and retrospective studies do not carry the same weight. In
general, one cannot draw firm conclusions from retrospective studies. The real value of
retrospective studies is that they generate new hypotheses that can then be
tested prospectively. Let me point out here that the vast majority of
studies in the medical literature regarding the effectiveness of
decompression surgery are retrospective.
Another element which must be considered in the experimental design of a
study is control of variables. Basically, if you are trying to answer one
question or determine one unknown, all of the other variables must be
controlled or known. This is usually accomplished by using a control group.
A control is a group of test subjects or patients who are equivalent to the
subjects or patients receiving the treatment. Equivalent is the operative
word here. The control group must be the same size and must possess the same
characteristics. In other words, the control group and treatment group must
be balanced for variables like gender and age. There are
different ways one can design a controlled study. Sometimes the treatment
group can serve as its own control group by using a cross-over design. The
same group, for example, might take a placebo for a week, followed by a
wash-out period, and then be given a treatment for a week. Variations on
cross-over designs also exist.
Defining criteria up front for admitting subjects or patients into a study,
or excluding them from it, is another important way to control variables. For example,
other or concomitant diseases may often affect the outcome of the treatment
being studied so patients with certain diseases should be excluded. Entry
criteria are equally important for one to assure that patients entering a
study actually have the disease being investigated at the proper level of
severity. For example, suppose you want to conduct a study to determine the
effectiveness of decompression surgery for Chiari Type I malformations. One
thing you will need to determine in designing the study is if you want to
investigate Chiari patients who also have syringomyelia. This is a critical
question as patients with syringomyelia may potentially have different
outcomes versus those who do not. A criterion to exclude patients with
syringomyelia might therefore be established.
The selection of measures or end points is another critical element in
designing studies. Not all measures or end points are valid. You must be
able to show that whatever you are measuring can be measured accurately and
reliably and that it is relevant. This sounds like common sense, but readers
would be amazed at how often this is violated and how difficult it can be to
establish valid end points. An excellent example here is the definition of a
successful surgical outcome. I know countless post-surgery Chiari patients
who, when told by their surgeon that their surgery was successful, were
shoved into a state of disbelief because they felt so bad. The reason for
this is that the surgeon’s definition of success was very different from the
patient’s definition or expectation of success. A surgeon might define
success as 1) the ability of the patient to walk into his/her office and 2)
Cine MRI evidence of cerebrospinal fluid flow being re-established whereas
the patient might define success as simply feeling like they did before
symptoms emerged. There are ways to define surgical success that can take
into account the level of patient perceived wellness. Often considerable
work must be done to establish these definitions as valid instruments by
which outcome can be measured. Most of the retrospective decompression
studies that I referred to earlier did not use validated instruments for
determining patient outcome.
Eliminating bias is also a critical element in conducting any
study. If a patient knows that he or she is receiving treatment rather than
placebo, they may respond favorably due to psychological reasons. The same
goes for those conducting and analyzing the study. To avoid this, a study
must be blinded. In other words, the study must be conducted in a manner
where all involved in it are “blinded” as to which subjects or patients are
receiving treatment and which are receiving placebo. This can require
considerable effort. Readers may be amazed to learn that entire companies
exist to provide services and technologies to enable this.
Statistics must be carefully considered before conducting a study or
experiment. There is always the possibility that a result will happen by
sheer chance. Statistics can provide some confidence that this is not the
case. Using the principles of statistics, control and test groups can be
sized to minimize the likelihood of chance results. How large study groups must be
depends on multiple factors. The FDA generally requires two pivotal studies
to approve a drug. These are studies that are well designed and that meet
rigorous statistical standards. Studies that contain a small number of
subjects or patients are often termed pilot studies. The weight placed on
the outcome of a pilot study versus a pivotal study is vastly different.
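To give a rough feel for why group size matters, here is a minimal Python
sketch. It assumes, purely for illustration, that half of all patients would
improve on their own. The 50% figure, the group sizes, and the function name
chance_probability are hypothetical and do not come from any real study. The
sketch computes how likely it is that 70% or more of a group would improve
by chance alone.

```python
from math import comb


def chance_probability(n_patients: int, n_improved: int, placebo_rate: float) -> float:
    """Probability that at least n_improved of n_patients improve by chance,
    assuming each patient independently improves with probability placebo_rate."""
    return sum(
        comb(n_patients, k) * placebo_rate**k * (1 - placebo_rate) ** (n_patients - k)
        for k in range(n_improved, n_patients + 1)
    )


# Same observed improvement rate (70%) in a small and a large group,
# with a hypothetical 50% chance of improving without treatment.
small = chance_probability(10, 7, 0.5)     # 7 of 10 improve
large = chance_probability(100, 70, 0.5)   # 70 of 100 improve

print(f"Small group (10 patients): {small:.4f}")
print(f"Large group (100 patients): {large:.6f}")
```

Under these assumptions, seeing 7 of 10 patients improve would happen by
chance about 17% of the time, while seeing 70 of 100 improve by chance is far
rarer. The same headline result is much more convincing in the larger group.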
The third characteristic one must look for in determining if a treatment is
proven is the “integrity” of the data themselves. Are the data clean and
valid? This is critical and most often underappreciated or taken for
granted. Errors can be introduced into data in many different ways, from
malfunctioning or improperly calibrated equipment to inaccurate observations
to inaccurate recordings to inaccurate transcriptions to malfunctioning
computer programs. A few errors in the data can result in a totally
different conclusion. The amount of effort to assure that databases are
clean and valid can be enormous. In a well-designed pivotal study, a large
group of people working full time for months can be employed to simply
inspect and validate the data. Sophisticated computer programs and tools are
employed to assist in this effort. Sometimes just one data point being
different can make or break the statistical analysis on which the conclusion
is based. There is even something called meta-data involved. For example,
let’s say one is measuring a patient’s response to a medication that treats
pain. The patient’s indication of the level of pain is the primary data, but
the date, time and place of the measurement as well as the doctor or nurse
who made the measurement are also recorded, and this is referred to as
meta-data. Meta-data must also be reviewed. Sometimes meta-data provide
insight into the validity of the primary data. If meta-data don’t line up,
the primary data fall into question. For example, what do you do when the
meta-data indicate that the patient’s pain response was collected on a date
and time before the patient was even given medication? Meta-data can be
extremely useful in detecting many kinds of data errors including fraud.
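The kind of meta-data check described above can be sketched in a few lines of
Python. This is only an illustration: the records, the field names
(patient_id, dosed_at, measured_at, pain_score), and the function name
find_timing_errors are all made up for the example and do not reflect any
particular clinical data system.

```python
from datetime import datetime


def find_timing_errors(records):
    """Return the IDs of records whose pain measurement is time-stamped
    before the medication was given."""
    return [r["patient_id"] for r in records if r["measured_at"] < r["dosed_at"]]


# Hypothetical records: primary data (pain_score) plus meta-data (time stamps).
records = [
    {"patient_id": "P01", "pain_score": 3,
     "dosed_at": datetime(2008, 7, 1, 9, 0),
     "measured_at": datetime(2008, 7, 1, 10, 0)},
    {"patient_id": "P02", "pain_score": 2,
     "dosed_at": datetime(2008, 7, 1, 9, 0),
     "measured_at": datetime(2008, 6, 30, 16, 0)},  # measured before dosing
]

suspect = find_timing_errors(records)
print("Records to review:", suspect)
```

Here the second record would be flagged for review, because its pain score
could not have been a response to a medication the patient had not yet
received.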
The fourth characteristic to consider in judging the effectiveness of a
treatment is the extent and “quality of the analyses” of the data. Only
after the data collected have been determined to be clean and valid can they
be analyzed. (This is known in the community of science as locking or
freezing the database.) So, the next question is: are the analyses complete
and thorough? I talked a little about statistics above but I did not talk
about statistical methods of analysis. There are different ways to perform
statistical analyses. It is important to understand that the method of
statistical analysis used must be stated and documented before the analysis
is actually performed. This prevents the introduction of bias. The a priori
analysis plan must be thorough. It should include primary and secondary
endpoints (measures supportive of the primary measures) on the total
population of the study as well as important and relevant subgroups (groups
based on characteristics like sex, age, disease severity, etc.). There is an
important concept known as intent to treat. Often in a study, subjects or
patients drop out for all sorts of reasons. Some move a great distance away.
Some cannot tolerate the treatment. Some even have severe accidents or die.
As a result, at the conclusion of a study, there are two groups, the drop
outs and the completers. The analyses must look at both groups as well as
the combined group. The analysis that looks at both groups is called the
intent to treat analysis. The results on the drop outs cannot be dismissed
just because they didn’t complete the study. Dismissing such results, albeit
incomplete in nature, can bias the outcome and conclusion of the study.
Also, keep in mind that a proper study design will estimate up front the
expected number of patients that will drop out. There are many important
questions to be addressed. Are unexpected results understood and do they
have a plausible explanation? Are the weaknesses in the experimental design
and their influence on the outcome understood and weighed appropriately in
making conclusions? What is the meaning of a conclusion if a different
statistical method of analysis is used and produces a different result?
Performing analyses and drawing conclusions can be difficult,
particularly on difficult problems or questions or where the problem or
disease being investigated is new or poorly understood. In such cases,
valid conclusions cannot be drawn from a single study, regardless of how
well it was designed and conducted, and additional studies
are needed.
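The point made above about drop outs can be illustrated with a small Python
sketch. The ten patients below are invented, and the sketch assumes one
simple convention for illustration only: a patient who dropped out is counted
as not improved. Real studies may handle missing results differently. The
function name response_rate is likewise hypothetical.

```python
def response_rate(patients, completers_only=False):
    """Fraction of patients who improved. Drop outs are counted as
    not improved (an assumption made for this illustration)."""
    group = [p for p in patients if p["completed"]] if completers_only else patients
    improved = sum(1 for p in group if p["completed"] and p["improved"])
    return improved / len(group)


# Hypothetical treatment group: 6 completers (5 improved) and 4 drop outs.
patients = (
    [{"completed": True, "improved": True}] * 5
    + [{"completed": True, "improved": False}]
    + [{"completed": False, "improved": False}] * 4
)

completers_rate = response_rate(patients, completers_only=True)
all_enrolled_rate = response_rate(patients)

print(f"Completers only: {completers_rate:.0%}")
print(f"Everyone enrolled: {all_enrolled_rate:.0%}")
```

With these made-up numbers, the treatment looks effective in 5 of 6
completers but in only 5 of the 10 patients who started the study. Ignoring
the drop outs would make the treatment look considerably better than the
full picture supports.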
The fifth characteristic which must be considered is “reproducibility”. Take
note that when I discussed statistics I said there is always the possibility
of obtaining a particular result by pure chance – always. Even when a study
is well designed using good statistical principles and methods, the results
obtained could be a reflection of lady luck. For this reason, it is
important to show that the results obtained can be reproduced. Reproducing
results by different investigators adds a great deal of credibility to the
conclusions. Some readers may be familiar with cold fusion. About 20 years
ago, a couple of electrochemists claimed to produce a nuclear reaction in a
small table top vessel. It was hailed as the ultimate solution to our energy
problems. One of the problems was that many other independent investigators
could not reproduce their results. At first, there was a lot of controversy
with one camp claiming that it worked and another claiming it was
impossible. Today most scientists agree that cold fusion cannot be
achieved. The inability to reproduce the results at the beginning of the
controversy was an early warning sign that turned out to be correct.
Much has been discussed in this article and much more has not. Determining
if a treatment really works is a very complex task that often takes
considerable research and audit skills. I hope my readers have gotten some
flavor of this. The next time you watch an infomercial for some supplement,
device, or diet claiming that it is proven to work, keep the principles
discussed above in mind, roll your eyes and change the channel. But in a
serious vein, I hope this helps in sorting out what may or may not help when
discussing treatment options with caretakers.
A reader recently suggested that I write about neuroplasticity and its
potential for improving recovery. I hope to research it over the coming
months and address it in a future newsletter. Please keep your suggestions
coming.
-- Ray D'Alonzo
** If you
would like to share your comments, thoughts, or ideas with Ray,
please send them to dalonzo.rp@fuse.net.
Due to the volume and nature of email received, individual responses are not
possible. **
[Ed. Note: The opinions expressed above are solely those of the
author. They do not represent the opinions of the editor, publisher,
or this publication. Mr. D'Alonzo is not a medical doctor and does not
give medical advice. Anyone with a medical problem is strongly
encouraged to seek professional medical care.]