Interesting articles, books, quotes, or websites added to this site for 2010 (created 2010-01-14)
This page is moving to a new website.
Peter Congdon. Bayesian Statistical Modelling. 2nd ed. Wiley; 2007. Description: A fairly technical book, but what book about Bayesian methods is not? The first chapter provides a detailed explanation of Markov Chain Monte Carlo. The remaining chapters provide some very sophisticated examples. Excerpt: "Bayesian methods combine the evidence from the data at hand with previous quantitative knowledge to analyse practical problems in a wide range of areas. The calculations were previously complex, but it is now possible to routinely apply Bayesian methods due to advances in computing technology and the use of new sampling methods for estimating parameters. Such developments together with the availability of freeware such as WINBUGS and R have facilitated a rapid growth in the use of Bayesian methods, allowing their application in many scientific disciplines, including applied statistics, public health research, medical science, the social sciences and economics. Following the success of the first edition, this reworked and updated book provides an accessible approach to Bayesian computing and analysis, with an emphasis on the principles of prior selection, identification and the interpretation of real data sets. The second edition: * Provides an integrated presentation of theory, examples, applications and computer algorithms. * Discusses the role of Markov Chain Monte Carlo methods in computing and estimation. * Includes a wide range of interdisciplinary applications, and a large selection of worked examples from the health and social sciences. * Features a comprehensive range of methodologies and modelling techniques, and examines model fitting in practice using Bayesian principles. * Provides exercises designed to help reinforce the reader's knowledge and a supplementary website containing data sets and relevant programs.
Bayesian Statistical Modelling is ideal for researchers in applied statistics, medical science, public health and the social sciences, who will benefit greatly from the examples and applications featured. The book will also appeal to graduate students of applied statistics, data analysis and Bayesian methods, and will provide a great source of reference for both researchers and students."
Peter D. Congdon. Applied Bayesian Hierarchical Methods. Chapman and Hall/CRC; 2010. Excerpt: "The use of Markov chain Monte Carlo (MCMC) methods for estimating hierarchical models involves complex data structures and is often described as a revolutionary development. An intermediate-level treatment of Bayesian hierarchical models and their applications, Applied Bayesian Hierarchical Methods demonstrates the advantages of a Bayesian approach to data sets involving inferences for collections of related units or variables and in methods where parameters can be treated as random collections. Emphasizing computational issues, the book provides examples of the following application settings: meta-analysis, data structured in space or time, multilevel and longitudinal data, multivariate data, nonlinear regression, and survival time data. For the worked examples, the text mainly employs the WinBUGS package, allowing readers to explore alternative likelihood assumptions, regression structures, and assumptions on prior densities. It also incorporates BayesX code, which is particularly useful in nonlinear regression. To demonstrate MCMC sampling from first principles, the author includes worked examples using the R package. Through illustrative data analysis and attention to statistical computing, this book focuses on the practical implementation of Bayesian hierarchical methods. It also discusses several issues that arise when applying Bayesian techniques in hierarchical and random effects models."
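The "MCMC sampling from first principles" that Congdon demonstrates in R can be sketched in a few lines of Python. This is a minimal random-walk Metropolis sampler targeting a standard normal so the answer is checkable; it is not code from the book, and the step size and chain length are invented illustration choices.

```python
import math
import random

random.seed(0)

def log_target(x):
    return -0.5 * x * x          # log density of N(0, 1), up to a constant

x, chain = 0.0, []
for i in range(20000):
    proposal = x + random.gauss(0, 1.0)          # symmetric random-walk step
    # Accept with probability min(1, target(proposal) / target(x)),
    # computed on the log scale to avoid underflow.
    if math.log(random.random()) < log_target(proposal) - log_target(x):
        x = proposal
    chain.append(x)

kept = chain[2000:]              # discard burn-in
mean = sum(kept) / len(kept)
var = sum((v - mean) ** 2 for v in kept) / len(kept)
print("posterior mean ~", round(mean, 2), "variance ~", round(var, 2))
```

The sampled mean and variance should be close to 0 and 1, the moments of the target; the same loop works for any posterior known only up to a normalizing constant.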
Raymond J. Carroll, David Ruppert. Transformation and Weighting in Regression. 1st ed. Chapman and Hall/CRC; 1988. Description: This is a bit dated, but it has some interesting ideas, like transforming both sides of the equation to fix heteroscedasticity while still maintaining linearity. Excerpt: "This monograph provides a careful review of the major statistical techniques used to analyze regression data with nonconstant variability and skewness. The authors have developed statistical techniques--such as formal fitting methods and less formal graphical techniques--that can be applied to many problems across a range of disciplines, including pharmacokinetics, econometrics, biochemical assays, and fisheries research. While the main focus of the book is on data transformation and weighting, it also draws upon ideas from diverse fields such as influence diagnostics, robustness, bootstrapping, nonparametric data smoothing, quasi-likelihood methods, errors-in-variables, and random coefficients. The authors discuss the computation of estimates and give numerous examples using real data. The book also includes an extensive treatment of estimating variance functions in regression."
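The transform-both-sides idea can be illustrated with a toy Python sketch (not code from the book): a power model with multiplicative error is heteroscedastic on the raw scale, but logging both sides turns it into an ordinary linear regression with roughly constant variance. The values of a and b are invented.

```python
import math
import random

random.seed(1)

# Toy model: y = a * x**b with multiplicative error, so the spread of y
# grows with its mean (heteroscedasticity on the raw scale).
a, b = 2.0, 1.5
x = [0.5 + 0.1 * i for i in range(50)]
y = [a * xi**b * math.exp(random.gauss(0, 0.1)) for xi in x]

# Transform BOTH sides: log y = log a + b * log x + error, which is linear.
lx = [math.log(xi) for xi in x]
ly = [math.log(yi) for yi in y]

# Closed-form simple linear regression on the log scale.
n = len(lx)
mx, my = sum(lx) / n, sum(ly) / n
slope = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
         / sum((u - mx) ** 2 for u in lx))
intercept = my - slope * mx

print("estimated b:", round(slope, 2))              # near 1.5
print("estimated a:", round(math.exp(intercept), 2))  # near 2.0
```

The key point is that the same transformation is applied to the model prediction as to the response, so the interpretation of the parameters is preserved.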
October 2010
Mubashir Arain, Michael Campbell, Cindy Cooper, Gillian Lancaster. What
is a pilot or feasibility study? A review of current practice and editorial
policy. BMC Medical Research Methodology. 2010;10(1):67. Abstract:
"BACKGROUND: In 2004, a review of pilot studies published in seven major
medical journals during 2000-01 recommended that the statistical analysis of
such studies should be either mainly descriptive or focus on sample size
estimation, while results from hypothesis testing must be interpreted with
caution. We revisited these journals to see whether the subsequent
recommendations have changed the practice of reporting pilot studies. We also
conducted a survey to identify the methodological components in registered
research studies which are described as 'pilot' or 'feasibility' studies. We
extended this survey to grant-awarding bodies and editors of medical journals
to discover their policies regarding the function and reporting of pilot
studies. METHODS: Papers from 2007-08 in seven medical journals were screened
to retrieve published pilot studies. Reports of registered and completed
studies on the UK Clinical Research Network (UKCRN) Portfolio database were
retrieved and scrutinized. Guidance on the conduct and reporting of pilot
studies was retrieved from the websites of three grant giving bodies and seven
journal editors were canvassed. RESULTS: 54 pilot or feasibility studies
published in 2007-8 were found, of which 26 (48%) were pilot studies of
interventions and the remainder feasibility studies. The majority incorporated
hypothesis-testing (81%), a control arm (69%) and a randomization procedure
(62%). Most (81%) pointed towards the need for further research. Only 8 out of
90 pilot studies identified by the earlier review led to subsequent main
studies. Twelve studies which were interventional pilot/feasibility studies
and which included testing of some component of the research process were
identified through the UKCRN Portfolio database. There was no clear
distinction in use of the terms 'pilot' and 'feasibility'. Five journal
editors replied to our entreaty. In general they were loath to publish
studies described as 'pilot'. CONCLUSION: Pilot studies are still poorly
reported, with inappropriate emphasis on hypothesis-testing. Authors should be
aware of the different requirements of pilot studies, feasibility studies and
main studies and report them appropriately. Authors should be explicit as to
the purpose of a pilot study. The definitions of feasibility and pilot studies
vary and we make proposals here to clarify terminology." [Accessed October
25, 2010]. Available at:
http://www.biomedcentral.com/1471-2288/10/67.
September 2010
R B D'Agostino. Propensity score methods for
bias reduction in the comparison of a treatment to a non-randomized control
group. Stat Med. 1998;17(19):2265-2281. Abstract: "In observational
studies, investigators have no control over the treatment assignment. The
treated and non-treated (that is, control) groups may have large differences
on their observed covariates, and these differences can lead to biased
estimates of treatment effects. Even traditional covariance analysis
adjustments may be inadequate to eliminate this bias. The propensity score,
defined as the conditional probability of being treated given the covariates,
can be used to balance the covariates in the two groups, and therefore reduce
this bias. In order to estimate the propensity score, one must model the
distribution of the treatment indicator variable given the observed
covariates. Once estimated the propensity score can be used to reduce bias
through matching, stratification (subclassification), regression adjustment,
or some combination of all three. In this tutorial we discuss the uses of
propensity score methods for bias reduction, give references to the literature
and illustrate the uses through applied examples." [Accessed October 11,
2010]. Available at:
http://www.ncbi.nlm.nih.gov/pubmed/9802183.
M M Joffe, P R Rosenbaum. Invited commentary:
propensity scores. Am. J. Epidemiol. 1999;150(4):327-333. Abstract:
"The propensity score is the conditional probability of exposure to a
treatment given observed covariates. In a cohort study, matching or
stratifying treated and control subjects on a single variable, the propensity
score, tends to balance all of the observed covariates; however, unlike random
assignment of treatments, the propensity score may not also balance unobserved
covariates. The authors review the uses and limitations of propensity scores
and provide a brief outline of associated statistical theory. They also
present a new result of using propensity scores in case-cohort studies."
[Accessed October 11, 2010]. Available at:
http://stat.wharton.upenn.edu/~rosenbap/AJEpropen.pdf.
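The stratification approach described in these two articles can be sketched in Python with an invented toy data set. With a single discrete covariate, the propensity score can be estimated directly as the within-stratum proportion treated; with many covariates one would model it (e.g. by logistic regression) instead.

```python
from collections import defaultdict

# Hypothetical toy data: each record is (covariate stratum, treated?, outcome).
# Treatment is more common in the "high" stratum, which also has better
# outcomes, so a naive comparison of treated versus control is confounded.
data = [
    *[("low", 1, 5.0)] * 2, *[("low", 0, 4.0)] * 8,
    *[("high", 1, 9.0)] * 8, *[("high", 0, 8.0)] * 2,
]

# Naive (confounded) comparison of overall means.
naive = (sum(y for _, t, y in data if t) / sum(1 for _, t, _ in data if t)
         - sum(y for _, t, y in data if not t) / sum(1 for _, t, _ in data if not t))
print("naive difference:", round(naive, 2))

# Within each stratum, treated and controls share the same propensity score,
# so the within-stratum difference in means is free of this confounding.
by_stratum = defaultdict(list)
for x, t, y in data:
    by_stratum[x].append((t, y))

effects, weights = [], []
for x, rows in by_stratum.items():
    treated = [y for t, y in rows if t == 1]
    control = [y for t, y in rows if t == 0]
    p_score = len(treated) / len(rows)   # estimated P(treated | stratum)
    effects.append(sum(treated) / len(treated) - sum(control) / len(control))
    weights.append(len(rows))
    print(f"stratum {x}: propensity={p_score:.2f}, effect={effects[-1]:.2f}")

# Combine stratum-specific effects, weighting by stratum size.
ate = sum(e * w for e, w in zip(effects, weights)) / sum(weights)
print("stratified effect estimate:", round(ate, 2))
```

Here the naive difference (3.4) greatly overstates the stratified estimate (1.0), which is the bias reduction the propensity score is designed to achieve. As Joffe and Rosenbaum note, this only balances the observed covariates.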
Kevin L. Delucchi. Sample Size Estimation in Research With Dependent
Measures and Dichotomous Outcomes. Am J Public Health. 2004;94(3):372-377.
Abstract: "I reviewed sample estimation methods for research designs
involving nonindependent data and a dichotomous response variable to examine
the importance of proper sample size estimation and the need to align methods
of sample size estimation with planned methods of statistical analysis.
Examples and references to published literature are provided in this article.
When the method of sample size estimation is not in concert with the method of
planned analysis, poor estimates may result. The effects of multiple measures
over time also need to be considered. Proper sample size estimation is often
overlooked. Alignment of the sample size estimation method with the planned
analysis method, especially in studies involving nonindependent data, will
produce appropriate estimates." Available at:
http://ajph.aphapublications.org/cgi/content/full/94/3/372.
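One common way to align sample size estimation with a planned analysis of non-independent dichotomous data (not necessarily the specific method Delucchi reviews) is to inflate the classical two-proportion sample size by a design effect, DEFF = 1 + (m - 1) * ICC, where m is the cluster size and ICC the intracluster correlation. A Python sketch with invented proportions, cluster size, and ICC:

```python
import math

def n_per_arm_independent(p1, p2):
    """Classical two-proportion sample size per arm (normal approximation),
    for two-sided alpha = 0.05 and power = 0.80."""
    z_a = 1.959964   # z for two-sided alpha = 0.05
    z_b = 0.841621   # z for power = 0.80
    pbar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * pbar * (1 - pbar)) +
           z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return num / (p1 - p2) ** 2

p1, p2 = 0.30, 0.20          # hypothesized event proportions (invented)
m, icc = 20, 0.05            # cluster size and intracluster correlation

n_indep = n_per_arm_independent(p1, p2)
deff = 1 + (m - 1) * icc     # inflation for within-cluster correlation
n_clustered = n_indep * deff

print("independent n per arm:", math.ceil(n_indep))
print("design effect:", deff)
print("clustered n per arm:", math.ceil(n_clustered))
```

Even a modest ICC of 0.05 nearly doubles the required sample size here, which is exactly the kind of mismatch that arises when the estimation method ignores the planned analysis of dependent data.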
Patrick Vandewalle, Jelena Kovacevic, Martin Vetterli. Reproducible
Research. Excerpt: "Welcome on this site about reproducible research.
This site is intended to gather a lot of information and useful links about
reproducible research. As the authors (Patrick Vandewalle, Jelena Kovacevic
and Martin Vetterli) are all doing research in signal/image processing, that
will also be the main focus of this site." [Accessed October 5, 2010].
Available at:
http://reproducibleresearch.net.
A Caveman. The invited review - or, my field, from my standpoint,
written by me using only my data and my ideas, and citing only my publications.
J Cell Sci. 2000;113(18):3125-3126. Comment: The title is better than any
summary I could write. [Accessed September 27, 2010]. Available at:
http://jcs.biologists.org/cgi/content/abstract/113/18/3125.
Susan A. Peters. Engaging with the Art and Science of Statistics.
Mathematics Teacher. 2010;103(7):496. Abstract: "Statistics uses scientific
tools but also requires the art of flexible and creative reasoning."
[Accessed September 24, 2010]. Available at:
http://www.nctm.org/eresources/view_media.asp?article_id=9145.
Gillian D. Sanders, Lurdes Inoue, Gregory Samsa, Shalini Kulasingam, David
Matchar. Use of Bayesian Techniques in Randomized Clinical Trials: A CMS
Case Study. Excerpt: "We provide a basic tutorial on Bayesian
statistics and the possible uses of such statistics in clinical trial design
and analysis. We conducted a synthesis of existing published research focusing
on how Bayesian techniques can modify inferences that affect policy-level
decisionmaking. Noting that subgroup analysis is a particularly fruitful
application of Bayesian methodology, and an area of particular interest to
CMS, we focused our efforts there rather than on the design of such trials. We used
simulation studies and a case study of patient-level data from eight trials to
explore Bayesian techniques in the CMS decisional context in the clinical
domain of the prevention of sudden cardiac death and the use of the
implantable cardioverter defibrillator (ICD). We combined knowledge gained
through the literature review, simulation studies, and the case study to
provide findings concerning the use of Bayesian approaches specific to the CMS
context." [Accessed September 24, 2010]. Available at:
http://www.stat.columbia.edu/~cook/movabletype/archives/2009/06/use_of_bayesian.html.
Office for Human Research Protections, U.S. Department of Health & Human
Services. Quality Improvement Activities Frequently Asked Questions.
Excerpt: "Protecting human subjects during research activities is critical and
has been at the forefront of HHS activities for decades. In addition, HHS is
committed to taking every appropriate opportunity to measure and improve the
quality of care for patients. These two important goals typically do not
intersect, since most quality improvement efforts are not research subject to
the HHS protection of human subjects regulations. However, in some cases
quality improvement activities are designed to accomplish a research purpose
as well as the purpose of improving the quality of care, and in these cases
the regulations for the protection of subjects in research (45 CFR part 46)
may apply." [Accessed September 24, 2010]. Available at:
http://www.hhs.gov/ohrp/qualityfaq.html.
R J Lilford, D Braunholtz. For Debate: The statistical basis of public
policy: a paradigm shift is overdue. BMJ. 1996;313(7057):603 -607.
Excerpt: "The recent controversy over the increased risk of venous thrombosis
with third generation oral contraceptives illustrates the public policy
dilemma that can be created by relying on conventional statistical tests and
estimates: case-control studies showed a significant increase in risk and
forced a decision either to warn or not to warn. Conventional statistical
tests are an improper basis for such decisions because they dichotomise
results according to whether they are or are not significant and do not allow
decision makers to take explicit account of additional evidence--for example,
of biological plausibility or of biases in the studies. A Bayesian approach
overcomes both these problems. A Bayesian analysis starts with a 'prior'
probability distribution for the value of interest (for example, a true
relative risk)--based on previous knowledge--and adds the new evidence (via a
model) to produce a 'posterior' probability distribution. Because different
experts will have different prior beliefs, sensitivity analyses are important
to assess the effects on the posterior distributions of these differences.
Sensitivity analyses should also examine the effects of different assumptions
about biases and about the model which links the data with the value of
interest. One advantage of this method is that it allows such assumptions to
be handled openly and explicitly. Data presented as a series of posterior
probability distributions would be a much better guide to policy, reflecting
the reality that degrees of belief are often continuous, not dichotomous, and
often vary from one person to another in the face of inconclusive evidence."
[Accessed September 24, 2010]. Available at:
http://www.bmj.com/content/313/7057/603.short.
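The prior-plus-evidence-gives-posterior calculation that Lilford and Braunholtz describe can be sketched numerically in Python. The prior, study estimate, and grid below are invented illustration values, not numbers from the paper.

```python
import math

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Candidate values of the quantity of interest (here a log relative risk).
grid = [i / 100 for i in range(-200, 201)]

prior_mu, prior_sd = 0.0, 0.5    # sceptical prior: relative risk near 1
data_est, data_se = 0.7, 0.3     # e.g. log RR estimate from a new study

# Bayes' rule on a grid: posterior is proportional to prior times likelihood.
unnorm = [normal_pdf(g, prior_mu, prior_sd) * normal_pdf(data_est, g, data_se)
          for g in grid]
total = sum(unnorm)
posterior = [u / total for u in unnorm]

post_mean = sum(g * p for g, p in zip(grid, posterior))
print("posterior mean log RR:", round(post_mean, 3))
print("posterior mean RR:", round(math.exp(post_mean), 2))
```

The posterior mean (about 0.51 on the log scale) sits between the sceptical prior (0) and the study estimate (0.7), pulled toward whichever source is more precise; rerunning with a different prior is exactly the sensitivity analysis the authors recommend.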
Laurence Freedman. Bayesian statistical methods. BMJ.
1996;313(7057):569 -570. Excerpt: "In this week's BMJ, Lilford and
Braunholtz (p 603) explain the basis of Bayesian statistical theory.1 They
explore its use in evaluating evidence from medical research and incorporating
such evidence into policy decisions about public health. When drawing
inferences from statistical data, Bayesian theory is an alternative to the
frequentist theory that has predominated in medical research over the past
half century." [Accessed September 24, 2010]. Available at:
http://www.bmj.com/content/313/7057/569.short.
U.S. Food and Drug Administration. Guidance for
Sponsors, Clinical Investigators, and IRBs: Data Retention When Subjects
Withdraw from FDA-Regulated Clinical Trials. Excerpt: "This guidance is
intended for sponsors, clinical investigators and institutional review boards
(IRBs). It describes the Food and Drug Administration's (FDA) longstanding
policy that already-accrued data, relating to individuals who cease
participating in a study, are to be maintained as part of the study data. This
pertains to data from individuals who decide to discontinue participation in a
study, who are withdrawn by their legally authorized representative, as
applicable, or who are discontinued from participation by the clinical
investigator. This policy is supported by the statutes and regulations
administered by FDA as well as ethical and quality standards applicable to
clinical research. Maintenance of these records includes, as with all study
records, safeguarding the privacy and confidentiality of the subject's
information." [Accessed September 22, 2010]. Available at:
http://www.fda.gov/downloads/RegulatoryInformation/Guidances/UCM126489.pdf.
Office for Human Research Protections. Guidance on Withdrawal of
Subjects from Research: Data Retention and Other Related Issues.
Excerpt: "This document applies to non-exempt human subjects research
conducted or supported by HHS. It clarifies that when a subject chooses to
withdraw from (i.e., discontinue his or her participation in) an ongoing
research study, or when an investigator terminates a subject's participation
in such a research study without regard to the subject's consent, the
investigator may retain and analyze already collected data relating to that
subject, even if that data includes identifiable private information about the
subject. For HHS-conducted or supported research that is regulated by the Food
and Drug Administration (FDA), FDA's guidance on this issue also should be
consulted." [Accessed September 22, 2010]. Available at:
http://www.hhs.gov/ohrp/policy/subjectwithdrawal.html.
R. L. Glass. A letter from the frustrated author of a journal paper.
Journal of Systems and Software. 2000;54(1):1. Excerpt: "Editor's Note: It
seems appropriate, in this issue of JSS containing the findings of our annual
Top Scholars/Institutions study, to pay tribute to the persistent authors who
make a journal like this, and a study like that, possible. In their honor, we
dedicate the following humorous, anonymously-authored, letter!" [Accessed
September 22, 2010]. Available at:
http://dx.doi.org/10.1016/S0164-1212(00)00020-0.
Amy Harmon. New Drugs Stir Debate on Rules of Clinical Trials. The
New York Times. 2010. Excerpt: "Controlled trials have for decades been
considered essential for proving a drug's value before it can go to market.
But the continuing trial of the melanoma drug, PLX4032, has ignited an
anguished debate among oncologists about whether a controlled trial that
measures a drug's impact on extending life is still the best method for
evaluating hundreds of genetically targeted cancer drugs being developed."
[Accessed September 20, 2010]. Available at: http://www.nytimes.com/2010/09/19/health/research/19trial.html.
Springer, PlanetMath. StatProb: The Encyclopedia Sponsored by
Statistics and Probability Societies. Excerpt: "StatProb: The
Encyclopedia Sponsored by Statistics and Probability Societies combines the
advantages of traditional wikis (rapid and up-to-date publication,
user-generated development, hyperlinking, and a saved history) with
traditional publishing (quality assurance, review, credit to authors, and a
structured information display). All contributions have been approved by an
editorial board determined by leading statistical societies; the editorial
board members are listed on the About page. All encyclopedia entries are
written in LaTeX. All of the entries are automatically cross-referenced and
the entire corpus is kept updated in real-time. Anyone can view articles. To
submit a new article or propose a change in an existing article, you must
create an account. It takes only a minute, so sign up!" [Accessed
September 15, 2010]. Available at:
http://statprob.com/.
Isaac Asimov. The Relativity of Wrong. Originally published in
The Skeptical Inquirer, Vol. 14 No. 1, Fall 1989, pages 35-44. Excerpt: "I
received a letter the other day. It was handwritten in crabbed penmanship so
that it was very difficult to read. Nevertheless, I tried to make it out just
in case it might prove to be important. In the first sentence, the writer told
me he was majoring in English literature, but felt he needed to teach me
science. (I sighed a bit, for I knew very few English Lit majors who are
equipped to teach me science, but I am very aware of the vast state of my
ignorance and I am prepared to learn as much as I can from anyone, so I read
on.) " [Accessed September 13, 2010]. Available at:
http://chem.tufts.edu/AnswersInScience/RelativityofWrong.htm.
Committee for Medicinal Products for Human Use. Guideline on the choice
of the non-inferiority margin. Excerpt: "Many clinical trials comparing
a test product with an active comparator are designed as non-inferiority
trials. The term 'non-inferiority' is now well established, but if taken
literally could be misleading. The objective of a non-inferiority trial is
sometimes stated as being to demonstrate that the test product is not inferior
to the comparator. However, only a superiority trial can demonstrate this. In
fact a noninferiority trial aims to demonstrate that the test product is not
worse than the comparator by more than a pre-specified, small amount. This
amount is known as the non-inferiority margin, or delta (Δ)." [Accessed
September 13, 2010]. Available at:
http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2009/09/WC500003636.pdf.
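The margin logic can be sketched in Python: conclude non-inferiority when the lower confidence bound for the treatment difference lies above -Δ. The counts, margin, and use of a one-sided 95% bound below are invented illustration choices, not values from the guideline.

```python
import math

def noninferior(success_t, n_t, success_c, n_c, delta, z=1.644854):
    """Lower one-sided 95% bound (normal approximation) for the difference
    in success proportions (test minus comparator), compared with -delta."""
    p_t, p_c = success_t / n_t, success_c / n_c
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    lower = (p_t - p_c) - z * se
    return lower, lower > -delta

# Hypothetical trial: 160/200 successes on test, 165/200 on comparator,
# with a pre-specified margin of 10 percentage points.
lower, ok = noninferior(160, 200, 165, 200, delta=0.10)
print(f"lower bound for difference: {lower:.3f}; non-inferior: {ok}")
```

Note what this does and does not show: the test product may be slightly worse than the comparator (the point estimate is negative), but the data rule out its being worse by more than the pre-specified Δ, which is precisely the non-inferiority claim.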
Committee for Proprietary Medicinal Products. Points to consider on
switching between superiority and non-inferiority. Br J Clin Pharmacol.
2001;52(3):223-228. Excerpt: "A number of recent applications have led to
CPMP discussions concerning the interpretation of superiority, noninferiority
and equivalence trials. These issues are covered in ICH E9 (Statistical
Principles for Clinical Trials). There is further relevant material in the
Step 2 draft of ICH E10 (Choice of Control Group) and in the CPMP Note for
Guidance on the Investigation of Bioavailability and Bioequivalence. However,
the guidelines do not address some specific difficulties that have arisen in
practice. In broad terms, these difficulties relate to switching from one
design objective to another at the time of analysis." Available at:
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2014556/.
Sage Foundation. Sage: Open Source Mathematics Software.
Abstract: "Sage is a free open-source mathematics software system licensed
under the GPL. It combines the power of many existing open-source packages
into a common Python-based interface. Mission: Creating a viable free open
source alternative to Magma, Maple, Mathematica and Matlab." [Accessed
September 8, 2010]. Available at:
http://www.sagemath.org/.
Alex Zolot. useR! 2010: Work with R on Amazon's Cloud. Abstract:
"Usage of R is often constrained by available memory and/or cpu power. Cloud
computing allows users to get as much resources as necessary in any specific
moment. The tutorial will cover software tools and procedures that are useful
to manage R applications on Amazon's Elastic Compute Cloud (EC2) and Simple
Storage Service (S3) cloud services." [Accessed September 8, 2010].
Available at:
http://user2010.org/tutorials/Zolot.html.
August 2010
Iain Hrynaszkiewicz. A call for BMC Research Notes contributions
promoting best practice in data standardization, sharing and publication.
BMC Research Notes. 2010;3(1):235. Abstract: "BMC Research Notes aims to
ensure that data files underlying published articles are made available in
standard, reusable formats, and the journal is calling for contributions from
the scientific community to achieve this goal. Educational Data Notes included
in this special series should describe a domain-specific data standard and
provide an example data set with the article, or a link to data that are
permanently hosted elsewhere. The contributions should also provide some
evidence of the data standard's application and preparation guidance that
could be used by others wishing to conduct similar experiments. The journal is
also keen to receive contributions on broader aspects of scientific data
sharing, archiving, and open data." [Accessed September 3, 2010].
Available at:
http://www.biomedcentral.com/content/3/1/235.
Iain Hrynaszkiewicz. BMC Research Notes - adding value to your data.
Posted on the BioMed Central Blog, Thursday, September 2, 2010. Excerpt:
"Support for scientific data sharing is gathering more and more support in
2010, so rather than 'why share data?' the question now is 'how?'. Making data
available in readily interpretable formats is vital to realising its value in
driving new knowledge discovery, and BMC Research Notes today launches a new
initiative aimed at promoting best practice in sharing and publishing data,
with a focus on standardized, re-useable formats." [Accessed September 3,
2010]. Available at:
http://blogs.openaccesscentral.com/blogs/bmcblog/entry/bmc_research_notes_wants_your.
Celia Brown, Richard Lilford. The stepped wedge trial design: a
systematic review. BMC Medical Research Methodology. 2006;6(1):54.
Abstract: "BACKGROUND: Stepped wedge randomised trial designs involve
sequential roll-out of an intervention to participants (individuals or
clusters) over a number of time periods. By the end of the study, all
participants will have received the intervention, although the order in which
participants receive the intervention is determined at random. The design is
particularly relevant where it is predicted that the intervention will do more
good than harm (making a parallel design, in which certain participants do not
receive the intervention unethical) and/or where, for logistical, practical or
financial reasons, it is impossible to deliver the intervention simultaneously
to all participants. Stepped wedge designs offer a number of opportunities for
data analysis, particularly for modelling the effect of time on the
effectiveness of an intervention. This paper presents a review of 12 studies
(or protocols) that use (or plan to use) a stepped wedge design. One aim of
the review is to highlight the potential for the stepped wedge design, given
its infrequent use to date. METHODS: Comprehensive literature review of
studies or protocols using a stepped wedge design. Data were extracted from
the studies in three categories for subsequent consideration: study
information (epidemiology, intervention, number of participants), reasons for
using a stepped wedge design and methods of data analysis. RESULTS: The 12
studies included in this review describe evaluations of a wide range of
interventions, across different diseases in different settings. However the
stepped wedge design appears to have found a niche for evaluating
interventions in developing countries, specifically those concerned with HIV.
There were few consistent motivations for employing a stepped wedge design or
methods of data analysis across studies. The methodological descriptions of
stepped wedge studies, including methods of randomisation, sample size
calculations and methods of analysis, are not always complete. CONCLUSION:
While the stepped wedge design offers a number of opportunities for use in
future evaluations, a more consistent approach to reporting and data analysis
is required." [Accessed September 1, 2010]. Available at:
http://www.biomedcentral.com/1471-2288/6/54.
Michael A Hussey, James P Hughes. Design and analysis of stepped wedge
cluster randomized trials. Contemp Clin Trials. 2007;28(2):182-191.
Abstract: "Cluster randomized trials (CRT) are often used to evaluate
therapies or interventions in situations where individual randomization is not
possible or not desirable for logistic, financial or ethical reasons. While a
significant and rapidly growing body of literature exists on CRTs utilizing a
"parallel" design (i.e. I clusters randomized to each treatment), only a few
examples of CRTs using crossover designs have been described. In this article
we discuss the design and analysis of a particular type of crossover CRT - the
stepped wedge - and provide an example of its use." [Accessed September 1,
2010]. Available at:
http://faculty.washington.edu/peterg/Vaccine2006/articles/HusseyHughes.2007.pdf.
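The rollout pattern these two articles describe can be sketched in Python: every cluster starts in control and crosses over to the intervention at a randomly assigned step, so that by the final period all clusters are treated. Cluster and period counts below are invented.

```python
import random

random.seed(42)
n_clusters, n_steps = 6, 6           # one cluster crosses over per step
crossover = list(range(1, n_steps + 1))
random.shuffle(crossover)            # the ORDER of rollout is randomized

# schedule[c][t] = 1 if cluster c receives the intervention in period t
# (periods 0..n_steps; period 0 is an all-control baseline)
schedule = [[1 if t >= crossover[c] else 0 for t in range(n_steps + 1)]
            for c in range(n_clusters)]

for c, row in enumerate(schedule):
    print(f"cluster {c} (crosses over at step {crossover[c]}):", row)

# Stepped wedge property: nobody is treated at baseline, everybody is
# treated by the end, and no cluster ever reverts to control.
assert all(row[0] == 0 for row in schedule)
assert all(row[-1] == 1 for row in schedule)
```

Printed as a matrix, the 1s form the "wedge" of steps that gives the design its name; the analysis must then separate the intervention effect from the effect of time, which both reviews identify as the central modelling issue.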
Keith A. McGuinness. Of rowing boats, ocean liners and tests of the
ANOVA homogeneity of variance assumption. Austral Ecology.
2008;27(6):681-688. Abstract: "One of the assumptions of analysis of
variance (ANOVA) is that the variances of the groups being compared are
approximately equal. This assumption is routinely checked before doing an
analysis, although some workers consider ANOVA robust and do not bother and
others avoid parametric procedures entirely. Two of the more commonly used
heterogeneity tests are Bartlett's and Cochran's, although, as for most of
these tests, they may well be more sensitive to violations of the ANOVA
assumptions than is ANOVA itself. Simulations were used to examine how well
these two tests protected ANOVA against the problems created by variance
heterogeneity. Although Cochran's test performed a little better than
Bartlett's, both tests performed poorly, frequently disallowing perfectly
valid analyses. Recommendations are made about how to proceed, given these
results." [Accessed August 19, 2010]. Available at:
http://onlinelibrary.wiley.com/doi/10.1111/j.1442-9993.2002.tb00217.x/abstract.
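Cochran's test statistic mentioned in the abstract is simple to compute: C is the largest group variance divided by the sum of the group variances, so values near 1 indicate that one group's variance dominates. A Python sketch with invented data (critical values come from published tables and are not reproduced here):

```python
def variance(xs):
    """Sample variance with the usual n - 1 denominator."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

groups = [
    [4.1, 5.2, 4.8, 5.0, 4.6],   # similar spread...
    [5.9, 6.3, 6.1, 5.7, 6.0],
    [3.0, 8.0, 1.5, 9.5, 5.0],   # ...except this wildly variable group
]

variances = [variance(g) for g in groups]
c_stat = max(variances) / sum(variances)
print("group variances:", [round(v, 3) for v in variances])
print("Cochran's C:", round(c_stat, 3))
```

Here the third group contributes nearly all of the total variance, so C is close to 1; under homogeneity with k groups, C would be near 1/k. McGuinness's point is that rejecting ANOVA whenever such a test fires can disallow analyses that ANOVA itself would have handled perfectly well.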
Patricia Keith-Spiegel, Joan Sieber, Gerald P. Koocher. Responding to
Research Wrongdoing : A User Friendly Guide. Excerpt: "Every once in
awhile a product comes along that is destined to make a difference. This Guide
is such a product. Informed by data generated through surveys and interviews
involving more than 2,000 scientists, the Guide gives voice to those
researchers willing, some with eagerness and others with relief, to share
their stories publicly in their own words. There are stories from scientists
who want to do the right thing, but are unsure how to go about it or concerned
about negative consequences for them or their junior colleagues. There are
accounts from researchers who took action, and are keen to share their
successful strategies with others. On the flip side, there are those who
hesitated and now lament not having guidance that might have altered the
course of past events." [Accessed August 14, 2010]. Available at:
http://www.ethicsresearch.com/images/RRW_7-17-10.pdf.
Gerald P. Koocher, Patricia Keith-Spiegel. Peers nip misconduct in the
bud. Nature. 2010;466(7305):438-440. Excerpt: "What do researchers do
when they suspect a colleague of cutting corners, not declaring a conflict of
interest, neglecting proper oversight of research assistants or 'cooking'
data? In one study1, almost all said that they would personally intervene if
they viewed an act as unethical, especially if it seemed minor and the
offender had no history of infractions." [Accessed August 14, 2010].
Available at:
http://www.ethicsresearch.com/images/Nature_Opinion_-_Koocher_Keith-Spiegel.pdf.
M. Castillo. Authorship and Bylines. American Journal of
Neuroradiology. 2009;30(8):1455-1456. Excerpt: "From the ancient Greeks to
Shakespeare, the question of authorship often arises. The issue of appropriate
article authorship has always been of special interest to editors of scientific
journals. In the biomedical sciences, as the complexity and funding of
published studies increases, so does the length of the byline. Although a
previous American Journal of Neuroradiology Editor-in-Chief already addressed
this issue, I think it is time to revisit it.1 From my own experience,
articles can be categorized according to the number of authors as follows:
fewer than 2 authors (Editorials, Commentaries, Letters), fewer than 5 authors
(Case Reports and Technical Notes), 5-10 authors (retrospective full-length
articles), 10-15 (prospective, often grant-funded articles), more than 15
authors (reports of task forces, white papers, etc). Among so many authors, it
is not uncommon to find individuals whose contributions are minimal and many
times questionable. Who actually did enough work to be listed as an author? In
other words, who can claim ownership rights in a particular intellectual
property?" [Accessed August 14, 2010]. Available at:
http://www.ajnr.org/cgi/reprint/ajnr.A1636v1.pdf.
R A Parker. Estimating the value of an internal biostatistical
consulting service. Stat Med. 2000;19(16):2131-2145. Abstract: "Biostatistical
consulting is a service business. Although a consulting biostatistician's goal
is long-term collaborative relationships with investigators, this is the same
as the long-term goal of any business: having a group of contented, satisfied
customers. In this era of constrained resources, we must be able to
demonstrate that the benefit a biostatistical consulting group provides to its
organization exceeds its actual cost to the institution. In this paper, I
provide both a theoretical framework for assessing the value of a
biostatistical service and provide an ad hoc method to value the contribution
of a biostatistical service to a grant. Using the methods described, our
biostatistics group returns more than $6 for each dollar spent on
institutional support in 1998." [Accessed August 14, 2010]. Available at:
http://www.ncbi.nlm.nih.gov/pubmed/10931516.
Richard Horton, Richard Smith. Time to redefine authorship. BMJ.
1996;312(7033):723. Excerpt: "Physicists do it by the hundred; scientists
do it in groups; fiction writers mostly alone. And medical researchers? Rarely
now do they write papers alone, and the number of authors on papers is
increasing steadily.1 Under pressure from molecular biologists, the National
Library of Medicine in Washington is planning to list not just the first six
authors in Index Medicus but the first 24 plus the last author.2 Notions of
authorship are clearly in the eye of the beholder, and many authors on modern
papers seem to have made only a minimal contribution.3 4 5 Few authors on
modern multidisciplinary medical papers fit the 19th century notion of taking
full responsibility for every word and thought included, and yet the
cumbersome definition of authorship produced by the International Committee of
Medical Journal Editors (the Vancouver Group) is based on that concept.6 The
definition produced by editors seems to be out of touch with what is happening
in the real world of research, and researchers and editors need to consider a
new definition. The BMJ, Lancet, University of Nottingham, and Locknet (a
network to encourage research into peer review7) are therefore organising a
one day meeting on 6 June in Nottingham to consider the need for a new
definition. All the members of the Vancouver Group will be there, and
everybody is welcome." [Accessed August 14, 2010]. Available at:
http://www.bmj.com/cgi/content/full/312/7033/723.
R A Parker, N G Berman. Criteria for authorship for statisticians in
medical papers. Stat Med. 1998;17(20):2289-2299. Abstract: "We organize a
statistician's potential scientific and intellectual contributions to a
medical study into three types of activities relating to design,
implementation and analysis. For each type, we describe high-level, mid-level
and low-level contributions. Using this framework, we develop a point system
to assess whether authorship is justified. Although we recommend discussion
and resolution of authorship issues early in the course of any project, our
system is especially useful when this has not been done." [Accessed August
14, 2010]. Available at:
http://www.ncbi.nlm.nih.gov/pubmed/9819828.
LiquidPub. Liquid Publications: Scientific Publications meet the Web.
Excerpt: "The LiquidPub project proposes a paradigm shift in the way
scientific knowledge is created, disseminated, evaluated and maintained. This
shift is enabled by the notion of Liquid Publications, which are evolutionary,
collaborative, and composable scientific contributions. Many Liquid
Publication concepts are based on a parallel between scientific knowledge
artifacts and software artifacts, and hence on lessons learned in (agile,
collaborative, open source) software development, as well as on lessons
learned from Web 2.0 in terms of collaborative evaluation of knowledge
artifacts." [Accessed August 10, 2010]. Available at:
http://project.liquidpub.org/.
July 2010
Anup Malani, Tomas J. Philipson. Push for more trials may hurt patients.
Washington Examiner. 2010. Excerpt: "U.S. pharmaceutical companies are
increasingly going abroad to conduct clinical trials required by the FDA.
Recently, the Department of Health and Human Services released a report
suggesting that the FDA lacks the resources to adequately monitor these
foreign trials. Four of every five new drugs sold in the U.S. are tested in
foreign trials, and the FDA inspects less than one in 10 of these. This is
half the rate of inspection for domestic trials." [Accessed July 27,
2010]. Available at:
http://www.washingtonexaminer.com/opinion/columns/Push-for-more-clinical-trials-may-hurt-patients-1002114-98875969.html.
H. Gilbert Welch, Lisa M. Schwartz, Steven Woloshin. The exaggerated
relations between diet, body weight and mortality: the case for a categorical
data approach. CMAJ. 2005;172(7):891-895. Excerpt: "Multivariate
analysis has become a major statistical tool for medical research. It is most
commonly used for adjustment - the process of correcting the main effect for
multiple variables that confound the relation between exposure and outcome in
an observational study. Any apparent relation between estrogen replacement and
dementia, for example, should be adjusted for socioeconomic status, a variable
that is known to relate both to access (and thus the likelihood of having
received estrogen) and to measures of cognitive function (and thus the
likelihood of being diagnosed with dementia). The capacity to account for
numerous variables (e.g., income, education and insurance status)
simultaneously constitutes a major advance in the ability of researchers to
estimate the true effect of the exposure of interest. But this advance has
come at a cost: the actual relation between exposure and outcome is
increasingly opaque to readers, researchers and editors alike." [Accessed
July 26, 2010]. Available at:
http://www.ecmaj.com/cgi/content/full/172/7/891.
Phil Ender. Centering (ED230B/C). Excerpt: "Centering a variable
involves subtracting the mean from each of the scores, that is, creating
deviation scores. Centering can be done in two ways: 1) centering using the grand
mean and 2) centering using group means, which is also known as context
centering." [Accessed July 26, 2010]. Available at:
http://www.gseis.ucla.edu/courses/ed230bc1/notes4/center.html.
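The two flavors of centering described in this excerpt can be illustrated in a few lines. This is a minimal sketch in plain Python; the scores and group labels are invented for illustration:

```python
# Grand-mean vs. group-mean (context) centering on invented scores.
scores = {"A": [4.0, 6.0], "B": [8.0, 10.0]}

# 1) Grand-mean centering: subtract the overall mean from every score.
all_scores = [x for xs in scores.values() for x in xs]
grand_mean = sum(all_scores) / len(all_scores)
grand_centered = {g: [x - grand_mean for x in xs] for g, xs in scores.items()}

# 2) Group-mean (context) centering: subtract each group's own mean.
group_centered = {
    g: [x - sum(xs) / len(xs) for x in xs] for g, xs in scores.items()
}
```

With these numbers, group-mean centering removes the between-group difference (both groups become [-1.0, 1.0]), while grand-mean centering preserves it, which is exactly why the choice between the two matters in multilevel models.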
MediciGlobal. L2FU - Lost to Follow Up. Excerpt: "Patient dropouts
in a clinical trial cost your company money. They can cost you the
integrity of your study too! If it's important to recover patients lost from
your clinical trial, you've come to the right place. Here, you'll read how
L2FU's services can help you and how to begin finding patients today!"
[Accessed July 26, 2010]. Available at:
http://www.l2fu.com.
Steve Miller. Biostatistics, Open Source and BI - an Interview with
Frank Harrell. Description: This article, published in Information
Management Online, February 25, 2009, offers a nice interview with Frank
Harrell, a leading proponent of modern statistical methods. Excerpt: "My
correspondence with Frank provided the opportunity to ask him to do an
interview for the OpenBI Forum. He graciously accepted, turning around deft
responses to my sometimes ponderous questions in very short order. What
follows is text for our questions and answer session. I trust that readers
will learn as much from Frank's responses as I did." [Accessed July 19,
2010]. Available at:
http://www.information-management.com/news/10015023-1.html.
Karyn Heavner, Carl Phillips, Igor Burstyn, Warren Hare.
Dichotomization: 2 x 2 (x 2 x 2 x 2...) categories: infinite possibilities.
BMC Medical Research Methodology. 2010;10(1):59. Abstract: "BACKGROUND:
Consumers of epidemiology may prefer to have one measure of risk arising from
analysis of a 2-by-2 table. However, reporting a single measure of
association, such as one odds ratio (OR) and 95% confidence interval, from a
continuous exposure variable that was dichotomized withholds much potentially
useful information. Results of this type of analysis are often reported for
one such dichotomization, as if no other cutoffs were investigated or even
possible. METHODS: This analysis demonstrates the effect of using different
theory and data driven cutoffs on the relationship between body mass index and
high cholesterol using National Health and Nutrition Examination Survey data.
The recommended analytic approach, presentation of a graph of ORs for a range
of cutoffs, is the focus of most of the results and discussion. RESULTS: These
cutoff variations resulted in ORs between 1.1 and 1.9. This allows
investigators to select a result that either strongly supports or provides
negligible support for an association; a choice that is invisible to readers.
The OR curve presents readers with more information about the exposure disease
relationship than a single OR and 95% confidence interval. CONCLUSION: As well
as offering results for additional cutoffs that may be of interest to readers,
the OR curve provides an indication of whether the study focuses on a
reasonable representation of the data or outlier results. It offers more
information about trends in the association as the cutoff changes and the
implications of random fluctuations than a single OR and 95% confidence
interval." [Accessed July 19, 2010]. Available at:
http://www.biomedcentral.com/1471-2288/10/59.
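The OR-curve approach the authors recommend, recomputing the odds ratio over a range of cutoffs rather than reporting one, can be sketched as follows. The exposure values, outcomes, and cutoffs here are invented for illustration and are not the NHANES data from the paper:

```python
# Odds ratio of a binary outcome for a continuous exposure dichotomized
# at a given cutoff, computed from the 2x2 table as (a*d)/(b*c).
def odds_ratio(exposure, outcome, cutoff):
    a = sum(x >= cutoff and y == 1 for x, y in zip(exposure, outcome))
    b = sum(x >= cutoff and y == 0 for x, y in zip(exposure, outcome))
    c = sum(x < cutoff and y == 1 for x, y in zip(exposure, outcome))
    d = sum(x < cutoff and y == 0 for x, y in zip(exposure, outcome))
    return (a * d) / (b * c) if b * c else float("inf")

bmi =     [20, 22, 24, 26, 28, 30, 32, 34, 36, 38]  # invented exposures
high_tc = [ 0,  1,  0,  0,  1,  0,  1,  1,  0,  1]  # invented outcomes
or_curve = {cut: odds_ratio(bmi, high_tc, cut) for cut in (25, 29, 33)}
```

Plotting the whole of `or_curve` instead of quoting one entry shows readers how sensitive the reported association is to the choice of cutoff, which is the paper's point.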
Chris Corcoran, Louise Ryan, Pralay Senchaudhuri, et al. An Exact Trend
Test for Correlated Binary Data. Biometrics. 2001;57(3):941-948.
Abstract: "The problem of testing a dose-response relationship in the presence
of exchangeably correlated binary data has been addressed using a variety of
models. Most commonly used approaches are derived from likelihood or
generalized estimating equations and rely on large-sample theory to justify
their inferences. However, while earlier work has determined that these
methods may perform poorly for small or sparse samples, there are few
alternatives available to those faced with such data. We propose an exact
trend test for exchangeably correlated binary data when groups of correlated
observations are ordered. This exact approach is based on an exponential model
derived by Molenberghs and Ryan (1999) and Ryan and Molenberghs (1999) and
provides natural analogues to Fisher's exact test and the binomial trend test
when the data are correlated. We use a graphical method with which one can
efficiently compute the exact tail distribution and apply the test to two
examples." [Accessed July 16, 2010]. Available at:
http://dx.doi.org/10.1111/j.0006-341X.2001.00941.x.
Casey Olives, Marcello Pagano. Bayes-LQAS: classifying the prevalence
of global acute malnutrition. Emerging Themes in Epidemiology.
2010;7(1):3. Abstract: "Lot Quality Assurance Sampling (LQAS) applications
in health have generally relied on frequentist interpretations for statistical
validity. Yet health professionals often seek statements about the probability
distribution of unknown parameters to answer questions of interest. The
frequentist paradigm does not pretend to yield such information, although a
Bayesian formulation might. This is the source of an error made in a recent
paper published in this journal. Many applications lend themselves to a
Bayesian treatment, and would benefit from such considerations in their
design. We discuss Bayes-LQAS (B-LQAS), which allows for incorporation of
prior information into the LQAS classification procedure, and thus shows how
to correct the aforementioned error. Further, we pay special attention to the
formulation of Bayes Operating Characteristic Curves and the use of prior
information to improve survey designs. As a motivating example, we discuss the
classification of Global Acute Malnutrition prevalence and draw parallels
between the Bayes and classical classifications schemes. We also illustrate
the impact of informative and non-informative priors on the survey design.
Results indicate that using a Bayesian approach allows the incorporation of
expert information and/or historical data and is thus potentially a valuable
tool for making accurate and precise classifications." [Accessed July 16,
2010]. Available at:
http://www.ete-online.com/content/7/1/3.
Sylvia Sudat, Elizabeth Carlton, Edmund Seto, Robert Spear, Alan Hubbard.
Using variable importance measures from causal inference to rank risk
factors of schistosomiasis infection in a rural setting in China.
Epidemiologic Perspectives & Innovations. 2010;7(1):3. Abstract:
"BACKGROUND: Schistosomiasis infection, contracted through contact with
contaminated water, is a global public health concern. In this paper we
analyze data from a retrospective study reporting water contact and
schistosomiasis infection status among 1011 individuals in rural China. We
present semi-parametric methods for identifying risk factors through a
comparison of three analysis approaches: a prediction-focused machine learning
algorithm, a simple main-effects multivariable regression, and a
semi-parametric variable importance (VI) estimate inspired by a causal
population intervention parameter. RESULTS: The multivariable regression found
only tool washing to be associated with the outcome, with a relative risk of
1.03 and a 95% confidence interval (CI) of 1.01-1.05. Three types of water
contact were found to be associated with the outcome in the semi-parametric VI
analysis: July water contact (VI estimate 0.16, 95% CI 0.11-0.22), water
contact from tool washing (VI estimate 0.88, 95% CI 0.80-0.97), and water
contact from rice planting (VI estimate 0.71, 95% CI 0.53-0.96). The July VI
result, in particular, indicated a strong association with infection status -
its causal interpretation implies that eliminating water contact in July would
reduce the prevalence of schistosomiasis in our study population by 84%, or
from 0.3 to 0.05 (95% CI 78%-89%). CONCLUSIONS: The July VI estimate suggests
possible within-season variability in schistosomiasis infection risk, an
association not detected by the regression analysis. Though there are many
limitations to this study that temper the potential for causal
interpretations, if a high-risk time period could be detected in something
close to real time, new prevention options would be opened. Most importantly,
we emphasize that traditional regression approaches are usually based on
arbitrary pre-specified models, making their parameters difficult to interpret
in the context of real-world applications. Our results support the practical
application of analysis approaches that, in contrast, do not require arbitrary
model pre-specification, estimate parameters that have simple public health
interpretations, and apply inference that considers model selection as a
source of variation." [Accessed July 16, 2010]. Available at:
http://www.epi-perspectives.com/content/7/1/3.
C. Elizabeth McCarron, Eleanor Pullenayegum, Lehana Thabane, Ron Goeree,
Jean-Eric Tarride. The importance of adjusting for potential confounders in
Bayesian hierarchical models synthesising evidence from randomised and non-randomised
studies: an application comparing treatments for abdominal aortic aneurysms.
BMC Medical Research Methodology. 2010;10(1):64. Abstract: "BACKGROUND:
Informing health care decision making may necessitate the synthesis of
evidence from different study designs (e.g., randomised controlled trials,
non-randomised/observational studies). Methods for synthesising different
types of studies have been proposed, but their routine use requires
development of approaches to adjust for potential biases, especially among
non-randomised studies. The objective of this study was to extend a published
Bayesian hierarchical model to adjust for bias due to confounding in
synthesising evidence from studies with different designs. METHODS: In this
new methodological approach, study estimates were adjusted for potential
confounders using differences in patient characteristics (e.g., age) between
study arms. The new model was applied to synthesise evidence from randomised
and non-randomised studies from a published review comparing treatments for
abdominal aortic aneurysms. We compared the results of the Bayesian
hierarchical model adjusted for differences in study arms with: 1) unadjusted
results, 2) results adjusted using aggregate study values and 3) two methods
for downweighting the potentially biased non-randomised studies. Sensitivity
of the results to alternative prior distributions and the inclusion of
additional covariates were also assessed. RESULTS: In the base case analysis,
the estimated odds ratio was 0.32 (0.13,0.76) for the randomised studies alone
and 0.57 (0.41,0.82) for the non-randomised studies alone. The unadjusted
result for the two types combined was 0.49 (0.21,0.98). Adjusted for
differences between study arms, the estimated odds ratio was 0.37 (0.17,0.77),
representing a shift towards the estimate for the randomised studies alone.
Adjustment for aggregate values resulted in an estimate of 0.60 (0.28,1.20).
The two methods used for downweighting gave odds ratios of 0.43 (0.18,0.89) and
0.35 (0.16,0.76), respectively. Point estimates were robust but credible
intervals were wider when using vaguer priors. CONCLUSIONS: Covariate
adjustment using aggregate study values does not account for covariate
imbalances between treatment arms and downweighting may not eliminate bias.
Adjustment using differences in patient characteristics between arms provides
a systematic way of adjusting for bias due to confounding. Within the context
of a Bayesian hierarchical model, such an approach could facilitate the use of
all available evidence to inform health policy decisions." [Accessed July
14, 2010]. Available at:
http://www.biomedcentral.com/1471-2288/10/64.
Julie Weed. Factory Efficiency Comes to the Hospital. The New York
Times. 2010. Excerpt: "The program, called 'continuous performance
improvement,' or C.P.I., examines every aspect of patients' stays at the
hospital, from the time they arrive in the parking lot until they are
discharged, to see what could work better for them and their families. Last
year, amid rising health care expenses nationally, C.P.I. helped cut Seattle
Children's costs per patient by 3.7 percent, for a total savings of $23
million, Mr. Hagan says. And as patient demand has grown in the last six
years, he estimates that the hospital avoided spending $180 million on capital
projects by using its facilities more efficiently. It served 38,000 patients
last year, up from 27,000 in 2004, without expansion or adding beds."
[Accessed July 13, 2010]. Available at:
http://www.nytimes.com/2010/07/11/business/11seattle.html.
Katharine Barnard, Louise Dent, Andrew Cook. A systematic review of
models to predict recruitment to multicentre clinical trials. BMC Medical
Research Methodology. 2010;10(1):63. Abstract: "BACKGROUND: Less than one
third of publicly funded trials managed to recruit according to their original
plan often resulting in request for additional funding and/or time extensions.
The aim was to identify models which might be useful to a major public funder
of randomised controlled trials when estimating likely time requirements for
recruiting trial participants. The requirements of a useful model were
identified as usability, based on experience, able to reflect time trends,
accounting for centre recruitment and contribution to a commissioning
decision. METHODS: A systematic review of English language articles using
MEDLINE and EMBASE. Search terms included: randomised controlled trial,
patient, accrual, predict, enrol, models, statistical; Bayes Theorem; Decision
Theory; Monte Carlo Method and Poisson. Only studies discussing prediction of
recruitment to trials using a modelling approach were included. Information
was extracted from articles by one author, and checked by a second, using a
pre-defined form. RESULTS: Out of 326 identified abstracts, only 8 met all the
inclusion criteria. Of these 8 studies examined, there are five major classes
of model discussed: the unconditional model, the conditional model, the
Poisson model, Bayesian models and Monte Carlo simulation of Markov models.
None of these meet all the pre-identified needs of the funder. CONCLUSIONS: To
meet the needs of a number of research programmes, a new model is required as
a matter of importance. Any model chosen should be validated against both
retrospective and prospective data, to ensure the predictions it gives are
superior to those currently used." [Accessed July 11, 2010]. Available at:
http://www.biomedcentral.com/1471-2288/10/63.
John F. Hall. Journeys in Survey Research - Home. Excerpt:
"Welcome to this new resource for researchers, students and others doing, or
learning about, survey research and the analysis of survey data. You will find
here a wealth of materials drawn from my 45 years of doing and teaching survey
research." [Accessed July 9, 2010]. Available at:
http://surveyresearch.weebly.com/.
Kristin L. Carman, Maureen Maurer, Jill Mathews Yegian, et al. Evidence
That Consumers Are Skeptical About Evidence-Based Health Care. Health Aff.
2010;29(7):1400-1406. Abstract: "We undertook focus groups, interviews, and
an online survey with health care consumers as part of a recent project to
assist purchasers in communicating more effectively about health care evidence
and quality. Most of the consumers were ages 18-64; had health insurance
through a current employer; and had taken part in making decisions about
health insurance coverage for themselves, their spouse, or someone else. We
found many of these consumers' beliefs, values, and knowledge to be at odds
with what policy makers prescribe as evidence-based health care. Few consumers
understood terms such as "medical evidence" or "quality guidelines." Most
believed that more care meant higher-quality, better care. The gaps in
knowledge and misconceptions point to serious challenges in engaging consumers
in evidence-based decision making." [Accessed July 8, 2010]. Available at:
http://content.healthaffairs.org/cgi/content/abstract/29/7/1400.
Statistics Without Borders. Home - Statistics Without Borders.
Excerpt: "Statistics Without Borders (SWB) is an apolitical organization under
the auspices of the American Statistical Association, comprised entirely of
volunteers, that provides pro bono statistical consulting and assistance to
organizations and government agencies in support of these organizations'
not-for-profit efforts to deal with international health issues (broadly
defined). Our vision is to achieve better statistical practice, including
statistical analysis and design of experiments and surveys, so that
international health projects and initiatives are delivered more effectively
and efficiently." [Accessed July 7, 2010]. Available at:
http://community.amstat.org/AMSTAT/StatisticsWithoutBorders/Home/Default.aspx.
Ross Prentice. Invited Commentary: Ethics and Sample Size--Another View.
Am. J. Epidemiol. 2005;161(2):111-112. Excerpt: "In their article entitled,
"Ethics and Sample Size," Bacchetti et al. (1) provide a spirited
justification, based on ethical considerations, for the conduct of clinical
trials that may have little potential to provide powerful tests of therapeutic
or public health hypotheses. This perspective is somewhat surprising given the
longstanding encouragement by clinical trialists and bioethicists in favor of
large trials (2-4). Heretofore, the defenders of smaller trials have
essentially argued only that small, underpowered trials need not be unethical
if well conducted given their contribution to intervention effect estimation
and their potential contribution to meta-analyses (5, 6). However, Bacchetti
et al. evidently go further on the basis of certain risk-benefit
considerations, and they conclude: "In general, ethics committees and others
concerned with the protection of research subjects need not consider whether a
study is too small.... Indeed, a more legitimate ethical issue regarding
sample size is whether it is too large" (1, p. 108)." [Accessed July 7,
2010]. Available at:
http://aje.oxfordjournals.org.
Peter Bacchetti, Leslie E. Wolf, Mark R. Segal, Charles E. McCulloch.
Bacchetti et al. Respond to "Ethics and Sample Size--Another View". Am. J.
Epidemiol. 2005;161(2):113. Excerpt: "We thank Dr. Prentice (1) for taking
the time to respond to our article (2). We explain here why we do not believe
that he has provided a meaningful challenge to our argument. We see possible
objections related to unappealing implications, use of power to measure value,
implications for series of trials, how value per participant is calculated,
and participants' altruistic satisfaction." [Accessed July 7, 2010].
Available at:
http://aje.oxfordjournals.org.
Mitchell H. Katz. Multivariable Analysis: A Primer for Readers of
Medical Research. Annals of Internal Medicine. 2003;138(8):644-650.
Abstract: "Many clinical readers, especially those uncomfortable with
mathematics, treat published multivariable models as a black box, accepting
the author's explanation of the results. However, multivariable analysis can
be understood without undue concern for the underlying mathematics. This paper
reviews the basics of multivariable analysis, including what multivariable
models are, why they are used, what types exist, what assumptions underlie
them, how they should be interpreted, and how they can be evaluated. A deeper
understanding of multivariable models enables readers to decide for themselves
how much weight to give to the results of published analyses." [Accessed
July 7, 2010]. Available at:
http://www.annals.org/content/138/8/644.abstract.
Peter Bacchetti, Jacqueline Leung. Sample Size Calculations in Clinical
Research : Anesthesiology. Anesthesiology. 2002;97(4):1028-1029.
Excerpt: "We write to make the case that the practice of providing a priori
sample size calculations, recently endorsed in an Anesthesiology editorial, is
in fact undesirable. Presentation of confidence intervals serves the same
purpose, but is superior because it more accurately reflects the actual data,
is simpler to present, addresses uncertainty more directly, and encourages
more careful interpretation of results." [Accessed July 7, 2010].
Available at:
http://journals.lww.com/anesthesiology/Fulltext/2002/10000/Sample_Size_Calculations_in_Clinical_Research.50.aspx.
Peter Bacchetti. Current sample size conventions: Flaws, harms, and
alternatives. BMC Medicine. 2010;8(1):17. Abstract: "BACKGROUND: The
belief remains widespread that medical research studies must have statistical
power of at least 80% in order to be scientifically sound, and peer reviewers
often question whether power is high enough. DISCUSSION: This requirement and
the methods for meeting it have severe flaws. Notably, the true nature of how
sample size influences a study's projected scientific or practical value
precludes any meaningful blanket designation of <80% power as "inadequate". In
addition, standard calculations are inherently unreliable, and focusing only
on power neglects a completed study's most important results: estimates and
confidence intervals. Current conventions harm the research process in many
ways: promoting misinterpretation of completed studies, eroding scientific
integrity, giving reviewers arbitrary power, inhibiting innovation, perverting
ethical standards, wasting effort, and wasting money. Medical research would
benefit from alternative approaches, including established value of
information methods, simple choices based on cost or feasibility that have
recently been justified, sensitivity analyses that examine a meaningful array
of possible findings, and following previous analogous studies. To promote
more rational approaches, research training should cover the issues presented
here, peer reviewers should be extremely careful before raising issues of
"inadequate" sample size, and reports of completed studies should not discuss
power. SUMMARY: Common conventions and expectations concerning sample size are
deeply flawed, cause serious harm to the research process, and should be
replaced by more rational alternatives." [Accessed July 7, 2010].
Available at:
http://www.biomedcentral.com/1741-7015/8/17.
Peter Bacchetti. Peer review of statistics in medical research: the
other problem. BMJ. 2002;324(7348):1271-1273. Excerpt: "The process of
peer review before publication has long been criticised for failing to prevent
the publication of statistics that are wrong, unclear, or suboptimal. 1 2 My
concern here, however, is not with failing to find flaws, but with the
complementary problem of finding flaws that are not really there. My
impression as a collaborating and consulting statistician is that spurious
criticism of sound statistics is increasingly common, mainly from subject
matter reviewers with limited statistical knowledge. Of the subject matter
manuscript reviews I see that raise statistical issues, perhaps half include a
mistaken criticism. In grant reviews unhelpful statistical comments seem to be
a near certainty, mainly due to unrealistic expectations concerning sample
size planning. While funding or publication of bad research is clearly
undesirable, so is preventing the funding or publication of good research.
Responding to misguided comments requires considerable time and effort, and
poor reviews are demoralising - a subtler but possibly more serious cost."
[Accessed July 7, 2010]. Available at:
http://www.bmj.com/cgi/content/full/324/7348/1271.
Peter Bacchetti, Leslie E. Wolf, Mark R. Segal, Charles E. McCulloch.
Ethics and Sample Size. Am. J. Epidemiol. 2005;161(2):105-110.
Abstract: "The belief is widespread that studies are unethical if their sample
size is not large enough to ensure adequate power. The authors examine how
sample size influences the balance that determines the ethical acceptability
of a study: the balance between the burdens that participants accept and the
clinical or scientific value that a study can be expected to produce. The
average projected burden per participant remains constant as the sample size
increases, but the projected study value does not increase as rapidly as the
sample size if it is assumed to be proportional to power or inversely
proportional to confidence interval width. This implies that the value per
participant declines as the sample size increases and that smaller studies
therefore have more favorable ratios of projected value to participant burden.
The ethical treatment of study participants therefore does not require
consideration of whether study power is less than the conventional goal of 80%
or 90%. Lower power does not make a study unethical. The analysis addresses
only ethical acceptability, not optimality; large studies may be desirable for
other than ethical reasons." [Accessed July 7, 2010]. Available at:
http://aje.oxfordjournals.org/cgi/content/abstract/161/2/105.
Larry Goldbetter, Susan E. Davis, Paul J. MacArthur. Have You Received
an E-Book Contract Amendment? | NWU - National Writers Union. Excerpt:
"Writers across the country are receiving letters from HarperCollins, Random
House, and other publishers asking them to sign e-book amendments to their
book contracts. When reviewing an e-book amendment, there are several things
you should consider." [Accessed July 7, 2010]. Available at:
https://nwu.org/have-you-received-e-book-contract-amendment%3F.
Kilem Li Gwet. Research Papers on Inter-Rater Reliability Estimation.
Excerpt: "Below are some downloadable research papers published by Dr. Gwet
on Inter-Rater Reliability. They are all in PDF format." [Accessed July 7,
2010]. Available at:
http://www.agreestat.com/research_papers.html.
David L DeMets, Thomas R Fleming, Frank Rockhold, et al. Liability
issues for data monitoring committee members. Clinical Trials. 2004;1(6):525-531.
Abstract: "In randomized clinical trials, a data monitoring
committee (DMC) is often appointed to review interim data to determine whether
there is early convincing evidence of intervention benefit, lack of benefit or
harm to study participants. Because DMCs bear serious responsibility for
participant safety, their members may be legally liable for their actions.
Despite more than three decades of experiences with DMCs, the issues of
liability and indemnification have yet to receive appropriate attention from
either government or industry sponsors. In industry-sponsored trials, DMC
members are usually asked to sign an agreement delineating their
responsibilities and operating procedures. While these agreements may include
language on indemnification, such language sometimes protects only the sponsor
rather than the DMC members. In government-sponsored trials, there has been
even less structure, since typically there are no signed agreements regarding
DMC activities. This paper discusses these issues and suggests sample language
for indemnification agreements to protect DMC members. This type of language
should be included in DMC charters and in all consulting agreements signed by
DMC members." [Accessed July 6, 2010]. Available at:
http://ctj.sagepub.com/cgi/content/abstract/1/6/525.
June 2010
Jonathan J. Shuster. Empirical vs natural weighting in random effects
meta-analysis. Statistics in Medicine. 2010;29(12):1259-1265. Abstract:
"This article brings into serious question the validity of empirically based
weighting in random effects meta-analysis. These methods treat sample sizes as
non-random, whereas they need to be part of the random effects analysis. It
will be demonstrated that empirical weighting risks substantial bias. Two
alternate methods are proposed. The first estimates the arithmetic mean of the
population of study effect sizes per the classical model for random effects
meta-analysis. We show that anything other than an unweighted mean of study
effect sizes will risk serious bias for this targeted parameter. The second
method estimates a patient level effect size, something quite different from
the first. To prevent inconsistent estimation for this population parameter,
the study effect sizes must be weighted in proportion to their total sample
sizes for the trial. The two approaches will be presented for a meta-analysis
of a nasal decongestant, while at the same time will produce counter-intuitive
results for the DerSimonian-Laird approach, the most popular empirically based
weighted method. It is concluded that all past publications based on
empirically weighted random effects meta-analysis should be revisited to see
if the qualitative conclusions hold up under the methods proposed herein. It
is also recommended that empirically based weighted random effects
meta-analysis not be used in the future, unless strong cautions about the
assumptions underlying these analyses are stated, and at a minimum, some form
of secondary analysis based on the principles set forth in this article be
provided to supplement the primary analysis. Copyright © 2009 John Wiley &
Sons, Ltd." [Accessed June 29, 2010]. Available at:
http://dx.doi.org/10.1002/sim.3607.
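For readers who want to see the empirical weighting Shuster criticizes, here is a minimal textbook sketch of the DerSimonian-Laird estimator alongside the unweighted mean he proposes for the classical random-effects parameter. This is my own illustration, not code from the paper, and the function names are invented:

```python
import numpy as np

def dersimonian_laird(effects, variances):
    """DerSimonian-Laird random-effects pooling: the weights are estimated
    empirically from the data (the practice the paper questions)."""
    e = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                   # inverse-variance weights
    fixed = np.sum(w * e) / np.sum(w)             # fixed-effect estimate
    q = np.sum(w * (e - fixed) ** 2)              # Cochran's Q heterogeneity statistic
    k = len(e)
    tau2 = max(0.0, (q - (k - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
    w_star = 1.0 / (v + tau2)                     # random-effects weights
    return np.sum(w_star * e) / np.sum(w_star), tau2

def unweighted_mean(effects):
    """Shuster's first alternative: the plain arithmetic mean of study effects."""
    return float(np.mean(effects))
```

With effects [0.1, 0.3, 0.5] and variances [0.01, 0.04, 0.09], the DL estimate falls between the inverse-variance fixed-effect estimate and the unweighted mean of 0.3, showing how the empirical weights pull the pooled value toward the most precise studies.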
David J. Hand. Evaluating diagnostic tests: The area under the ROC
curve and the balance of errors. Statistics in Medicine.
2010;29(14):1502-1510. Abstract: "Because accurate diagnosis lies at the
heart of medicine, it is important to be able to evaluate the effectiveness of
diagnostic tests. A variety of accuracy measures are used. One particularly
widely used measure is the AUC, the area under the receiver operating
characteristic (ROC) curve. This measure has a well-understood weakness when
comparing ROC curves which cross. However, it also has the more fundamental
weakness of failing to balance different kinds of misdiagnoses effectively.
This is not merely an aspect of the inevitable arbitrariness in choosing a
performance measure, but is a core property of the way the AUC is defined.
This property is explored, and an alternative, the H measure, is described.
Copyright © 2010 John Wiley & Sons, Ltd." [Accessed June 16, 2010].
Available at:
http://dx.doi.org/10.1002/sim.3859.
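As background to Hand's critique, recall that the AUC equals the probability that a randomly chosen case receives a higher score than a randomly chosen control. A small sketch of that Mann-Whitney interpretation (my own illustration, not Hand's H measure):

```python
import numpy as np

def auc_mann_whitney(case_scores, control_scores):
    """AUC as P(score of a random case > score of a random control),
    counting ties as one half (the Mann-Whitney interpretation)."""
    diff = np.subtract.outer(np.asarray(case_scores, float),
                             np.asarray(control_scores, float))
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size
```

Hand's point is that this single number implicitly fixes a balance between the two kinds of misdiagnosis, whether or not that balance suits the clinical context.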
Steve Shiboski. Table of Calculators for Survival Outcomes.
Description: This webpage highlights several different programs for power
calculations for survival analysis. It includes a Java applet by Marc Bacsafra
and SAS macros by Joanna Shih. [Accessed June 16, 2010]. Available at:
http://cct.jhsph.edu/javamarc/index.htm.
David A. Schoenfeld. Considerations for a parallel trial where the
outcome is a time to failure. Description: This web page calculates
power for a survival analysis. You need to specify the accrual interval, the
follow-up interval, and the median time to failure in the group with the smallest
time to failure. Then you also specify two of the following three items: power,
total number of patients, and the minimal detectable hazard ratio. In an
exponential model the last term is equivalent to the ratio of median survival
times. [Accessed June 16, 2010]. Available at:
http://hedwig.mgh.harvard.edu/sample_size/time_to_event/para_time.html.
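Calculators like this one typically rest on Schoenfeld's formula for the required number of events. A rough sketch for a two-arm 1:1 trial (an approximation supplied here for illustration; the web page's exact method may differ):

```python
from math import ceil, log
from statistics import NormalDist

def schoenfeld_events(hazard_ratio, alpha=0.05, power=0.80):
    """Total number of events needed for a two-arm (1:1) logrank comparison,
    via Schoenfeld's approximation: d = 4 * (z_{1-a/2} + z_power)^2 / ln(HR)^2."""
    z = NormalDist().inv_cdf
    return ceil(4 * (z(1 - alpha / 2) + z(power)) ** 2 / log(hazard_ratio) ** 2)
```

Note that the formula determines the number of events, not patients; the accrual and follow-up intervals the page asks for are what translate events into a total sample size.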
K Akazawa, T Nakamura, Y Palesch. Power of logrank test and Cox
regression model in clinical trials with heterogeneous samples. Stat Med.
1997;16(5):583-597. Abstract: "This paper evaluates the loss of power of
the simple and stratified logrank tests due to heterogeneity of patients in
clinical trials and proposes a flexible and efficient method of estimating
treatment effects adjusting for prognostic factors. The results of the paper
are based on the analyses of survival data from a large clinical trial which
includes more than 6000 cancer patients. Major findings from the simulation
study on power are: (i) for a heterogeneous sample, such as advanced cancer
patients, a simple logrank test can yield misleading results and should not be
used; (ii) the stratified logrank test may suffer some power loss when many
prognostic factors need to be considered and the number of patients within
stratum is small. To address the problems due to heterogeneity, the Cox
regression method with a special hazard model is recommended. We illustrate
the method using data from a gastric cancer clinical trial." [Accessed
June 16, 2010]. Available at:
http://www3.interscience.wiley.com/journal/9725/abstract.
Ian Campbell. Two-by-two Methods. Excerpt: "This page expands on
the methods section published in the paper: Campbell Ian, 2007, Chi-squared
and Fisher-Irwin tests of two-by-two tables with small sample recommendations,
Statistics in Medicine, 26, 3661 - 3675." [Accessed June 14, 2010].
Available at:
http://www.iancampbell.co.uk/twobytwo/methods.htm.
P Peduzzi, J Concato, E Kemper, T R Holford, A R Feinstein. A
simulation study of the number of events per variable in logistic regression
analysis. J Clin Epidemiol. 1996;49(12):1373-1379. Abstract: "We
performed a Monte Carlo study to evaluate the effect of the number of events
per variable (EPV) analyzed in logistic regression analysis. The simulations
were based on data from a cardiac trial of 673 patients in which 252 deaths
occurred and seven variables were cogent predictors of mortality; the number
of events per predictive variable was (252/7 =) 36 for the full sample. For
the simulations, at values of EPV = 2, 5, 10, 15, 20, and 25, we randomly
generated 500 samples of the 673 patients, chosen with replacement, according
to a logistic model derived from the full sample. Simulation results for the
regression coefficients for each variable in each group of 500 samples were
compared for bias, precision, and significance testing against the results of
the model fitted to the original sample. For EPV values of 10 or greater, no
major problems occurred. For EPV values less than 10, however, the regression
coefficients were biased in both positive and negative directions; the large
sample variance estimates from the logistic model both overestimated and
underestimated the sample variance of the regression coefficients; the 90%
confidence limits about the estimated values did not have proper coverage; the
Wald statistic was conservative under the null hypothesis; and paradoxical
associations (significance in the wrong direction) were increased. Although
other factors (such as the total number of events, or sample size) may
influence the validity of the logistic model, our findings indicate that low
EPV can lead to major problems." [Accessed June 14, 2010]. Available at:
http://www.ncbi.nlm.nih.gov/pubmed/8970487.
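The paper's rule of thumb is easy to apply: divide the number of events by the number of candidate predictors. A trivial helper (my own, for illustration):

```python
def events_per_variable(n_events, n_predictors):
    """Peduzzi et al.'s rule of thumb: an EPV of 10 or more avoids the bias
    and coverage problems their simulations found at lower values."""
    epv = n_events / n_predictors
    return epv, epv >= 10
```

For the cardiac trial in the abstract, events_per_variable(252, 7) gives an EPV of 36, comfortably above the threshold.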
Steffen Mickenautsch. Systematic reviews, systematic error and the
acquisition of clinical knowledge. BMC Medical Research Methodology.
2010;10(1):53. Abstract: "BACKGROUND: Since its inception, evidence-based
medicine and its application through systematic reviews, has been widely
accepted. However, it has also been strongly criticised and resisted by some
academic groups and clinicians. One of the main criticisms of evidence-based
medicine is that it appears to claim to have unique access to absolute
scientific truth and thus devalues and replaces other types of knowledge
sources. DISCUSSION: The various types of clinical knowledge sources are
categorised on the basis of Kant's categories of knowledge acquisition, as
being either 'analytic' or 'synthetic'. It is shown that these categories do
not act in opposition but rather, depend upon each other. The unity of
analysis and synthesis in knowledge acquisition is demonstrated during the
process of systematic reviewing of clinical trials. Systematic reviews
constitute comprehensive synthesis of clinical knowledge but depend upon
plausible, analytical hypothesis development for the trials reviewed. The
dangers of systematic error regarding the internal validity of acquired
knowledge are highlighted on the basis of empirical evidence. It has been
shown that the systematic review process reduces systematic error, thus
ensuring high internal validity. It is argued that this process does not
exclude other types of knowledge sources. Instead, amongst these other types
it functions as an integrated element during the acquisition of clinical
knowledge. CONCLUSIONS: The acquisition of clinical knowledge is based on the
interaction between analysis and synthesis. Systematic reviews provide the
highest form of synthetic knowledge acquisition in terms of achieving internal
validity of results. In that capacity it informs the analytic knowledge of the
clinician but does not replace it." [Accessed June 14, 2010]. Available
at:
http://www.biomedcentral.com/1471-2288/10/53.
Beth Woods, Neil Hawkins, David Scott. Network meta-analysis on the
log-hazard scale, combining count and hazard ratio statistics accounting for
multi-arm trials: A tutorial. BMC Medical Research Methodology.
2010;10(1):54. Abstract: "BACKGROUND: Data on survival endpoints are
usually summarised using either hazard ratio, cumulative number of events, or
median survival statistics. Network meta-analysis, an extension of traditional
pairwise meta-analysis, is typically based on a single statistic. In this
case, studies which do not report the chosen statistic are excluded from the
analysis which may introduce bias. METHODS: In this paper we present a
tutorial illustrating how network meta-analyses of survival endpoints can
combine count and hazard ratio statistics in a single analysis on the hazard
ratio scale. We also describe methods for accounting for the correlations in
relative treatment effects (such as hazard ratios) that arise in trials with
more than two arms. Combination of count and hazard ratio data in a single
analysis is achieved by estimating the cumulative hazard for each trial arm
reporting count data. Correlation in relative treatment effects in multi-arm
trials is preserved by converting the relative treatment effect estimates (the
hazard ratios) to arm-specific outcomes (hazards). RESULTS: A worked example
of an analysis of mortality data in chronic obstructive pulmonary disease (COPD)
is used to illustrate the methods. The data set and WinBUGS code for fixed and
random effects models are provided. CONCLUSIONS: By incorporating all data
presentations in a single analysis, we avoid the potential selection bias
associated with conducting an analysis for a single statistic and the
potential difficulties of interpretation, misleading results and loss of
available treatment comparisons associated with conducting separate analyses
for different summary statistics." [Accessed June 14, 2010]. Available at:
http://www.biomedcentral.com/1471-2288/10/54.
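The key conversion the tutorial describes, turning count data into a (log) cumulative hazard, can be sketched as follows, assuming a common follow-up time so that H = -ln(S) with S = 1 - events/n. This is a simplification for illustration; the paper's WinBUGS code is the authoritative implementation:

```python
from math import log

def log_cumulative_hazard(events, n_at_risk):
    """Convert count data (events out of n over a common follow-up period)
    to a log cumulative hazard, assuming H = -ln(S) with S = 1 - events/n."""
    survival = 1.0 - events / n_at_risk
    return log(-log(survival))
```

Once each count-reporting arm is on the log-hazard scale, it can enter the same network meta-analysis model as the arms reporting hazard ratios.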
Osamu Komori, Shinto Eguchi. A boosting method for maximizing the
partial area under the ROC curve. BMC Bioinformatics. 2010;11(1):314.
Abstract: "BACKGROUND: The receiver operating characteristic (ROC) curve is a
fundamental tool to assess the discriminant performance for not only a single
marker but also a score function combining multiple markers. The area under
the ROC curve (AUC) for a score function measures the intrinsic ability for
the score function to discriminate between the controls and cases. Recently,
the partial AUC (pAUC) has been paid more attention than the AUC, because a
suitable range of the false positive rate can be focused according to various
clinical situations. However, existing pAUC-based methods only handle a few
markers and do not take nonlinear combination of markers into consideration.
RESULTS: We have developed a new statistical method that focuses on the pAUC
based on a boosting technique. The markers are combined componentially for
maximizing the pAUC in the boosting algorithm using natural cubic splines or
decision stumps (single-level decision trees), according to the values of
markers (continuous or discrete). We show that the resulting score plots are
useful for understanding how each marker is associated with the outcome
variable. We compare the performance of the proposed boosting method with
those of other existing methods, and demonstrate the utility using real data
sets. As a result, we have much better discrimination performances in the
sense of the pAUC in both simulation studies and real data analysis.
CONCLUSIONS: The proposed method addresses how to combine the markers after a
pAUC-based filtering procedure in high dimensional setting. Hence, it provides
a consistent way of analyzing data based on the pAUC from marker selection to
marker combination for discrimination problems. The method can capture not
only linear but also nonlinear association between the outcome variable and
the markers, about which the nonlinearity is known to be necessary in general
for the maximization of the pAUC. The method also puts importance on the
accuracy of classification performance as well as interpretability of the
association, by offering simple and smooth resultant score plots for each
marker." [Accessed June 14, 2010]. Available at:
http://www.biomedcentral.com/1471-2105/11/314.
Luis Carlos Silva-Aycaguer, Patricio Suarez-Gil, Ana Fernandez-Somoano.
The null hypothesis significance test in health sciences research (1995-2006):
statistical analysis and interpretation. BMC Medical Research Methodology.
2010;10(1):44. Abstract: "BACKGROUND: The null hypothesis significance test
(NHST) is the most frequently used statistical method, although its
inferential validity has been widely criticized since its introduction. In
1988, the International Committee of Medical Journal Editors (ICMJE) warned
against sole reliance on NHST to substantiate study conclusions and suggested
supplementary use of confidence intervals (CI). Our objective was to evaluate
the extent and quality in the use of NHST and CI, both in English and Spanish
language biomedical publications between 1995 and 2006, taking into account
the International Committee of Medical Journal Editors recommendations, with
particular focus on the accuracy of the interpretation of statistical
significance and the validity of conclusions. METHODS: Original articles
published in three English and three Spanish biomedical journals in three
fields (General Medicine, Clinical Specialties and Epidemiology - Public
Health) were considered for this study. Papers published in 1995-1996,
2000-2001, and 2005-2006 were selected through a systematic sampling method.
After excluding the purely descriptive and theoretical articles, analytic
studies were evaluated for their use of NHST with P-values and/or CI for
interpretation of statistical "significance" and "relevance" in study
conclusions. RESULTS: Among 1,043 original papers, 874 were selected for
detailed review. The exclusive use of P-values was less frequent in English
language publications as well as in Public Health journals; overall such use
decreased from 41 % in 1995-1996 to 21% in 2005-2006. While the use of CI
increased over time, the "significance fallacy" (to equate statistical and
substantive significance) appeared very often, mainly in journals devoted to
clinical specialties (81%). In papers originally written in English and
Spanish, 15% and 10%, respectively, mentioned statistical significance in
their conclusions. CONCLUSIONS: Overall, results of our review show some
improvements in statistical management of statistical results, but further
efforts by scholars and journal editors are clearly required to move the
communication toward ICMJE advices, especially in the clinical setting, which
seems to be imperative among publications in Spanish." [Accessed June 14,
2010]. Available at:
http://www.biomedcentral.com/1471-2288/10/44.
Rolf Groenwold, Maroeska Rovers, Jacobus Lubsen, Geert van der Heijden.
Subgroup effects despite homogeneous heterogeneity test results. BMC
Medical Research Methodology. 2010;10(1):43. Abstract: "BACKGROUND:
Statistical tests of heterogeneity are very popular in meta-analyses, as
heterogeneity might indicate subgroup effects. Lack of demonstrable
statistical heterogeneity, however, might obscure clinical heterogeneity,
meaning clinically relevant subgroup effects. METHODS: A qualitative, visual
method to explore the potential for subgroup effects was provided by a
modification of the forest plot, i.e., adding a vertical axis indicating the
proportion of a subgroup variable in the individual trials. Such a plot was
used to assess the potential for clinically relevant subgroup effects and was
illustrated by a clinical example on the effects of antibiotics in children
with acute otitis media. RESULTS: Statistical tests did not indicate
heterogeneity in the meta-analysis on the effects of amoxicillin on acute
otitis media (Q=3.29, p=0.51; I²=0%; τ²=0). Nevertheless, in a modified forest
plot, in which the individual trials were ordered by the proportion of
children with bilateral otitis, a clear relation between bilaterality and
treatment effects was observed (which was also found in an individual patient
data meta-analysis of the included trials: p-value for interaction 0.021).
CONCLUSIONS: A modification of the forest plot, by including an additional
(vertical) axis indicating the proportion of a certain subgroup variable, is a
qualitative, visual, and easy-to-interpret method to explore potential
subgroup effects in studies included in meta-analyses." [Accessed June 14,
2010]. Available at:
http://www.biomedcentral.com/1471-2288/10/43.
Karin Velthove, Hubert Leufkens, Patrick Souverein, Rene Schweizer, Wouter
van Solinge. Testing bias in clinical databases: methodological
considerations. Emerging Themes in Epidemiology. 2010;7(1):2. Abstract:
"BACKGROUND: Laboratory testing in clinical practice is never a random
process. In this study we evaluated testing bias for neutrophil counts in
clinical practice by using results from requested and non-requested
hematological blood tests. METHODS: This study was conducted using data from
the Utrecht Patient Oriented Database, a unique clinical database as it
contains physician requested data, but also data that are not requested by the
physician, but measured as result of requesting other hematological
parameters. We identified adult patients, hospitalized in 2005 with at least
two blood tests during admission, where requests for general blood profiles
and specifically for neutrophil counts were contrasted in scenario analyses.
Possible effect modifiers were diagnosis and glucocorticoid use. RESULTS: A
total of 567 patients with requested neutrophil counts and 1,439 patients with
non-requested neutrophil counts were analyzed. The absolute neutrophil count
at admission differed, with a mean of 7.4 x 10^9/l for requested counts and
8.3 x 10^9/l for non-requested counts (p-value <0.001). This difference could be
explained for 83.2% by the occurrence of cardiovascular disease as underlying
disease and for 4.5% by glucocorticoid use. CONCLUSION: Requests for
neutrophil counts in clinical databases are associated with underlying disease
and with cardiovascular disease in particular. The results from our study show
the importance of evaluating testing bias in epidemiological studies obtaining
data from clinical databases." [Accessed June 14, 2010]. Available at:
http://www.ete-online.com/content/7/1/2.
Physicians for Human Rights. The Torture Reports. Excerpt:
"Experiments in Torture is the first report to reveal evidence indicating that
CIA medical personnel allegedly engaged in the crime of illegal
experimentation after 9/11, in addition to the previously disclosed crime of
torture. In their attempt to justify the war crime of torture, the CIA appears
to have committed another alleged war crime: illegal experimentation on
prisoners." [Accessed June 10, 2010]. Available at:
http://phrtorturepapers.org/.
Scott Aberegg, D Roxanne Richards, James O'Brien. Delta inflation: a
bias in the design of randomized controlled trials in critical care medicine.
Critical Care. 2010;14(2):R77. Abstract: "INTRODUCTION: Mortality is the
most widely accepted outcome measure in randomized controlled trials of
therapies for critically ill adults, but most of these trials fail to show a
statistically significant mortality benefit. The reasons for this are unknown.
METHODS: We searched five high impact journals (Annals of Internal Medicine,
British Medical Journal, JAMA, The Lancet, New England Journal of Medicine)
for randomized controlled trials comparing mortality of therapies for
critically ill adults over a ten year period. We abstracted data on the
statistical design and results of these trials to compare the predicted delta
(delta; the effect size of the therapy compared to control expressed as an
absolute mortality reduction) to the observed delta to determine if there is a
systematic overestimation of predicted delta that might explain the high
prevalence of negative results in these trials. RESULTS: We found 38 trials
meeting our inclusion criteria. Only 5/38 (13.2%) of the trials provided
justification for the predicted delta. The mean predicted delta among the 38
trials was 10.1% and the mean observed delta was 1.4% (P<0.0001), resulting in
a delta-gap of 8.7%. In only 2/38 (5.3%) of the trials did the observed delta
exceed the predicted delta and only 7/38 (18.4%) of the trials demonstrated
statistically significant results in the hypothesized direction; these trials
had smaller delta-gaps than the remainder of the trials (delta-gap 0.9% versus
10.5%; P<0.0001). For trials showing non-significant trends toward benefit
greater than 3%, large increases in sample size (380% - 1100%) would be
required if repeat trials use the observed delta from the index trial as the
predicted delta for a follow-up study. CONCLUSIONS: Investigators of therapies
for critical illness systematically overestimate treatment effect size (delta)
during the design of randomized controlled trials. This bias, which we refer
to as "delta inflation", is a potential reason that these trials have a high
rate of negative results." [Accessed June 9, 2010]. Available at:
http://ccforum.com/content/14/2/R77.
May 2010
Alex Guazzelli, Michael Zeller, Wen-Ching Lin, Graham Williams. PMML:
An Open Standard for Sharing Models. The R Journal. 2009;1(1):60-65.
Excerpt: "The PMML package exports a variety of predictive and descriptive
models from R to the Predictive Model Markup Language (Data Mining Group,
2008). PMML is an XML-based language and has become the de-facto standard to
represent not only predictive and descriptive models, but also data pre- and
post-processing. In so doing, it allows for the interchange of models among
different tools and environments, mostly avoiding proprietary issues and
incompatibilities." [Accessed May 29, 2010]. Available at:
http://journal.r-project.org/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf.
Wim Van Biesen, Francis Verbeke, Raymond Vanholder. An infallible
recipe? A story of cinnamon, souffle and meta-analysis. Nephrol. Dial.
Transplant. 2008;23(9):2729-2732. Excerpt: "Meta-analyses certainly do have
their place in scientific research. Like herbs, if used in the correct dish,
and not too much or too often, they can give that extra bit of flavour that
turns 'food' into a 'delicious dish'. However, meta-analyses are like
cinnamon: very tasteful in small quantities and in the right dish, but if you
use them too much or in the wrong dish, it ruins all other flavours and you
get nausea. Just as for the cinnamon, it requires skills and insight to know
when and how to use a meta-analysis." [Accessed May 27, 2010]. Available
at:
http://ndt.oxfordjournals.org/cgi/content/full/23/9/2729.
Committee on Strategies for Small-Number-Participant Clinical Research
Trials, Board on Health Sciences Policy. Small Clinical Trials: Issues and
Challenges. Washington, D.C.: The National Academies Press; 2001.
Abstract: "Scientific research has a long history of using well-established,
well documented, and validated methods for the design, conduct, and analysis
of clinical trials. A study design that is considered appropriate includes
sufficient sample size (n) and statistical power and proper control of bias to
allow a meaningful interpretation of the results. Whenever feasible, clinical
trials should be designed and performed so that they have adequate statistical
power. However, when the clinical context does not provide a sufficient number
of research participants for a trial with adequate statistical power but the
research question has great clinical significance, research can still proceed
under certain conditions. Small clinical trials might be warranted for the
study of rare diseases, unique study populations (e.g., astronauts),
individually tailored therapies, in environments that are isolated, in
emergency situations, and in instances of public health urgency. Properly
designed trials with small sample sizes may provide substantial evidence of
efficacy and are especially appropriate in particular situations. However, the
conclusions derived from such studies may require careful consideration of the
assumptions and inferences, given the small number of participants. Bearing in
mind the statistical power, precision, and validity limitations of trials with
small sample sizes, there are innovative design and analysis approaches that
can improve the quality of such trials. A number of trial designs especially
lend themselves to use in studies with small sample sizes, including one
subject (n-of-1) designs, sequential designs, 'within-subject' designs,
decision analysis-based designs, ranking and selection designs, adaptive
designs, and risk-based allocation designs. Data analysis for trials with
small numbers of participants in particular must be focused. In general,
certain types of analyses are more amenable to studies with small numbers of
participants, including sequential analysis, hierarchical analysis, Bayesian
analysis, decision analysis, statistical prediction, meta-analysis, and
risk-based allocation. Because of the constraints of conducting research with
small sample sizes, the committee makes recommendations in several areas:
defining the research question, tailoring the study design by giving careful
consideration to alternative methods, clarifying sample characteristics and
methods for the reporting of results of clinical trials with small sample
sizes, performing corroborative analyses to evaluate the consistency and
robustness of the results of clinical trials with small sample sizes, and
exercising caution in the interpretation of the results before attempting to
extrapolate or generalize the findings of clinical trials with small sample
sizes. The committee also recommends that more research be conducted on the
development and evaluation of alternative experimental designs and analysis
methods for trials with small sample sizes." Available at:
http://www.nap.edu/catalog.php?record_id=10078.
C David Naylor. Meta-analysis and the meta-epidemiology of clinical
research. BMJ. 1997;315:617-9. Excerpt: "This week's BMJ contains a
pot-pourri of materials that deal with the research methodology of
meta-analysis. Meta-analysis in clinical research is based on simple
principles: systematically searching out, and, when possible, quantitatively
combining the results of all studies that have addressed a similar research
question. Given the information explosion in clinical research, the logic of
basing research reviews on systematic searching and careful quantitative
compilation of study results is incontrovertible. However, one aspect of
meta-analysis as applied to randomised trials has always been controversial:
combining data from multiple studies into single estimates of treatment
effect." [Accessed May 19, 2010]. Available at:
http://www.bmj.com/cgi/content/extract/315/7109/617.
ST Brookes, E Whitley, TJ Peters, et al. Subgroup analyses in
randomised controlled trials: quantifying the risks of false-positives and
false-negatives. Excerpt: "Subgroup analyses are common in randomised
controlled trials (RCTs). There are many easily accessible guidelines on the
selection and analysis of subgroups but the key messages do not seem to be
universally accepted and inappropriate analyses continue to appear in the
literature. This has potentially serious implications because erroneous
identification of differential subgroup effects may lead to inappropriate
provision or withholding of treatment." [Accessed May 19, 2010]. Available
at:
http://www.hta.ac.uk/execsumm/summ533.shtml.
C Bartlett, L Doyal, S Ebrahim, et al. The causes and effects of
socio-demographic exclusions from clinical trials. Excerpt: "The
exclusion from trials of people likely to be in need of or to benefit from an
intervention could compromise the trials' generalisability. We investigated
the exclusion of women, older people and minority ethnic groups, focusing on
two drug exemplars, statins and non-steroidal anti-inflammatory drugs (NSAIDs)."
[Accessed May 19, 2010]. Available at:
http://www.hta.ac.uk/execsumm/summ938.shtml.
Leon Bax, Noriaki Ikeda, Naohito Fukui, et al. More Than Numbers: The
Power of Graphs in Meta-Analysis. Am. J. Epidemiol. 2009;169(2):249-255.
Abstract: "In meta-analysis, the assessment of graphs is widely used in an
attempt to identify or rule out heterogeneity and publication bias. A variety
of graphs are available for this purpose. To date, however, there has been no
comparative evaluation of the performance of these graphs. With the objective
of assessing the reproducibility and validity of graph ratings, the authors
simulated 100 meta-analyses from 4 scenarios that covered situations with and
without heterogeneity and publication bias. From each meta-analysis, the
authors produced 11 types of graphs (box plot, weighted box plot, standardized
residual histogram, normal quantile plot, forest plot, 3 kinds of funnel
plots, trim-and-fill plot, Galbraith plot, and L'Abbe plot), and 3 reviewers
assessed the resulting 1,100 plots. The intraclass correlation coefficients (ICCs)
for reproducibility of the graph ratings ranged from poor (ICC = 0.34) to high
(ICC = 0.91). Ratings of the forest plot and the standardized residual
histogram were best associated with parameter heterogeneity. Association
between graph ratings and publication bias (censorship of studies) was poor.
Meta-analysts should be selective in the graphs they choose for the
exploration of their data." [Accessed May 19, 2010]. Available at:
http://aje.oxfordjournals.org/cgi/content/abstract/169/2/249.
Ylian Liem, John Wong, MG Myriam Hunink, Frank de Charro, Wolfgang
Winkelmayer. Propensity scores in the presence of effect modification: A
case study using the comparison of mortality on hemodialysis versus peritoneal
dialysis. Emerging Themes in Epidemiology. 2010;7(1):1. Abstract:
"Purpose: To control for confounding bias from non-random treatment assignment
in observational data, both traditional multivariable models and more recently
propensity score approaches have been applied. Our aim was to compare a
propensity score-stratified model with a traditional multivariable-adjusted
model, specifically in estimating survival of hemodialysis (HD) versus
peritoneal dialysis (PD) patients. METHODS: Using the Dutch End-Stage Renal
Disease Registry, we constructed a propensity score, predicting PD assignment
from age, gender, primary renal disease, center of dialysis, and year of first
renal replacement therapy. We developed two Cox proportional hazards
regression models to estimate survival on PD relative to HD, a propensity
score-stratified model stratifying on the propensity score and a
multivariable-adjusted model, and tested several interaction terms in both
models. RESULTS: The propensity score performed well: it showed a reasonable
fit, had a good c-statistic, calibrated well and balanced the covariates. The
main-effects multivariable-adjusted model and the propensity score-stratified
univariable Cox model resulted in similar relative mortality risk estimates of
PD compared with HD (0.99 and 0.97, respectively) with fewer significant
covariates in the propensity model. After introducing the missing interaction
variables for effect modification in both models, the mortality risk estimates
for both main effects and interactions remained comparable, but the propensity
score model had nearly as many covariates because of the additional
interaction variables. CONCLUSION: Although the propensity score performed
well, it did not alter the treatment effect in the outcome model and lost its
advantage of parsimony in the presence of effect modification." [Accessed
May 18, 2010]. Available at:
http://www.ete-online.com/content/7/1/1.
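The stratification step behind the propensity-score-stratified model that Liem et al. compare can be sketched simply. The toy data below are entirely synthetic (the field names and the quintile scheme are my assumptions, not theirs), and the propensity scores are taken as already estimated; the point is only the mechanics of stratifying on the score and pooling within-stratum effects:

```python
import random
random.seed(1)

# Synthetic illustration of propensity-score stratification: each record has
# a propensity score "ps" (probability of receiving the treatment), a
# treatment flag "t", and a binary outcome "y". Stratifying on score
# quintiles and averaging the within-stratum risk differences is the basic
# idea behind a propensity-stratified analysis (a sketch, not the authors'
# Cox model).
def quintile(score, cutpoints):
    return sum(score > c for c in cutpoints)

def stratified_risk_difference(records):
    scores = sorted(r["ps"] for r in records)
    n = len(scores)
    cuts = [scores[int(n * q)] for q in (0.2, 0.4, 0.6, 0.8)]
    diffs = []
    for s in range(5):
        stratum = [r for r in records if quintile(r["ps"], cuts) == s]
        treated = [r["y"] for r in stratum if r["t"] == 1]
        control = [r["y"] for r in stratum if r["t"] == 0]
        if treated and control:
            diffs.append(sum(treated) / len(treated)
                         - sum(control) / len(control))
    return sum(diffs) / len(diffs)  # unweighted average over strata

records = [{"ps": random.random()} for _ in range(2000)]
for r in records:
    r["t"] = 1 if random.random() < r["ps"] else 0   # non-random assignment
    r["y"] = 1 if random.random() < 0.3 else 0       # outcome independent of t
print(round(stratified_risk_difference(records), 3))
```

Because the simulated outcome is independent of treatment, the stratified risk difference should land near zero despite the non-random treatment assignment, which is the confounding-control property the paper examines.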
David L Streiner. Missing data and the trouble with LOCF. Evidence
Based Mental Health. 2008;11(1):3-5. Excerpt: "Missing data are the bane of
all clinical research. With the possible exception of the CAPRIE trial, in
which the investigators went to extraordinary lengths that enabled them to
follow up 99.8% of their 19 000 participants over two years, it is highly
unusual for a study to end with complete data on all subjects. There are many
reasons for this: a person may omit an item on a questionnaire or refuse to
complete it entirely; a vial of blood may be dropped or the analyser fail to
function one day; or a participant may not appear for his or her appointment.
Longitudinal studies (those that follow participants over time) can be subject
to all of these mishaps, but now the problem is magnified in that they could
happen at each of the assessment sessions; in addition to which, participants
may drop out of the study entirely before all the data are collected.
Furthermore, the more sophisticated, multivariable statistical techniques that
use two or more variables in the same analysis, such as multiple regression or
factor analysis, make the problem even worse, in that most of them require
complete data for all of the subjects. If a person is missing one variable out
of the, say, 10 that are being analysed, then that subject is dropped entirely
from the analysis. Simulations have shown that if as little as 10% of the data
is missing, as many as 60% of the subjects could be eliminated." [Accessed
May 6, 2010]. Available at:
http://ebmh.bmj.com/content/11/1/3.2.short.
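Streiner's closing figures are easy to verify: with 10 variables and each value independently missing with probability 0.10, a subject has complete data with probability 0.9^10, about 35%, so roughly 65% of subjects are lost to a complete-case analysis. A quick simulation (my own sketch, assuming missingness completely at random) reproduces this:

```python
import random
random.seed(0)

# Complete-case attrition: chance that a subject with n_vars variables,
# each independently missing with probability p_missing, has no gaps.
P_COMPLETE = 0.9 ** 10  # analytic value, about 0.349

def complete_case_fraction(n_subjects=100_000, n_vars=10, p_missing=0.10):
    complete = sum(
        all(random.random() >= p_missing for _ in range(n_vars))
        for _ in range(n_subjects)
    )
    return complete / n_subjects

print(P_COMPLETE, complete_case_fraction())
```

The simulated fraction converges on the analytic 0.9^10, confirming that even a modest 10% per-variable missingness rate can eliminate well over half the sample.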
Fred Andersen, Torgeir Engstad, Bjorn Straume, et al. Recruitment
methods in Alzheimer's disease research: general practice versus population
based screening by mail. BMC Medical Research Methodology. 2010;10(1):35.
Abstract: "BACKGROUND: In Alzheimer's disease (AD) research patients are
usually recruited from clinical practice, memory clinics or nursing homes.
Lack of standardised inclusion and diagnostic criteria is a major concern in
current AD studies. The aim of the study was to explore whether patient
characteristics differ between study samples recruited from general practice
and from a population based screening by mail within the same geographic areas
in rural Northern Norway. METHODS: An interventional study in nine
municipalities with 70000 inhabitants was designed. Patients were recruited
from general practice or by population based screening of cognitive function
by mail. We sent a questionnaire to 11807 individuals [greater than or equal
to] 65 years of age of whom 3767 responded. Among these, 438 individuals whose
answers raised a suspicion of cognitive impairment were invited to extended
cognitive testing and a clinical examination. Descriptive statistics,
chi-square, independent sample t-test and analyses of covariance adjusted for
possible confounders were used. RESULTS: The final study samples included 100
patients recruited by screening and 87 from general practice. Screening
through mail recruited younger and more self-reliant male patients with a
higher MMSE sum score, whereas older women with more severe cognitive
impairment were recruited from general practice. Adjustment for age did not
alter the statistically significant differences of cognitive function,
self-reliance and gender distribution between patients recruited by screening
and from general practice. CONCLUSIONS: Different recruitment procedures of
individuals with cognitive impairment provided study samples with different
demographic characteristics. Initial cognitive screening by mail, preceding
extended cognitive testing and clinical examination may be a suitable
recruitment strategy in studies of early stage AD. Registration:
ClinicalTrials.gov Identifier: NCT00443014" [Accessed May 6, 2010].
Available at:
http://www.biomedcentral.com/1471-2288/10/35.
Michel Chavance, Sylvie Escolano, Monique Romon, et al. Latent
variables and structural equation models for longitudinal relationships: an
illustration in nutritional epidemiology. BMC Medical Research
Methodology. 2010;10(1):37. Abstract: "BACKGROUND: The use of structural
equation modeling and latent variables remains uncommon in epidemiology
despite its potential usefulness. The latter was illustrated by studying
cross-sectional and longitudinal relationships between eating behavior and
adiposity, using four different indicators of fat mass. METHODS: Using data
from a longitudinal community-based study, we fitted structural equation
models including two latent variables (respectively baseline adiposity and
adiposity change after 2 years of follow-up), each being defined by the four
following anthropometric measurements (respectively by their changes): body
mass index, waist circumference, skinfold thickness and percent body fat.
Latent adiposity variables were hypothesized to depend on a cognitive
restraint score, calculated from answers to an eating-behavior questionnaire
(TFEQ-18), either cross-sectionally or longitudinally. RESULTS: We found that
high baseline adiposity was associated with a 2-year increase of the cognitive
restraint score and no convincing relationship between baseline cognitive
restraint and 2-year adiposity change could be established. CONCLUSIONS: The
latent variable modeling approach enabled presentation of synthetic results
rather than separate regression models and detailed analysis of the causal
effects of interest. In the general population, restrained eating appears to
be an adaptive response of subjects prone to gaining weight more than as a
risk factor for fat-mass increase." [Accessed May 6, 2010]. Available at:
http://www.biomedcentral.com/1471-2288/10/37.
Jonathan Graffy, Peter Bower, Elaine Ward, et al. Trials within trials?
Researcher, funder and ethical perspectives on the practicality and
acceptability of nesting trials of recruitment methods in existing primary
care trials. BMC Medical Research Methodology. 2010;10(1):38. Abstract:
"BACKGROUND: Trials frequently encounter difficulties in recruitment, but
evidence on effective recruitment methods in primary care is sparse. A robust
test of recruitment methods involves comparing alternative methods using a
randomized trial, 'nested' in an ongoing 'host' trial. There are potential
scientific, logistical and ethical obstacles to such studies. METHOD:
Telephone interviews were undertaken with four groups of stakeholders (funders,
principal investigators, trial managers and ethics committee chairs) to
explore their views on the practicality and acceptability of undertaking
nested trials of recruitment methods. These semi-structured interviews were
transcribed and analyzed thematically. RESULTS: Twenty people were
interviewed. Respondents were familiar with recruitment difficulties in
primary care and recognised the case for 'nested' studies to build an evidence
base on effective recruitment strategies. However, enthusiasm for this global
aim was tempered by the challenges of implementation. Challenges for host
studies included increasing complexity and management burden; compatibility
between the host and nested study; and the impact of the nested study on trial
design and relationships with collaborators. For nested recruitment studies,
there were concerns that host study investigators might have strong
preferences, limiting the nested study investigators' control over their
research, and also concerns about sample size which might limit statistical
power. Nested studies needed to be compatible with the main trial and should
be planned from the outset. Good communication and adequate resources were
seen as important. CONCLUSIONS: Although research on recruitment was welcomed
in principle, the issue of which study had control of key decisions emerged as
critical. To address this concern, it appeared important to align the
interests of both host and nested studies and to reduce the burden of hosting
a recruitment trial. These findings should prove useful in devising a
programme of research involving nested studies of recruitment interventions."
[Accessed May 6, 2010]. Available at:
http://www.biomedcentral.com/1471-2288/10/38.
Kuna Gupta, Jyotsna Gupta, Sukhdeep Singh. Surrogate Endpoints: How
Reliable Are They? 2010. Excerpt: "Surrogate endpoints offer three main
advantages to clinical studies: The study becomes simpler. Since surrogates
are usually measures of symptoms or laboratory biomarkers, they make it easier
to quantify comparisons. The study becomes shorter. It generally takes less
time to see the effect of an intervention on a surrogate than on the final
clinical outcome, especially if the surrogate marks an intermediate point in
the disease process. The study becomes less expensive. Since the study
duration is shorter, the cost decreases. Measurement of the surrogate may be
less costly than measurement of the true outcome. In addition, waiting for a
clinical outcome may involve more medical care for sicker patients."
[Accessed May 3, 2010]. Available at:
http://www.firstclinical.com/journal/2010/1005_Surrogate.pdf.
Peter B. Gilkey. Questionaire. Excerpt: "You are no doubt aware
that the number of questionnaires circulated is rapidly increasing, whereas
the length of the working day has at best remained constant. In order to
resolve the problem presented by this trend, I find it necessary to restrict
my replies to questionnaires to those questioners who first establish their
bona fide by completing the following questionnaire. Please fill it out and
return it to me electronically. This will help me compile a profile of people
who compile profiles." [Accessed May 1, 2010]. Available at:
http://www.uoregon.edu/~gilkey/dirhumor/questionaire.html.
Gary Wolf. The Data-Driven Life. The New York Times. 2010.
Excerpt: "And yet, almost imperceptibly, numbers are infiltrating the last
redoubts of the personal. Sleep, exercise, sex, food, mood, location,
alertness, productivity, even spiritual well-being are being tracked and
measured, shared and displayed. On MedHelp, one of the largest Internet forums
for health information, more than 30,000 new personal tracking projects are
started by users every month. Foursquare, a geo-tracking application with
about one million users, keeps a running tally of how many times players
"check in" at every locale, automatically building a detailed diary of
movements and habits; many users publish these data widely. Nintendo's Wii
Fit, a device that allows players to stand on a platform, play physical games,
measure their body weight and compare their stats, has sold more than 28
million units." [Accessed May 1, 2010]. Available at:
http://www.nytimes.com/2010/05/02/magazine/02self-measurement-t.html.
EuSpRIG. European Spreadsheet Risks Interest Group - spreadsheet risk
management and solutions conference. Excerpt: "EuSpRIG is the largest
source of information on practical methods for introducing into organisations
processes and methods to inventory, test, correct, document, backup, archive,
compare and control the legions of spreadsheets that support critical
corporate infrastructure." [Accessed May 1, 2010]. Available at:
http://www.eusprig.org/.
Laura Rosen, Michal Ben Noach, Elliot Rosenberg. Missing the forest
(plot) for the trees? A critique of the systematic review in tobacco control.
BMC Medical Research Methodology. 2010;10(1):34. Abstract: "BACKGROUND: The
systematic review (SR) lies at the core of evidence-based medicine. While it
may appear that the SR provides a reliable summary of existing evidence,
standards of SR conduct differ. The objective of this research was to examine
systematic review (SR) methods used by the Cochrane Collaboration ("Cochrane")
and the Task Force on Community Preventive Services ("the Guide") for
evaluation of effectiveness of tobacco control interventions. METHODS: We
searched for all reviews of tobacco control interventions published by
Cochrane (4th quarter 2008) and the Guide. We recorded design rigor of
included studies, data synthesis method, and setting. RESULTS: About a third
of the Cochrane reviews and two thirds of the Guide reviews of interventions
in the community setting included uncontrolled trials. Most (74%) Cochrane
reviews in the clinical setting, but few (15%) in the community setting,
provided pooled estimates from RCTs. Cochrane often presented the community
results narratively. The Guide did not use inferential statistical approaches
to assessment of effectiveness. CONCLUSIONS: Policy makers should be aware
that SR methods differ, even among leading producers of SRs and among settings
studied. The traditional SR approach of using pooled estimates from RCTs is
employed frequently for clinical but infrequently for community-based
interventions. The common lack of effect size estimates and formal tests of
significance limit the contribution of some reviews to evidence-based decision
making. Careful exploration of data by subgroup, and appropriate use of random
effects models, may assist researchers in overcoming obstacles to pooling
data." [Accessed May 1, 2010]. Available at:
http://www.biomedcentral.com/1471-2288/10/34.
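The "random effects models" that Rosen et al. recommend for pooling heterogeneous community studies are most commonly fitted with the DerSimonian-Laird method: estimate the between-study variance tau^2 from Cochran's Q, then re-weight each study by the inverse of its total (within- plus between-study) variance. A minimal sketch, with made-up effect sizes and variances:

```python
# DerSimonian-Laird random-effects pooling (a standard method, sketched here
# with illustrative numbers, not data from the cited review).
def dersimonian_laird(effects, variances):
    w = [1 / v for v in variances]             # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    k = len(effects)
    tau2 = max(0.0, (q - (k - 1)) / (sw - sum(wi ** 2 for wi in w) / sw))
    wr = [1 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(wr, effects)) / sum(wr)
    return pooled, tau2

pooled, tau2 = dersimonian_laird([0.1, 0.3, -0.2, 0.5], [0.04, 0.05, 0.06, 0.05])
print(round(pooled, 3), round(tau2, 4))
```

When the studies are homogeneous, Q falls below k-1, tau^2 is truncated to zero, and the method reduces to fixed-effect inverse-variance pooling.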
April 2010
H. Gilbert Welch, William C. Black. Overdiagnosis in Cancer. J.
Natl. Cancer Inst. 2010:djq099. Abstract: "This article summarizes the
phenomenon of cancer overdiagnosis--the diagnosis of a "cancer" that would
otherwise not go on to cause symptoms or death. We describe the two
prerequisites for cancer overdiagnosis to occur: the existence of a silent
disease reservoir and activities leading to its detection (particularly cancer
screening). We estimated the magnitude of overdiagnosis from randomized
trials: about 25% of mammographically detected breast cancers, 50% of chest
x-ray and/or sputum-detected lung cancers, and 60% of prostate-specific
antigen-detected prostate cancers. We also review data from observational
studies and population-based cancer statistics suggesting overdiagnosis in
computed tomography-detected lung cancer, neuroblastoma, thyroid cancer,
melanoma, and kidney cancer. To address the problem, patients must be
adequately informed of the nature and the magnitude of the trade-off involved
with early cancer detection. Equally important, researchers need to work to
develop better estimates of the magnitude of overdiagnosis and develop
clinical strategies to help minimize it." [Accessed April 28, 2010].
Available at:
http://jnci.oxfordjournals.org/cgi/content/abstract/djq099v1.
Elisabeth Bumiller. We Have Met the Enemy and He Is PowerPoint. The
New York Times. April 26, 2010. Excerpt: "Like an insurgency, PowerPoint
has crept into the daily lives of military commanders and reached the level of
near obsession. The amount of time expended on PowerPoint, the Microsoft
presentation program of computer-generated charts, graphs and bullet points,
has made it a running joke in the Pentagon and in Iraq and Afghanistan."
[Accessed April 27, 2010]. Available at:
http://www.nytimes.com/2010/04/27/world/27powerpoint.html.
Rip Stauffer. Some Problems with Attribute Charts. Quality Digest.
Excerpt: "While p- and np- charts can be very useful, and I highly
recommend them when the conditions are correct, they aren't always the best
charts to use, and should be used with some caution. There are a few inherent
problems that seem to crop up a lot. This article will illustrate a couple of
the foibles observed over many years of wrangling with these interesting
charts." [Accessed April 5, 2010]. Available at:
http://www.qualitydigest.com/inside/quality-insider-article/some-problems-attribute-charts.html.
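The p-chart that Stauffer discusses uses the conventional 3-sigma binomial limits. One inherent problem of the kind he alludes to is easy to show with illustrative numbers of my choosing: for small subgroup sizes or low defect rates, the lower control limit goes negative and must be truncated to zero, so the chart can never signal an improvement:

```python
import math

# Conventional 3-sigma limits for a p-chart:
#   pbar +/- 3 * sqrt(pbar * (1 - pbar) / n)
# With small n or extreme pbar, the lower limit is negative and is
# truncated to zero, weakening the chart's ability to detect improvement.
def p_chart_limits(pbar, n):
    half_width = 3 * math.sqrt(pbar * (1 - pbar) / n)
    return max(0.0, pbar - half_width), pbar + half_width

print(p_chart_limits(0.05, 50))    # lower limit truncated at 0
print(p_chart_limits(0.05, 500))   # lower limit now positive
```

At pbar = 0.05 a subgroup size of 50 yields a lower limit of zero, while n = 500 gives a usable lower limit near 0.02; subgroup size matters as much as the chart choice.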
March 2010
Jeremy Genovese. The Ten Percent Solution. Anatomy of an Education Myth.
Excerpt: "For many years, versions of a claim that students remember '10% of
what they read, 20% of what they hear, 30% of what they see, 50% of what they
see and hear, and 90% of what they do' have been widely circulated among
educators. The source of this claim, however, is unknown and its validity is
questionable. It is an educational urban legend that suggests a willingness to
accept assertions about instructional strategies without empirical support."
[Accessed March 25, 2010]. Available at:
http://www.skeptic.com/eskeptic/10-03-24/#feature.
Don Zimmerman. Devilish Dictionary for Statisticians.
Description: This webpage offers some irreverent definitions of statistical
terms, akin to Ambrose Bierce's The Devil's Dictionary. They are all very
cynical and very funny. Here's an example: "Sample--a rag-tag, bob-tailed
bunch of atypical misfits who have volunteered to participate in an
experiment." [Accessed March 25, 2010]. Available at:
http://mypage.direct.ca/z/zimmerma/devilsdictionary.htm.
Brian L. Joiner, Sue Reynard, Yukihiro Ando. Fourth generation
management. McGraw-Hill Professional; 1994. Excerpt: "I knew that it was
important to find better ways to do things and to eliminate waste and
inefficiencies; that data could shed light on murky situations; that people
needed to work together. But it took another 20 years working with large
companies and small, with government, service, and manufacturing
organizations, with top managers, with operators on the shop floor, before I
had a good understanding of how all these pieces fit into a system of
management that brings rapid learning and rapid improvement. It's a system
I've come to call 4th Generation Management." Available at:
http://books.google.com/books?id=E99OVbYUmhEC.
February 2010
Julie Rehmeyer. Florence Nightingale: The Passionate Statistician.
Science News. 2008. Excerpt: "When Florence Nightingale
arrived at a British hospital in Turkey during the Crimean War, she found a
nightmare of misery and chaos. Men lay crowded next to each other in endless
corridors. The air reeked from the cesspool that lay just beneath the hospital
floor. There was little food and fewer basic supplies. By the time Nightingale
left Turkey after the war ended in July 1856, the hospitals were well-run and
efficient, with mortality rates no greater than civilian hospitals in England,
and Nightingale had earned a reputation as an icon of Victorian women. Her
later and less well-known work, however, saved far more lives. She brought
about fundamental change in the British military medical system, preventing
any such future calamities. To do it, she pioneered a brand-new method for
bringing about social change: applied statistics." [Accessed February 23,
2010]. Available at:
http://www.sciencenews.org/view/generic/id/38937/title/Math_Trek__Florence_Nightingale_The_passionate_statistician.
Gareth Watts, Splunk Inc. jQuery Sparklines. Excerpt: "This
jQuery plugin generates sparklines (small inline charts) directly in the
browser using data supplied either inline in the HTML, or via javascript. The
plugin is compatible with most modern browsers and has been tested with
Firefox 2+, Safari 3+, Opera 9, Google Chrome and Internet Explorer 6, 7 & 8.
Each example displayed below takes just 1 line of HTML or javascript to
generate. The plugin was written by Gareth Watts for Splunk Inc and released
under the New BSD License." [Accessed February 10, 2010]. Available at:
http://omnipotent.net/jquery.sparkline/.
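The cited plugin itself is jQuery/JavaScript, but the sparkline idea is language-agnostic: scale a series into a fixed small range and render one mark per value. As an illustrative sketch (my own, unrelated to the plugin's API), a text sparkline can be built from Unicode block characters:

```python
# A sparkline is a small inline chart. This sketch renders one from a list
# of numbers by mapping each value onto eight Unicode block heights.
BARS = "▁▂▃▄▅▆▇█"

def sparkline(values):
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero for flat series
    return "".join(
        BARS[int((v - lo) / span * (len(BARS) - 1))] for v in values
    )

print(sparkline([1, 5, 22, 13, 53, 12, 7, 2]))
```

The same min-max scaling underlies graphical sparklines; the plugin simply draws pixels on a canvas instead of picking characters.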
Douglas G Altman. Confidence intervals for the number needed to treat.
BMJ. 1998;317(7168):1309-1312. Excerpt: "The number needed to treat is a
useful way of reporting results of randomised clinical trials. When the
difference between the two treatments is not statistically significant, the
confidence interval for the number needed to treat is difficult to describe.
Sensible confidence intervals can always be constructed for the number needed
to treat. Confidence intervals should be quoted whenever a number needed to
treat value is given" [Accessed February 8, 2010]. Available at:
http://www.bmj.com/cgi/content/full/317/7168/1309.
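Altman's approach works by computing the confidence interval for the absolute risk reduction (ARR) and inverting its limits, since NNT = 1/ARR; when the ARR interval includes zero, the NNT interval passes through infinity (from number needed to treat for benefit out to infinity, and back from infinity to number needed to harm). A sketch with illustrative event rates of my choosing:

```python
import math

# Invert the normal-approximation CI for the absolute risk reduction (ARR)
# to get the CI for the number needed to treat (NNT), following Altman's
# logic. Event rates and sample sizes below are illustrative only.
def arr_ci(p_control, n_control, p_treated, n_treated, z=1.96):
    arr = p_control - p_treated
    se = math.sqrt(p_control * (1 - p_control) / n_control
                   + p_treated * (1 - p_treated) / n_treated)
    return arr - z * se, arr, arr + z * se

lo, arr, hi = arr_ci(0.20, 150, 0.10, 150)
print("NNT =", round(1 / arr, 1))
print("95% CI for ARR:", round(lo, 3), "to", round(hi, 3))
if lo > 0 or hi < 0:
    print("95% CI for NNT:", round(1 / hi, 1), "to", round(1 / lo, 1))
else:
    print("CI passes through infinity: NNTH", round(-1 / lo, 1),
          "to infinity to NNTB", round(1 / hi, 1))
```

With these numbers the ARR is 0.10, so the NNT is 10, and because the ARR interval excludes zero the NNT interval is the ordinary finite one (1/hi to 1/lo). The infinite-interval branch is exactly the "difficult to describe" case the paper resolves.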
J. A C Sterne, I. R White, J. B Carlin, et al. Multiple imputation for
missing data in epidemiological and clinical research: potential and pitfalls.
BMJ. 2009;338(jun29 1):b2393-b2393. Excerpt: "Missing data are unavoidable
in epidemiological and clinical research but their potential to undermine the
validity of research results has often been overlooked in the medical
literature. This is partly because statistical methods that can tackle
problems arising from missing data have, until recently, not been readily
accessible to medical researchers. However, multiple imputation, a relatively
flexible, general-purpose approach to dealing with missing data, is now
available in standard statistical software, making it possible to handle
missing data semiroutinely. Results based on this computationally intensive
method are increasingly reported, but it needs to be applied carefully to
avoid misleading conclusions." [Accessed February 8, 2010]. Available at:
http://www.bmj.com/cgi/data/bmj.b2393/DC1/1.
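The combination step of multiple imputation is standardized as Rubin's rules: analyse each of the m imputed data sets, average the m estimates, and combine the within-imputation variance W with the between-imputation variance B as W + (1 + 1/m)B. A minimal sketch with made-up estimates and variances:

```python
import statistics

# Rubin's rules for pooling results from m multiply-imputed data sets:
#   pooled estimate = mean of the m estimates
#   total variance  = W + (1 + 1/m) * B
# where W is the mean within-imputation variance and B is the sample
# variance of the estimates across imputations.
def pool(estimates, variances):
    m = len(estimates)
    qbar = statistics.fmean(estimates)
    w = statistics.fmean(variances)
    b = statistics.variance(estimates)
    total = w + (1 + 1 / m) * b
    return qbar, total

est, var = pool([1.8, 2.1, 2.0, 1.9, 2.2], [0.04, 0.05, 0.04, 0.05, 0.04])
print(round(est, 2), round(var, 4))
```

The (1 + 1/m)B term is what distinguishes multiple from single imputation: it propagates the uncertainty about the missing values themselves, which single-imputation methods such as LOCF ignore.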
Anonymous. Statistical Graphics and more. Excerpt: "Statistical
Graphics, Data Visualization, Visual Analytics, Data Analysis, Data Mining,
User Interfaces - you name it" [Accessed February 5, 2010]. Available at:
http://www.theusrus.de/blog/.
Jon Peck. SPSS Inside-Out | Tips & Tricks for Statisticians to Work
Better, Smarter, and Faster. Excerpt: "Welcome to the SPSS Inside-Out
blog - Tips & Tricks for Statisticians to Work Better, Smarter, and Faster."
[Accessed February 5, 2010]. Available at:
http://insideout.spss.com/.
Edzard Ernst. How Much of CAM is Based on Research Evidence? eCAM.
2009:nep044. Abstract: "The aim of this article is to provide a preliminary
estimate of how much CAM is evidence-based. For this purpose, I calculated the
percentage of 685 treatment/condition pairings evaluated in the 'Desktop Guide
to Complementary and Alternative Medicine' which were supported by sound data.
The resulting figure was 7.4%. For a range of reasons, it might be a gross
over-estimate. Further investigations into this subject are required to arrive
at more representative figures." [Accessed February 4, 2010]. Available
at:
http://ecam.oxfordjournals.org/cgi/content/abstract/nep044v1.
Doug Smith. But who's counting? The million-billion mistake is among
the most common in journalism. But why? Excerpt: "The difference
between a million and a billion is a number so vast that it would seem nearly
impossible to confuse the two. Take pennies. At the website of the Mega Penny
Project, you can see that a million pennies stack up to be about the size of a
filing cabinet. A billion would be about the size of five school buses. Or
take real estate. A home in a nice part of Los Angeles might cost a million
dollars. A billion dollars would buy the whole neighborhood. But journalists
can't seem to keep the two numbers straight. Committed as we are to getting
the smallest details right, we seem hopelessly prone to writing "million"
when, in fact, we mean "billion."" [Accessed February 4, 2010]. Available
at:
http://www.latimes.com/news/opinion/commentary/la-oe-smith31-2010jan31,0,2185811.story.
Clinical Evidence. How much do we know? Excerpt: "So what can
Clinical Evidence tell us about the state of our current knowledge? What
proportion of commonly used treatments are supported by good evidence, what
proportion should not be used or used only with caution, and how big are the
gaps in our knowledge? Of around 2500 treatments covered 13% are rated as
beneficial, 23% likely to be beneficial, 8% as trade off between benefits and
harms, 6% unlikely to be beneficial, 4% likely to be ineffective or harmful,
and 46%, the largest proportion, as unknown effectiveness (see figure 1)."
[Accessed February 4, 2010]. Available at:
http://clinicalevidence.bmj.com/ceweb/about/knowledge.jsp.
Cochrane Collaboration. The Cochrane Collaboration estimates that only
"10% to 35% of medical care is based on RCTs". On what information is this
estimate based? Excerpt: "The Cochrane Collaboration has not actually
conducted research to determine this estimate; it is possible that the
estimate of 10-35% comes from the following passage in a chapter by Kerr L
White entitled 'Archie Cochrane's legacy: an American perspective' in the book
'Non-random Reflections on Health Services Research: on the 25th anniversary
of Archie Cochrane's Effectiveness and Efficiency'. This book (published by
the BMJ Publishing Group) was edited by Alan Maynard and Iain Chalmers. Iain
was formerly Director of the UK Cochrane Centre, and the driving force behind
the establishment of The Cochrane Collaboration; he knew Archie Cochrane
well." [Accessed February 4, 2010]. Available at:
http://www.cochrane.org/docs/faq.htm#q20.
John P. A. Ioannidis. Contradicted and Initially Stronger Effects in
Highly Cited Clinical Research. JAMA. 2005;294(2):218-228. Abstract:
"Context: Controversy and uncertainty ensue when the results of clinical
research on the effectiveness of interventions are subsequently contradicted.
Controversies are most prominent when high-impact research is involved.
Objectives: To understand how frequently highly cited studies are contradicted
or find effects that are stronger than in other similar studies and to discern
whether specific characteristics are associated with such refutation over
time. Design: All original clinical research studies published in 3 major
general clinical journals or high-impact-factor specialty journals in
1990-2003 and cited more than 1000 times in the literature were examined. Main
Outcome Measure: The results of highly cited articles were compared against
subsequent studies of comparable or larger sample size and similar or better
controlled designs. The same analysis was also performed comparatively for
matched studies that were not so highly cited. Results: Of 49 highly cited
original clinical research studies, 45 claimed that the intervention was
effective. Of these, 7 (16%) were contradicted by subsequent studies, 7 others
(16%) had found effects that were stronger than those of subsequent studies,
20 (44%) were replicated, and 11 (24%) remained largely unchallenged. Five of
6 highly-cited nonrandomized studies had been contradicted or had found
stronger effects vs 9 of 39 randomized controlled trials (P = .008). Among
randomized trials, studies with contradicted or stronger effects were smaller
(P = .009) than replicated or unchallenged studies although there was no
statistically significant difference in their early or overall citation
impact. Matched control studies did not have a significantly different share
of refuted results than highly cited studies, but they included more studies
with "negative" results. Conclusions: Contradiction and initially stronger
effects are not unusual in highly cited research of clinical interventions and
their outcomes. The extent to which high citations may provoke contradictions
and vice versa needs more study. Controversies are most common with highly
cited nonrandomized studies, but even the most highly cited randomized trials
may be challenged and refuted over time, especially small ones." [Accessed
February 4, 2010]. Available at:
http://jama.ama-assn.org/cgi/content/abstract/294/2/218.
Ann Evensen, Rob Sanson-Fisher, Catherine D'Este, Michael Fitzgerald.
Trends in publications regarding evidence practice gaps: A literature review.
Implementation Science. 2010;5(1):11. Abstract: "BACKGROUND: Well-designed
trials of strategies to improve adherence to clinical practice guidelines are
needed to close persistent evidence-practice gaps. We studied how the number
of these trials is changing with time, and to what extent physicians are
participating in such trials. METHODS: This is a literature-based study of
trends in evidence-practice gap publications over 10 years and participation
of clinicians in intervention trials to narrow evidence-practice gaps. We
chose nine evidence-based guidelines and identified relevant publications in
the PubMed database from January 1998 to December 2007. We coded these
publications by study type (intervention versus non-intervention studies). We
further subdivided intervention studies into those for clinicians and those
for patients. Data were analyzed to determine if observed trends were
statistically significant. RESULTS: We identified 1,151 publications that
discussed evidence-practice gaps in nine topic areas. There were 169
intervention studies that were designed to improve adherence to
well-established clinical guidelines, averaging 1.9 studies per year per topic
area. Twenty-eight publications (34%; 95% CI: 24% - 45%) reported
interventions intended for clinicians or health systems that met Effective
Practice and Organization of Care (EPOC) criteria for adequate design. The
median consent rate of physicians asked to participate in these well-designed
studies was 60% (95% CI, 25% to 69%). CONCLUSIONS: We evaluated research
publications for nine evidence-practice gaps, and identified small numbers of
well-designed intervention trials and low rates of physician participation in
these trials." [Accessed February 4, 2010]. Available at:
http://www.implementationscience.com/content/5/1/11.
Glen Spielmans, Peter Parry. From Evidence-based Medicine to
Marketing-based Medicine: Evidence from Internal Industry Documents.
Journal of Bioethical Inquiry. Abstract: "While much excitement has been
generated surrounding evidence-based medicine, internal documents from the
pharmaceutical industry suggest that the publicly available evidence base may
not accurately represent the underlying data regarding its products. The
industry and its associated medical communication firms state that
publications in the medical literature primarily serve marketing interests.
Suppression and spinning of negative data and ghostwriting have emerged as
tools to help manage medical journal publications to best suit product sales,
while disease mongering and market segmentation of physicians are also used to
efficiently maximize profits. We propose that while evidence-based medicine is
a noble ideal, marketing-based medicine is the current reality." [Accessed
February 3, 2010]. Available at:
http://freepdfhosting.com/ebaef05bfe.pdf.
Wei-Jiun Lin, Huey-Miin Hsueh, James J. Chen. Power and sample size
estimation in microarray studies. BMC Bioinformatics. 2010;11(1):48.
Abstract: "BACKGROUND: Before conducting a microarray experiment, one
important issue that needs to be determined is the number of arrays required
in order to have adequate power to identify differentially expressed genes.
This paper discusses some crucial issues in the problem formulation, parameter
specifications, and approaches that are commonly proposed for sample size
estimation in microarray experiments. Common methods for sample size
estimation are formulated as the minimum sample size necessary to achieve a
specified sensitivity (proportion of detected truly differentially expressed
genes) on average at a specified false discovery rate (FDR) level and
specified expected proportion (pi1) of the truly differentially expressed
genes in the array. Unfortunately, the probability of detecting the specified
sensitivity in such a formulation can be low. We formulate the sample size
problem as the number of arrays needed to achieve a specified sensitivity with
95% probability at the specified significance level. A permutation method
using a small pilot dataset to estimate sample size is proposed. This method
accounts for correlation and effect size heterogeneity among genes. RESULTS: A
sample size estimate based on the common formulation, to achieve the desired
sensitivity on average, can be calculated using a univariate method without
taking the correlation among genes into consideration. This formulation of
sample size problem is inadequate because the probability of detecting the
specified sensitivity can be lower than 50%. On the other hand, the needed
sample size calculated by the proposed permutation method will ensure
detecting at least the desired sensitivity with 95% probability. The method is
shown to perform well for a real example dataset using a small pilot dataset
with 4-6 samples per group. CONCLUSIONS: We recommend that the sample size
problem should be formulated to detect a specified proportion of
differentially expressed genes with 95% probability. This formulation ensures
finding the desired proportion of true positives with high probability. The
proposed permutation method takes the correlation structure and effect size
heterogeneity into consideration and works well using only a small pilot
dataset." [Accessed February 1, 2010]. Available at:
http://www.biomedcentral.com/1471-2105/11/48.
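The distinction the authors draw, between achieving a target sensitivity on average and achieving it with 95% probability, can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not in the paper (independent genes, known unit variance, z-tests, and Benjamini-Hochberg FDR control instead of the authors' permutation method); the function names and default parameters are invented for illustration:

```python
import math
import random

def bh_reject(pvals, q):
    """Benjamini-Hochberg step-up: indices rejected at FDR level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            k_max = rank
    return set(order[:k_max])

def simulate_sensitivity(n_per_group, m_genes=1000, pi1=0.05, effect=1.0,
                         fdr=0.05, n_sim=200, target=0.8, seed=1):
    """Return (mean sensitivity, P(sensitivity >= target)) over n_sim
    simulated two-group experiments, using z-tests on independent genes."""
    rng = random.Random(seed)
    n_de = int(pi1 * m_genes)                      # number of truly DE genes
    shift = effect * math.sqrt(n_per_group / 2.0)  # noncentrality of the z statistic
    hits, sens_sum = 0, 0.0
    for _ in range(n_sim):
        # Two-sided p-values; genes 0..n_de-1 are truly differentially expressed.
        pvals = [math.erfc(abs(rng.gauss(shift if g < n_de else 0.0, 1.0))
                           / math.sqrt(2)) for g in range(m_genes)]
        tp = sum(1 for g in bh_reject(pvals, fdr) if g < n_de)
        sens = tp / n_de
        sens_sum += sens
        hits += sens >= target
    return sens_sum / n_sim, hits / n_sim
```

Running this for increasing n_per_group shows the gap the paper addresses: a design whose mean sensitivity just reaches the target can still have a probability of actually attaining that sensitivity well below 95%.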
January 2010
Dariusz Leszczynski, Zhengping Xu. Mobile phone radiation health risk
controversy: the reliability and sufficiency of science behind the safety
standards. Health Research Policy and Systems. 2010;8(1):2. Abstract:
"There is ongoing discussion about whether mobile phone radiation causes any
health effects. The International Commission on Non-Ionizing Radiation
Protection, the International Committee on Electromagnetic Safety and the
World Health Organization assure that there is no proven health risk and
that the present safety limits protect all mobile phone users. However, based
on the available scientific evidence, the situation is not as clear. The
majority of the evidence comes from in vitro laboratory studies and is of very
limited use for determining health risk. Animal toxicology studies are
inadequate because it is not possible to "overdose" microwave radiation, as it
is done with chemical agents, due to simultaneous induction of heating
side-effects. There is a lack of human volunteer studies that would, in an
unbiased way, demonstrate whether the human body responds at all to mobile
phone radiation. Finally, the epidemiological evidence is insufficient due to, among
others, selection and misclassification bias and the low sensitivity of this
approach in detection of health risk within the population. This indicates
that the presently available scientific evidence is insufficient to prove
reliability of the current safety standards. Therefore, we recommend using
precaution when dealing with mobile phones and, whenever possible and
feasible, to limit body exposure to this radiation. Continuation of the
research on mobile phone radiation effects is needed in order to improve the
basis and the reliability of the safety standards." [Accessed February 1,
2010]. Available at:
http://www.health-policy-systems.com/content/8/1/2.
O Thomas, L Thabane, J Douketis, et al. Industry funding and the
reporting quality of large long-term weight loss trials. Int J Obes.
2008;32(10):1531-1536. Description: This article does not have full free
text available, so I can only comment on the abstract. It appears that
industry funded studies tend to adhere more closely to the CONSORT reporting
guidelines. I suspect that peer-reviewers are more cautious with industry
funded studies and demand more detailed reporting of results. The conclusion
in the abstract, "Our findings suggest that the efforts to improve reporting
quality be directed to all obesity RCTs, irrespective of funding source,"
seems to suggest that peer reviewers should hold unfunded studies to the same
standards as funded ones. [Accessed January 26, 2010].
Available at:
http://dx.doi.org/10.1038/ijo.2008.137.
Harriette G. C. Van Spall, Andrew Toren, Alex Kiss, Robert A. Fowler.
Eligibility Criteria of Randomized Controlled Trials Published in High-Impact
General Medical Journals: A Systematic Sampling Review. JAMA.
2007;297(11):1233-1240. Abstract: "Context: Selective eligibility criteria
of randomized controlled trials (RCTs) are vital to trial feasibility and
internal validity. However, the exclusion of certain patient populations may
lead to impaired generalizability of results. Objective: To determine the
nature and extent of exclusion criteria among RCTs published in major medical
journals and the contribution of exclusion criteria to the representation of
certain patient populations. Data Sources and Study Selection: The MEDLINE
database was searched for RCTs published between 1994 and 2006 in certain
general medical journals with a high impact factor. Of 4827 articles, 283 were
selected using a systematic sampling technique. Data Extraction: Trial characteristics and
the details regarding exclusions were extracted independently. All exclusion
criteria were graded independently and in duplicate as either strongly
justified, potentially justified, or poorly justified according to previously
developed and pilot-tested guidelines. Data Synthesis: Common medical
conditions formed the basis for exclusion in 81.3% of trials. Patients were
excluded due to age in 72.1% of all trials (60.1% in pediatric populations and
38.5% in older adults). Individuals receiving commonly prescribed medications
were excluded in 54.1% of trials. Conditions related to female sex were
grounds for exclusion in 39.2% of trials. Of all exclusion criteria, only
47.2% were graded as strongly justified in the context of the specific RCT.
Exclusion criteria were not reported in 12.0% of trials. Multivariable
analyses revealed independent associations between the total number of
exclusion criteria and drug intervention trials (risk ratio, 1.35; 95%
confidence interval, 1.11-1.65; P = .003) and between the total number of
exclusion criteria and multicenter trials (risk ratio, 1.26; 95% confidence
interval, 1.06-1.52; P = .009). Industry-sponsored trials were more likely to
exclude individuals due to concomitant medication use, medical comorbidities,
and age. Drug intervention trials were more likely to exclude individuals due
to concomitant medication use, medical comorbidities, female sex, and
socioeconomic status. Among such trials, justification for exclusions related
to concomitant medication use and comorbidities were more likely to be poorly
justified. Conclusions: The RCTs published in major medical journals do not
always clearly report exclusion criteria. Women, children, the elderly, and
those with common medical conditions are frequently excluded from RCTs. Trials
with multiple centers and those involving drug interventions are most likely
to have extensive exclusions. Such exclusions may impair the generalizability
of RCT results. These findings highlight a need for careful consideration and
transparent reporting and justification of exclusion criteria in clinical
trials." [Accessed January 15, 2010]. Available at:
http://jama.ama-assn.org/cgi/content/abstract/297/11/1233.
David Leonhardt. Making Health Care Better. The New York Times.
November 8, 2009. Description: This article profiles Brent James, chief
quality officer at Intermountain Health Care, and his pioneering efforts to
rigorously apply evidence based medicine principles. It highlights some of the
quality improvement initiatives at Intermountain and documents the resistance
to change among many doctors at Intermountain. [Accessed January 14,
2010]. Available at:
http://www.nytimes.com/2009/11/08/magazine/08Healthcare-t.html.
Patrick Burns. R Relative to Statistical Packages: Comment 1 on
Technical Report Number 1 (Version 1.0) Strategically using General Purpose
Statistics Packages: A Look at Stata, SAS and SPSS. Excerpt: "The
technical report Strategically using General Purpose Statistics Packages: A
Look at Stata, SAS and SPSS focuses on comparing strengths and weaknesses of
SAS, SPSS and Stata. There is a section on R, which some have suspected damns
R with faint praise. In particular, R is characterized as hard to learn.
Finally there are sections on a number of very specialized pieces of
statistical software. The primary purpose of this comment is to provide an
alternative view of the role that R has in the realm of statistical software."
[Accessed January 14, 2010]. Available at:
http://www.ats.ucla.edu/stat/technicalreports/Number1/R_relative_statpack.pdf.
Michael N. Mitchell. Strategically using General Purpose Statistics
Packages: A Look at Stata, SAS and SPSS. Abstract: "This report
describes my experiences using general purpose statistical software over 20
years and for over 11 years as a statistical consultant helping thousands of
UCLA researchers. I hope that this information will help you make strategic
decisions about statistical software: the software you choose to learn, and
the software you choose to use for analyzing your research data."
[Accessed January 14, 2010]. Available at:
http://www.ats.ucla.edu/stat/technicalreports/number1_editedFeb_2_2007/ucla_ATSstat_tr1_1.1_0207.pdf.
G. David Garson. StatNotes: Topics in Multivariate Analysis.
Description: This is a general purpose textbook, written in discrete sections
in html format. It covers more than just multivariate analysis. [Accessed
January 14, 2010]. Available at:
http://faculty.chass.ncsu.edu/garson/PA765/statnote.htm.
Robert Muenchen. R-SAS-SPSS Add-on Module Comparison. Excerpt:
"R has over 3,000 add-on packages, many containing multiple procedures, so it
can do most of the things that SAS and SPSS can do and quite a bit more. The
table below focuses only on SAS and SPSS products and which of them have
counterparts in R. As a result, some categories are extremely broad (e.g.
regression) while others are quite narrow (e.g. conjoint analysis). This table
does not contain the hundreds of R packages that have no counterparts in the
form of SAS or SPSS products. There are many important topics (e.g. mixed
models) offered by all three that are not listed because neither SAS Institute
nor IBM's SPSS Company sells a product focused just on that." [Accessed
January 14, 2010]. Available at:
http://r4stats.com/add-on-modules.
G. David Garson. Reliability Analysis: Statnotes, from North Carolina
State University, Public Administration Program. Excerpt: "Researchers
must demonstrate instruments are reliable since without reliability, research
results using the instrument are not replicable, and replicability is
fundamental to the scientific method. Reliability is the correlation of an
item, scale, or instrument with a hypothetical one which truly measures what
it is supposed to. Since the true instrument is not available, reliability is
estimated in one of four ways: 1. Internal consistency: Estimation based on
the correlation among the variables comprising the set (typically, Cronbach's
alpha). 2. Split-half reliability: Estimation based on the correlation of two
equivalent forms of the scale (typically, the Spearman-Brown coefficient). 3.
Test-retest reliability: Estimation based on the correlation between two (or
more) administrations of the same item, scale, or instrument for different
times, locations, or populations, when the two administrations do not differ
on other relevant variables (typically, the Spearman-Brown coefficient). 4.
Inter-rater reliability: Estimation based on the correlation of scores
between/among two or more raters who rate the same item, scale, or instrument
(typically, intraclass correlation, of which there are six types discussed
below). These four reliability estimation methods are not necessarily mutually
exclusive, nor need they lead to the same results. All reliability
coefficients are forms of correlation coefficients, but there are multiple
types discussed below, representing different meanings of reliability, and
more than one might be used in a single research setting." [Accessed January 1,
2010]. Available at:
http://faculty.chass.ncsu.edu/garson/PA765/reliab.htm.
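The first estimation method listed, internal consistency via Cronbach's alpha, is simple to compute directly from its definition. A minimal sketch; the 3-item scale data below are made up purely for illustration:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists.

    items[i][j] is respondent j's score on item i:
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    totals = [sum(scores) for scores in zip(*items)]   # each respondent's total
    item_var_sum = sum(pvariance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / pvariance(totals))

# Hypothetical 3-item scale answered by 5 respondents (illustrative only).
scale = [
    [4, 3, 5, 2, 4],
    [4, 2, 5, 3, 4],
    [3, 3, 4, 2, 5],
]
alpha = cronbach_alpha(scale)
```

Perfectly correlated items give alpha = 1, while alpha near or below zero signals that the items do not measure a common construct; consistent with the Spearman-Brown logic mentioned for split-half estimates, adding parallel items raises alpha.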