Well received in its first edition, Survival Analysis: A Practical Approach is completely revised to provide an accessible and practical guide to survival analysis techniques in diverse environments.
Illustrated with many authentic examples, the book introduces basic statistical concepts and methods to construct survival curves, later developing them to encompass more specialised and complex models.
During the years since the first edition there have been several new topics that have come to the fore and many new applications. Parallel developments in computer software programmes, used to implement these methodologies, are relied upon throughout the text to bring it up to date.
David Machin: Division of Clinical Trials and Epidemiological Sciences, National Cancer Centre, Singapore; UK Children’s Cancer Study Group, University of Leicester, UK; Institute of General Practice and Primary Care, School of Health and Related Sciences, University of Sheffield, UK
Author of recently published Design of Medical Studies for Medical Research and editor of Textbook of Clinical Trials among other Wiley titles.
Yin Bun Cheung: MRC Tropical Epidemiology Group, London School of Hygiene and Tropical Medicine, UK; Division of Clinical Trials and Epidemiological Sciences, National Cancer Centre, Singapore.
Mahesh Parmar: Cancer Trials Division, MRC Clinical Trials Unit, London, UK.
Well received in its first edition, Survival Analysis: A Practical Approach is revised, keeping to its original underlying aim, to provide an accessible and practical guide to survival analysis techniques in diverse environments.
Designed with the practitioner in mind, this book is aimed at medical statisticians, epidemiologists, clinicians and healthcare professionals, as well as students studying survival analysis as part of their graduate or postgraduate courses, and presents the subject in a user-friendly way.
Summary
In this chapter we introduce some examples of the use of survival methods in a selection of different areas and describe the concepts necessary to define survival time. The chapter also includes a review of some basic statistical ideas including the Normal distribution, hypothesis testing and the use of confidence intervals, the $\chi^2$ and likelihood ratio tests and some other methods useful in survival analysis, including the median survival time and the hazard ratio. The difference between clinical and statistical significance is highlighted. The chapter indicates some of the computing packages that can be used to analyse survival data and emphasises that the database within which the study data is managed and stored must interface easily with these.
1.1 INTRODUCTION
There are many examples in medicine where a survival time measurement is appropriate. For example, such measurements may include the time a kidney graft remains patent, the time a patient with colorectal cancer survives once the tumour has been removed by surgery, the time a patient with osteoarthritis is pain-free following acupuncture treatment, the time a woman remains without a pregnancy whilst using a particular hormonal contraceptive and the time a pressure sore takes to heal. All these times are triggered by an initial event: a kidney graft, a surgical intervention, commencement of acupuncture therapy, first use of a contraceptive or identification of the pressure sore. These initial events are followed by a subsequent event: graft failure, death, return of pain, pregnancy or healing of the sore. The time between such events is known as the 'survival time'. The term survival is used because an early use of the associated statistical techniques arose from the insurance industry, which was developing methods of costing insurance premiums. The industry needed to know the risk, or average survival time, associated with a particular type of client. This 'risk' was based on that of a large group of individuals with a particular age, gender and possibly other characteristics; the individual was then given the risk for his or her group for the calculation of their insurance premium.
There is one major difference between 'survival' data and other types of numeric continuous data: the time to the event occurring is not necessarily observed in all subjects. Thus in the above examples we may not observe for all subjects the events of graft failure (the graft remains functional indefinitely), death (the patient survives for a very long time), return of pain (the patient remains pain-free thereafter), pregnancy (the woman never conceives) or healing of the sore (the sore does not heal), respectively. Such non-observed events are termed 'censored' but are quite different from 'missing' data items.
The date of 1 March 1973 can be thought of as the 'census day', that is, the day on which the currently available data on all patients recruited to the transplant programme were collected together and summarised. Typically, as in this example, by the census day some patients will have died whilst others remain alive. The survival times of those who are still alive are termed censored survival times. Censored survival times are described in Section 2.1.
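As a minimal sketch of how a census day turns follow-up records into observed and censored survival times, the Python fragment below works through two invented patients. The entry and death dates, and the helper name survival_record, are illustrative assumptions and are not taken from the transplant programme data.

```python
from datetime import date

CENSUS_DAY = date(1973, 3, 1)  # the census day quoted in the text

def survival_record(entry, death=None, census=CENSUS_DAY):
    """Return (time_in_days, event): event is 1 for an observed death,
    0 for a survival time censored at the census day."""
    if death is not None and death <= census:
        return (death - entry).days, 1
    # Alive (or not yet known to have died) on the census day: censored.
    return (census - entry).days, 0

# Hypothetical patients: (date of entry, date of death or None)
patients = [
    (date(1972, 11, 1), date(1972, 12, 1)),  # died before the census day
    (date(1973, 1, 15), None),               # still alive on the census day
]
records = [survival_record(entry, death) for entry, death in patients]
print(records)  # [(30, 1), (45, 0)]
```

The second patient contributes 45 days of follow-up with the event flag set to 0. That is a known lower bound on the survival time, which is what distinguishes a censored value from a missing one.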
The probability of survival without transplant for patients identified as transplant candidates is shown in Figure 1.1. Details of how this probability is calculated using the Kaplan-Meier (product-limit) estimate are given in Section 2.2. By reading across from 0.5 on the vertical scale in Figure 1.1 and then vertically downwards at the point of intersection with the curve, we can say that approximately half (see Section 2.3) of such patients will die within 80 days of being selected as suitable for transplant if no transplant becomes available for them.
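The graphical reading of the median described above can be mimicked numerically. The sketch below is a plain-Python version of the product-limit calculation applied to invented data, not the data behind Figure 1.1. It takes the median as the first time at which the estimated survival probability falls to 0.5 or below, which is one common convention and may differ in detail from the definition in Section 2.3.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: (time, S(t)) at each distinct event time.
    events[i] is 1 for an observed event and 0 for a censored time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather every subject whose recorded time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve

def median_survival(curve):
    """First time at which the estimated survival is 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # the curve never falls to 0.5

# Hypothetical survival times in days; 0 marks a censored observation.
times = [5, 20, 40, 80, 80, 120, 200, 300]
events = [1, 1, 0, 1, 1, 1, 0, 0]
curve = kaplan_meier(times, events)
print(median_survival(curve))  # 80
```

Censored subjects leave the risk set without causing a step in the curve. This is why the simple "middle value" calculation cannot be used once censoring is present.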
Historically, much of survival analysis has been developed and applied in relation to cancer clinical trials in which the survival time is often measured from the date of randomisation or commencement of therapy until death. The seminal papers by Peto, Pike, Armitage et al. (1976, 1977) published in the British Journal of Cancer describing the design, conduct and analysis of cancer trials provide a landmark in the development and use of survival methods.
The method of making a formal comparison of two survival curves with the Logrank test is described in Chapter 3.
One field of application of survival studies has been in the development of methods of fertility regulation. In such applications alternative contraceptive methods either for the male or female partner are compared in prospective randomised trials. These trials usually compare the efficacy of different methods by observing how many women conceive in each group. A pregnancy is deemed a failure in this context.
Survival time methods have been used extensively in many medical fields, including trials concerned with the prevention of new cardiovascular events in patients who have had a recent myocardial infarction (Wallentin, Wilcox, Weaver et al., 2003), prevention of type 2 diabetes mellitus in those with impaired glucose tolerance (Chiasson, Josse, Gomis et al., 2002), return of post-stroke function (Mayo, Korner-Bitensky and Becker, 1991), and AIDS (Bonacini, Louie, Bzowej, et al., 2004).
1.2 DEFINING TIME
In order to perform survival analysis one must know how to define the time-to-event interval. The endpoint of the interval is relatively easy to define. In the examples in Section 1.1, the endpoints were graft failure, return of pain, pregnancy, healing of the sore, recovery of sperm function, and death. However, defining the initial events that trigger the times is sometimes a more difficult task.
INITIAL EVENTS AND THE ORIGIN OF TIME
The origin of time refers to the starting point of a time interval, when t = 0. We have mentioned the time from the surgical removal of a colorectal cancer to the death of the patient. So the initial event was surgery and t = 0 corresponds to the date of surgery. However, a quite common research situation in cancer clinical trials is that after surgery, patients are randomised into receiving one of two treatments, say two types of adjuvant chemotherapy. Should the initial event be surgery, randomisation, or the start of chemotherapy? How does one choose when there are several (starting) events that can be considered? There are no definite rules, but some considerations are as follows.
It is intuitive to consider a point in time that marks the onset of exposure to the risk of the outcome event. For example, when studying ethnic differences in mortality, birth may be taken as the initial event as one is immediately at risk of death once born. In this case, survival time is equivalent to age at death. However, in studies of hospital readmission rates, a patient cannot be readmitted until he or she is first of all discharged. Therefore the latest discharge is the initial event and marks the origin of the time interval to readmission.
In randomised trials, the initial event should usually be randomisation to treatment. Prior to randomisation, patients may have already been at risk of the outcome event (perhaps dying from their colorectal cancer before surgery can take place), but that is not relevant to the research purpose. In randomised trials the purpose is to compare the occurrences of the outcome given the different assigned interventions. As such, the patient is not considered to be at risk until he or she is randomised to receive an intervention. In the colorectal cancer trial example, time of surgery is not a proper choice for the time origin. However, some interventions may not be immediately available at the time of randomisation, for example, treatments that involve waiting for a suitable organ donor or heavily booked facilities. In these cases a patient may die after randomisation but before the assigned treatment begins. In such circumstances would the date that the treatment actually begins form a better time origin? Most randomised clinical trials follow the 'intention-to-treat' (ITT) principle. That is, the interventions are not the treatments per se, but the clinical intention is to care for the patients by the particular treatment strategies being compared. If a treatment involves a long waiting time and patients die before the treatment begins, this is the weakness of the intervention and should be reflected in the comparisons made between the treatment groups at the time of analysis. In such situations it is correct to use randomisation to mark the time origin.
DELAYED ENTRY AND GAPS IN EXPOSURE TIME
Delayed entry, or late entry to the risk set, refers to the situation when a subject becomes at risk of the outcome event at a point in time after time zero. If, for example, ethnic differences in all cause mortality are being investigated, everyone becomes at risk of death at the time of birth (t = 0) so there is no delayed entry. An example of delayed entry is a study of first birth where no girl is at risk until menarche, the age of which varies from individual to individual. A common way of defining time in such circumstances is to 'reset the clock' to time zero at the age of menarche and time to first birth is counted from there. This is the approach used in Section 2.1 in the context of clinical trials. There is an alternative approach to defining time in the presence of delayed entry, which tends to be more useful in epidemiological studies and will be discussed in Section 7.4.
In certain circumstances a subject may be within the risk set for a period of time, leave the risk set, only to rejoin the risk set again. For example, a cohort may be recruited comprising all of the workforce at a particular factory at one point in time and these workers are then monitored for their exposure to a potential hazard within the work place. Some may leave the risk set later, perhaps for maternity leave, but will subsequently return once the baby is born. Such intermittent absences from the risk set leave gaps in the exposure time. If they are ignored, the event rate will be incorrectly estimated. The technique to handle these gaps in the 'at risk' time is similar to that for delayed entry and will also be discussed in Section 7.4.
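To illustrate why ignoring gaps distorts the event rate, the sketch below stores each subject's time at risk as a list of (start, stop) intervals and compares the rate per person-year with and without the gaps. The three workers and their intervals are invented, and this interval representation is only one common way of recording exposure. The treatment in Section 7.4 may differ.

```python
def time_at_risk(intervals):
    """Total length of the (start, stop) intervals during which a subject is at risk."""
    return sum(stop - start for start, stop in intervals)

# Hypothetical workers; time is in years since the cohort was recruited.
workers = [
    {"at_risk": [(0.0, 5.0)], "event": 1},              # continuously at risk
    {"at_risk": [(0.0, 2.0), (3.0, 5.0)], "event": 0},  # one-year gap, e.g. leave
    {"at_risk": [(1.0, 5.0)], "event": 1},              # delayed entry at year 1
]

events = sum(w["event"] for w in workers)
correct_time = sum(time_at_risk(w["at_risk"]) for w in workers)
# Ignoring gaps and delayed entry: count from time 0 to each subject's last stop.
naive_time = sum(w["at_risk"][-1][1] for w in workers)

print(correct_time, naive_time)  # 13.0 15.0
print(events / correct_time)     # rate per person-year, gaps respected
print(events / naive_time)       # rate with gaps ignored
```

With these figures the naive calculation credits two extra person-years during which nobody was actually at risk, so the estimated rate comes out too low.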
FAILURE TIME MUST BE LARGER THAN ZERO
A logically valid survival time must be larger than zero. If survival time is measured in days and both the initial and outcome events take place on the same day, the survival time may be recorded as zero. For instance, in Table 1.1, a patient who did not receive a transplant had X=0. Such values should usually be interpreted as a survival time smaller than one but larger than zero. Hence one may consider replacing the zero survival time value with a value of 0.5 days. In circumstances when time intervals are wide, say months, and the exact times of the initial and outcome events are not recorded, then one may assume that, on average, the initial event takes place at the middle of the time unit, and that the outcome event takes place at the middle between the time of the initial event and the end of that time unit. In this case, one might replace the zero survival time with 0.25. Other small values may also be considered depending on the context. The problem of survival time recorded as zero also highlights the importance of measuring survival time as precisely as practicable. Computer packages generally refuse to analyse observations with negative failure time (usually a data error) or those equal to zero.
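The replacement rule described above is simple to apply before analysis. The following sketch uses the hypothetical helper adjust_zero_times and invented survival times. The default replacement of 0.5 follows the suggestion for times measured in days, and the last three lines reproduce the midpoint argument that leads to 0.25 for wide time units.

```python
def adjust_zero_times(times, replacement=0.5):
    """Replace survival times recorded as zero with a small positive value.
    Negative times are treated as data errors and rejected."""
    adjusted = []
    for t in times:
        if t < 0:
            raise ValueError(f"negative survival time: {t}")
        adjusted.append(replacement if t == 0 else t)
    return adjusted

days = [0, 3, 17, 0, 45]        # hypothetical survival times in days
print(adjust_zero_times(days))  # [0.5, 3, 17, 0.5, 45]

# Midpoint argument for a wide time unit of length 1 (say one month):
initial = 0.5                   # initial event assumed at the middle of the unit
outcome = (initial + 1.0) / 2   # outcome midway between then and the end of the unit
print(outcome - initial)        # 0.25
```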
1.3 BASIC STATISTICAL IDEAS
The aim of nearly all studies, including those involving 'survival' data, is to extrapolate from observations made on a sample of individuals to the population as a whole. For example, in a trial of a new treatment for arthritis it is usual to assess the merits of the therapy on a sample of patients (preferably in a randomised controlled trial) and try to deduce from this trial whether the therapy is appropriate for general use in patients with arthritis. In many instances, it may be that the target population is more exactly specified, for example by patients with arthritis of a certain type, of a particular severity, or patients of a certain gender and age group. Nevertheless, the aim remains the same: the inference from the results obtained from a sample to a (larger) population.
MEDIAN SURVIVAL
A commonly reported summary statistic in survival studies is the median survival time. The median survival time is defined as the value for which 50% of the individuals in the study have longer survival times and 50% have shorter survival times. A more formal definition is given in Section 2.3. The reason for reporting this value rather than the mean survival time (the mean is defined in equation (1.1) below) is that the distributions of survival time data often tend to be skewed, sometimes with a small number of long-term 'survivors'. For example, the distribution shown in Figure 1.5(a) of the delay between first symptom and formal diagnosis of cervical cancer in 131 women, ranging from 1 to 610 days, is not symmetric. The median delay to diagnosis for the women with cervical cancer was 135 days. The distribution is skewed to the right, in that the right-hand tail of the distribution is much longer than the left-hand tail. In this situation the mean is not a good summary of the 'average' survival time because it is unduly influenced by the extreme observations.
In this example, we have the duration of the delay in diagnosis for all 131 women. However, the approach used here for calculating the median should not be used if there are censored values amongst our observations. This will usually be the case with survival-type data. In this instance the method described in Section 2.3 is appropriate.
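A small numerical illustration of the point about skewness, using Python's standard statistics module. The nine delays below are invented for the purpose and are not the 131 observations from the cervical cancer study. Every value is fully observed, so the simple median applies.

```python
from statistics import mean, median

# Hypothetical delays in days, with one long right-hand tail value.
delays = [12, 30, 45, 60, 90, 135, 150, 200, 610]

print(median(delays))  # 90
print(mean(delays))    # 148

# Dropping the single extreme observation moves the mean far more than the median.
trimmed = delays[:-1]
print(median(trimmed))  # 75.0
print(mean(trimmed))    # 90.25
```

One extreme value shifts the mean by almost 58 days but the median by only 15. This is the sense in which the mean is unduly influenced by the right-hand tail.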
THE NORMAL DISTRIBUTION
For many types of medical data the histogram of a continuous variable obtained from a single measurement on different subjects will have a characteristic 'bell-shaped' or Normal distribution. For some data which do not have such a distribution, a simple transformation of the variable may help. For example, if we calculate x = log t for each of the n = 131 women discussed above, where t is the delay from first symptom to diagnosis in days, then the distribution of x is given in Figure 1.5(b). This distribution is closer to the Normal distribution shape than that of Figure 1.5(a) and we can therefore calculate the arithmetic mean (more briefly, the mean) of the x's by
\[ \bar{x} = \frac{\sum x}{n} \tag{1.1} \]
to indicate the average value of the data illustrated in Figure 1.5(b). For these data, this gives $\bar{x}$ = 4.88 log days, which corresponds to 132 days.
Now that the distribution has an approximately Normal shape we can express the variability in values about the mean by the standard deviation (SD). This is given by
\[ SD = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}. \tag{1.2} \]
For the women with delay to diagnosis of their cervical cancer, equation (1.2) gives SD = 0.81 log days. From this we can then calculate the standard error (SE) of the mean as
\[ SE(\bar{x}) = \frac{SD}{\sqrt{n}}. \tag{1.3} \]
This gives $SE(\bar{x}) = 0.81/\sqrt{131} = 0.07$ log days.
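The arithmetic in this section can be checked with a few lines of Python. Two assumptions are made in the sketch: the logarithms are natural logarithms (consistent with 4.88 log days corresponding to about 132 days), and the standard deviation uses the usual n - 1 divisor. The function log_scale_summary is a hypothetical helper for raw data. The figures checked at the end are the summary values quoted in the text, since the individual delays are not available here.

```python
import math

def log_scale_summary(times):
    """Mean, SD (n - 1 divisor) and SE of x = log(t), using natural logarithms."""
    x = [math.log(t) for t in times]
    n = len(x)
    mean_x = sum(x) / n
    sd = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
    return mean_x, sd, sd / math.sqrt(n)

# Check of the quoted figures, starting from the reported summary values.
n, mean_x, sd = 131, 4.88, 0.81
se = sd / math.sqrt(n)
print(round(se, 2))             # 0.07
print(round(math.exp(mean_x)))  # 132
```

Back-transforming the mean of the logged values gives the geometric mean of the delays, which is why it lies close to the median of 135 days rather than being pulled up by the long right-hand tail.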
CONFIDENCE INTERVALS
For any statistic, such as a sample mean, $\bar{x}$, it is useful to have an idea of the uncertainty in using this as an estimate of the underlying true population mean, $\mu$. This is done by constructing a (confidence) interval, a range of values around the estimate, which we can be confident includes the true underlying value. Such a confidence interval (CI) for $\mu$ extends evenly either side of $\bar{x}$ by a multiple of the standard error (SE) of the mean. Thus, for example, a 95% CI is the range of values from $\bar{x} - (1.9600 \times SE)$ to $\bar{x} + (1.9600 \times SE)$, while a 99% CI is the range of values from $\bar{x} - (2.5758 \times SE)$ to $\bar{x} + (2.5758 \times SE)$. In general a $100(1 - \alpha)\%$ CI for $\mu$ is given by
\[ \bar{x} - [z_{1-\alpha/2} \times SE(\bar{x})] \quad \text{to} \quad \bar{x} + [z_{1-\alpha/2} \times SE(\bar{x})]. \tag{1.4} \]
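As a hedged sketch of the general interval, the code below takes the multiplier to be the standard Normal quantile cutting off a proportion alpha/2 in the upper tail. That choice reproduces the 1.9600 and 2.5758 quoted above. It then applies the interval to the mean log delay, using the summary values from the text. The function name confidence_interval is an illustrative choice.

```python
import math
from statistics import NormalDist

def confidence_interval(estimate, se, level=0.95):
    """Normal-approximation CI: estimate minus and plus z times SE,
    where z is the standard Normal quantile for 1 - alpha/2."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z * se, estimate + z * se

# Multipliers for 95% and 99% intervals.
print(round(NormalDist().inv_cdf(0.975), 4))  # 1.96
print(round(NormalDist().inv_cdf(0.995), 4))  # 2.5758

# 95% CI for the mean log delay, from the summary values in the text.
mean_x = 4.88
se = 0.81 / math.sqrt(131)
low, high = confidence_interval(mean_x, se)
print(round(low, 2), round(high, 2))                # 4.74 5.02
print(round(math.exp(low)), round(math.exp(high)))  # 115 151
```

The back-transformed limits of roughly 115 to 151 days are on the original scale and refer to the geometric mean delay, because the interval was constructed for the mean of the logged values.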
(Continues...)
Excerpted from Survival Analysis by David Machin, Yin Bun Cheung and Mahesh K.B. Parmar. Copyright © 2006 by John Wiley & Sons, Ltd. Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.