# Survival bracelet information

You are consulting for a clinical research group planning a trial to compare survival rates for proposed and standard cancer treatments.
Use the TWOSAMPLESURVIVAL statement with the TEST=LOGRANK option to compute the required sample size for the log-rank test. The required sample size per group to achieve a power of 0.8 is 228 if the median loss time is 20 years for the proposed treatment. This example covers two commonly used survival analysis models: the exponential model and the Weibull model. The data set e1684 contains the following variables: t is the failure time, which equals the censoring time if the observation was censored; v indicates whether the observation is an actual failure time or a censoring time; treatment indicates two levels of treatments; and ifn indicates the use of interferon as a treatment. Note that this formulation of the exponential distribution is different from what is used in the SAS probability function PDF. The two assignment statements that are commented out calculate the log-likelihood function by using the SAS functions LOGPDF and LOGSDF for the exponential distribution. The next part of this example shows fitting a Weibull regression to the data and then comparing the two models with DIC to see which one provides a better fit to the data. Note that this formulation of the Weibull distribution is different from what is used in the SAS probability function PDF.
As with the exponential model, in the absence of prior information about the parameters in this model, you can use diffuse normal priors on the regression coefficients. You might wish to choose a diffuse gamma distribution for the shape parameter $\alpha$. Note that when $\alpha = 1$, the Weibull survival likelihood reduces to the exponential survival likelihood.
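The reduction of the Weibull likelihood to the exponential likelihood can be checked numerically. The following is a minimal Python sketch, not the SAS program from this example. It assumes the parameterization $S(t) = \exp(-\lambda t^{\alpha})$ for the Weibull model and $S(t) = \exp(-\lambda t)$ for the exponential model; the function names and the rate `lam` are illustrative assumptions.

```python
import math


def exponential_loglik(t, v, lam):
    """Log-likelihood of one subject under an exponential model.

    t: observed time; v: 1 for an event, 0 for a censored time.
    Assumed parameterization: S(t) = exp(-lam * t).
    """
    return v * math.log(lam) - lam * t


def weibull_loglik(t, v, alpha, lam):
    """Log-likelihood of one subject under a Weibull model.

    Assumed parameterization: S(t) = exp(-lam * t**alpha), so the hazard is
    alpha * lam * t**(alpha - 1).
    """
    log_hazard = math.log(alpha) + math.log(lam) + (alpha - 1.0) * math.log(t)
    log_survival = -lam * t ** alpha
    # An event contributes log f(t) = log h(t) + log S(t); a censored time
    # contributes only log S(t).
    return v * log_hazard + log_survival
```

With `alpha` set to 1, `weibull_loglik` returns the same value as `exponential_loglik` for any time and censoring status; any other shape value gives a different likelihood.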
The MONITOR= option indicates the parameters and quantities of interest that PROC MCMC tracks.
An examination of the $\alpha$ parameter reveals that the exponential model might not be appropriate here. The output from PROC FREQ shows that 100% of the 10000 simulated values for $\alpha$ are less than 1. There is a clear decreasing trend over time of the survival probabilities for patients who receive the treatment.
The plot suggests that there is an effect of using interferon because patients who received interferon have sustained better survival probabilities than those who did not. Although the evidence from the Weibull model fit shows that the posterior distribution of $\alpha$ has a significant amount of density mass less than 1, suggesting that the Weibull model is a better fit to the data than the exponential model, you might still be interested in comparing the two models more formally.
The posterior samples of beta0 and beta1 in the data set expsurvout1 are identical to those in the data set expsurvout.
Note: The terms event and failure are used interchangeably in this seminar, as are time to event and failure time.
In this seminar we will be analyzing the data of 500 subjects of the Worcester Heart Attack Study (referred to henceforth as WHAS500, distributed with Hosmer & Lemeshow (2008)). Understanding the mechanics behind survival analysis is aided by facility with the distributions used, which can be derived from the probability density function and cumulative distribution function of survival times.
As an example, we can use the cdf to determine the probability of observing a survival time of up to 100 days. In the graph above we can see that the probability of surviving 200 days or fewer is near 50%. The survivor function, $S(t)$, describes the probability of surviving past time $t$, or $Pr(Time > t)$.
The hazard function, then, describes the relative likelihood of the event occurring at time $t$ ($f(t)$), conditional on the subject's survival up to that time $t$ ($S(t)$). As we have seen before, the hazard appears to be greatest at the beginning of follow-up time and then rapidly declines and finally levels off. Also useful to understand is the cumulative hazard function, which as the name implies, cumulates hazards over time.
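Written out, with $F(t)$ denoting the cdf of survival times (a symbol assumed here; the text names only $f(t)$ and $S(t)$), these definitions are:

$$S(t) = Pr(Time > t) = 1 - F(t), \qquad h(t) = \frac{f(t)}{S(t)}$$

Because $S(t)$ can only decrease over time, a given density $f(t)$ translates into a larger hazard later in follow-up, when fewer subjects remain at risk.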
Let us again think of the hazard function, $h(t)$, as the rate at which failures occur at time $t$. From these equations we can see that the cumulative hazard function $H(t)$ and the survival function $S(t)$ have a simple monotonic relationship, such that when the Survival function is at its maximum at the beginning of analysis time, the cumulative hazard function is at its minimum. We can estimate the cumulative hazard function using proc lifetest, the results of which we send to proc sgplot for plotting.
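The standard relationships behind this statement, using $u$ as an assumed variable of integration, are:

$$H(t) = \int_0^t h(u)\,du, \qquad S(t) = e^{-H(t)}, \qquad H(t) = -\log S(t)$$

At $t = 0$ we have $H(0) = 0$ and $S(0) = 1$. Since $h(u) \ge 0$, $H(t)$ can only increase, so $S(t)$ can only decrease.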
This seminar covers both proc lifetest and proc phreg, and data can be structured in one of 2 ways for survival analysis. A second way to structure the data that only proc phreg accepts is the "counting process" style of input that allows multiple rows of data per subject. This structuring allows the modeling of time-varying covariates, or explanatory variables whose values change across follow-up time. Any serious endeavor into data analysis should begin with data exploration, in which the researcher becomes familiar with the distributions and typical values of each variable individually, as well as relationships between pairs or sets of variables. We see in the table above, that the typical subject in our dataset is more likely male, 70 years of age, with a bmi of 26.6 and heart rate of 87. Looking at the table of "Product-Limit Survival Estimates" below, for the first interval, from 1 day to just before 2 days, $n_i$ = 500, $d_i$ = 8, so $\hat S(1) = \frac{500 - 8}{500} = 0.984$.
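The product-limit calculation can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not what proc lifetest runs internally; the function name and the toy inputs are assumptions. Subjects censored at a given time are counted as at risk for events at that same time.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return (time, n_at_risk, n_events, survival) at each distinct event time.

    times: follow-up times; events: 1 = failure observed, 0 = censored.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # every subject leaves the risk set at their own time
    n_at_risk = len(times)
    survival = 1.0
    rows = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Conditional probability of surviving past t, given at risk at t.
            survival *= (n_at_risk - d) / n_at_risk
            rows.append((t, n_at_risk, d, survival))
        n_at_risk -= leaving[t]
    return rows
```

Calling `kaplan_meier` on 500 subjects of whom 8 fail at day 1 reproduces the first estimate above, $(500 - 8)/500 = 0.984$.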
Survival analysis often begins with examination of the overall survival experience through non-parametric methods, such as Kaplan-Meier (product-limit) and life-table estimators of the survival function. At a minimum proc lifetest requires specification of a failure time variable, here lenfol, on the time statement. Without further specification, SAS will assume all times reported are uncensored, true failures.
We also specify the option atrisk on the proc lifetest statement to display the number at risk in our sample at various time points. Above we see the table of Kaplan-Meier estimates of the survival function produced by proc lifetest. From "LENFOL"=368 to 376, we see that there are several records where it appears no events occurred.
By default, proc lifetest graphs the Kaplan-Meier estimate, even without the plots= option on the proc lifetest statement, so we could have used the same code from above that produced the table of Kaplan-Meier estimates to generate the graph. However, we would like to add confidence bands and the number at risk to the graph, so we add plots=survival(atrisk cb). The step function form of the survival function is apparent in the graph of the Kaplan-Meier estimate. Because of its simple relationship with the survival function, $S(t)=e^{-H(t)}$, the cumulative hazard function can be used to estimate the survival function. The Nelson-Aalen estimator is requested in SAS through the nelson option on the proc lifetest statement. Researchers are often interested in estimates of survival time at which 50% or 25% of the population have died or failed. Suppose that you suspect that the survival function is not the same among some of the groups in your study (some groups tend to fail more quickly than others). When provided with a grouping variable in a strata statement in proc lifetest, SAS will produce graphs of the survival function (unless other graphs are requested) stratified by the grouping variable as well as tests of equality of the survival function across strata. In the graph of the Kaplan-Meier estimator stratified by gender below, it appears that females generally have a worse survival experience. In the output we find three chi-square based tests of the equality of the survival function over strata, which support our suspicion that survival differs between genders. Whereas with non-parametric methods we are typically studying the survival function, with regression methods we examine the hazard function, $h(t)$. In regression models for survival analysis, we attempt to estimate parameters which describe the relationship between our predictors and the hazard rate.
Cox models are typically fitted by maximum likelihood methods, which estimate the regression parameters that maximize the probability of observing the given set of survival times.

The probability of observing subject $j$ fail out of all $R_j$ remaining at-risk subjects, then, is the proportion of the sum total of hazard rates of all $R_j$ subjects that is made up by subject $j$'s hazard rate.
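Under the Cox model, where subject $i$ has hazard $h_i(t) = h_0(t)\exp(x_i\beta)$, the baseline hazard $h_0(t)$ cancels from this proportion. The notation is assumed here: $x_i$ is the covariate vector, $\beta$ the coefficients, $t_j$ the failure time of subject $j$, and $R_j$ is read as the set of subjects still at risk at $t_j$.

$$L_j(\beta) = \frac{h_j(t_j)}{\sum_{i \in R_j} h_i(t_j)} = \frac{\exp(x_j\beta)}{\sum_{i \in R_j} \exp(x_i\beta)}$$

This cancellation is why the regression coefficients can be estimated without specifying the form of the baseline hazard.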
We also would like survival curves based on our model, so we add plots=survival to the proc phreg statement, although as we shall see this specification is probably insufficient for what we want.
On the model statement, on the left side of the equation, we provide the follow up time variable, lenfol, and the censoring variable, fstat, with all censoring values listed in parentheses. Model Fit Statistics: Displays fit statistics which are typically used for model comparison and selection. Analysis of Maximum Likelihood Estimates: Displays model coefficients, tests of significance, and exponentiated coefficient as hazard ratio. When only plots=survival is specified on the proc phreg statement, SAS will produce one graph, a "reference curve" of the survival function at the reference level of all categorical predictors and at the mean of all continuous predictors.
In this model, this reference curve is for males at age 69.845947. Usually, we are interested in comparing survival functions between groups, so we will need to provide SAS with some additional instructions to get these graphs. Acquiring more than one curve, whether survival or hazard, after Cox regression in SAS requires use of the baseline statement in conjunction with the creation of a small dataset of covariate values at which to estimate our curves of interest. This expanded dataset can be named and then viewed with the out= option, but obtaining the out= dataset is not at all necessary to generate the survival plots. Both survival and cumulative hazard curves are available using the plots= option on the proc phreg statement, with the keywords survival and cumhaz, respectively. Let's get survival curves (cumulative hazard curves are also available) for males and females at the mean age of 69.845947 in the manner we just described. We request survival plots that are overlaid with the plots(overlay)=(survival) specification on the proc phreg statement. We also add the rowid= option on the baseline statement, which tells SAS to label the curves on our graph using the variable gender.
The survival curve for females is slightly higher than the curve for males, suggesting that the survival experience is possibly slightly better (if significant) for females, after controlling for age. In our previous model we examined the effects of gender and age on the hazard rate of dying after being hospitalized for heart attack.
In the code below we fit a Cox regression model where we examine the effects of gender, age, bmi, and heart rate on the hazard rate. This example highlights some of the new features of PROC LIFETEST for SAS 9.2, especially the survival plot with number of subjects at risk and multiple comparisons of survival curves. In the following statements, PROC LIFETEST is invoked to compute the product-limit estimate of the survivor function for each risk category. Klein and Moeschberger (1997, Section 4.4) describe in detail how to compute the Hall-Wellner (HW) and equal-precision (EP) confidence bands for the survivor function. Proportional hazard model including the design variables for age using deviation from mean coding. The planned data analysis is a log-rank test to nonparametrically compare the overall survival curves for the two treatments. The "Standard" curve has only one point, specifying an exponential form with a survival probability of 0.5 at year 5. Only six more patients are required in each group if the median loss time is as short as five years. The deviance information criterion (DIC) is used for model selection, and you can also find programs that visualize posterior quantities. The next two assignment statements calculate the log likelihood by using the simplified formula. The posterior means for beta0 and beta1 are estimated with high precision, with small standard errors with respect to the standard deviation. Equivalently, by looking at the posterior distribution of $\alpha$, you can conclude whether fitting an exponential survival model would be more appropriate than the Weibull model. This is a very strong indication that the exponential model is too restrictive to model these data well. However, the effect might not be very significant, as the 95% credible intervals of the two groups do overlap. You can use the Bayesian model selection criterion (see the section Deviance Information Criterion (DIC)) to determine which model fits the data better.
To make meaningful comparisons, you must ensure that all [D]GENERAL functions include appropriate normalizing constants. A smaller DIC indicates a better fit to the data; hence, you can conclude that the Weibull model is more appropriate for this data set.
This study examined several factors, such as age, gender and BMI, that may influence survival time after heart attack.
That is, for some subjects we do not know when they died after heart attack, but we do know at least how many days they survived. Thus, each term in the product is the conditional probability of survival beyond time $t_i$, meaning the probability of surviving beyond time $t_i$, given the subject has survived up to time $t_i$. Each row of the table corresponds to an interval of time, beginning at the time in the "LENFOL" column for that row, and ending just before the time in the "LENFOL" column in the first subsequent row that has a different "LENFOL" value. When a subject dies at a particular time point, the step function drops, whereas in between failure times the graph remains flat. SAS will output both Kaplan-Meier estimates of the survival function and Nelson-Aalen estimates of the cumulative hazard function in one table. In a nutshell, these statistics sum the weighted differences between the observed number of failures and the expected number of failures for each stratum at each timepoint, assuming the same survival function for each stratum.
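As a concrete instance of such a statistic, here is a minimal Python sketch of the two-group log-rank test, in which every timepoint receives a weight of 1. It is an illustration under stated assumptions rather than the proc lifetest implementation: the function name, the 0/1 group coding, and the hypergeometric variance formula are choices made for this sketch.

```python
def logrank_statistic(times, events, groups):
    """Two-group log-rank chi-square statistic (1 degree of freedom).

    times: follow-up times; events: 1 = failure, 0 = censored;
    groups: 0 or 1 for each subject.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Risk set sizes just before time t, overall and in group 1.
        n = sum(1 for x in times if x >= t)
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        # Failures at time t, overall and in group 1.
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e == 1 and g == 1)
        # Expected failures in group 1 if both groups share one survival function.
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0
    return observed_minus_expected ** 2 / variance
```

When the two groups have identical failure times the observed and expected counts agree at every timepoint and the statistic is 0; the more one group's failures precede the other's, the larger it becomes.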
From the plot we can see that the hazard function indeed appears higher at the beginning of follow-up time and then decreases until it levels off at around 500 days and stays low and mostly constant. In the following DATA step, data of 137 bone marrow transplant patients extracted from Klein and Moeschberger (1997) are saved in the data set BMT. Patients in the AML-Low Risk group have longer disease-free survival than those in the ALL group, who in turn fare better than those in the AML-High Risk group. There is no significant difference in disease-free survivor functions between the ALL and AML-High Risk groups (p=0.2779). You can use the DIFF= option in the STRATA statement to designate this risk group as the control and apply a multiple-comparison adjustment to the p-values for the paired comparison between the AML-Low Risk group with each of the other groups.
You can output these simultaneous confidence intervals to a SAS data set by using the CONFBAND= and OUTSURV= options in the PROC LIFETEST statement. Alternatively, you can use the simplified log-likelihood function, which is more computationally efficient. The first approach is slower because of the redundant calculation involved in calling both LOGPDF and LOGSDF.
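The equivalence of the two approaches for the exponential model can be sketched in Python. The two helper functions below are plain stand-ins for the log density and log survival terms, not the SAS LOGPDF and LOGSDF functions, and the rate parameterization $S(t) = \exp(-\lambda t)$ is an assumption of this sketch.

```python
import math


def loglik_via_pdf_sdf(t, v, lam):
    """Event times contribute log f(t); censored times contribute log S(t)."""
    log_pdf = math.log(lam) - lam * t  # log of lam * exp(-lam * t)
    log_sdf = -lam * t                 # log of exp(-lam * t)
    return v * log_pdf + (1 - v) * log_sdf


def loglik_simplified(t, v, lam):
    """Same quantity after collecting the shared -lam * t term."""
    return v * math.log(lam) - lam * t
```

Both functions return the same value for every observation. The first evaluates `-lam * t` twice, which is the redundancy that the simplified form avoids.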
This indicates that the mean estimates have stabilized and do not vary greatly in the course of the simulation. The array surv_ifn stores the expected survival probabilities for patients who received interferon over a period of 10 years.
As noted previously, if $\alpha = 1$, then the Weibull survival distribution is the exponential survival distribution. You can examine the estimated survival probabilities over time individually, either through the posterior summary statistics or by looking at the kernel density plots. In this case, you want to overlay the two predicted curves for the two groups of patients and add the corresponding credible interval. The PROC MCMC DIC option requests the calculation of DIC, and the procedure displays the ODS output table DIC. You can see the equivalence with the exponential model you fitted in Exponential Survival Model by running the following comparison.

At the time of transplant, each patient is classified into one of three risk categories: ALL (acute lymphoblastic leukemia), AML (acute myelocytic leukemia)-Low Risk, and AML-High Risk.
The PLOTS= option requests that the survival curves be plotted, and the ATRISK= suboption specifies the time points at which the at-risk numbers are displayed. You can display survival curves with pointwise and simultaneous confidence limits through ODS Graphics.
The survival curve for patients on the standard treatment is well known to be approximately exponential with a median survival time of five years. The GROUPSURVIVAL= option assigns the survival curves to the two groups, and the ACCRUALTIME= and FOLLOWUPTIME= options specify the accrual and follow-up times. This example shows you how to use PROC MCMC to analyze the treatment effect for the E1684 melanoma clinical trial data. Quantities of interest in survival analysis include the value of the survival function at specific times for specific treatments and the relationship between the survival curves for different treatments. Similarly, surv_noifn stores the expected survival probabilities for patients who did not receive interferon. Alternatively, you might find it more informative to examine these quantities in relation with each other. The table includes the posterior mean of the deviance, $\overline{D}$, the deviance at the estimate, $D(\bar\theta)$, the effective number of parameters, $p_D$, and DIC.
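For reference, the usual definitions of these four quantities are given below. This is a sketch in assumed notation, with $y$ the data, $\theta$ the parameters, $\bar\theta$ the posterior mean of $\theta$, and the standardizing term omitted.

$$D(\theta) = -2\log p(y \mid \theta), \qquad \overline{D} = E_{\theta \mid y}\left[D(\theta)\right], \qquad p_D = \overline{D} - D(\bar\theta), \qquad \mathrm{DIC} = \overline{D} + p_D$$

In practice $\overline{D}$ is estimated by averaging $D(\theta)$ over the posterior draws.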
The endpoint of interest is the disease-free survival time, which is the time to death or relapse or the end of the study in days. In the STRATA statement, the ADJUST=SIDAK option requests the Šidák multiple-comparison adjustment, and by default, all paired comparisons are carried out. When the survival data are stratified, displaying all the survival curves and their confidence limits in the same plot can make the plot appear cluttered.
These data were collected to assess the effectiveness of using interferon alpha-2b in chemotherapeutic treatment of melanoma.
With PROC MCMC, you can compute a sample from the posterior distribution of the survival functions of interest at any number of points. The BEGINNODATA and ENDNODATA statements enclose the calculations for the survival probabilities.
For example, you can use a side-by-side box plot to display these posterior distributions by using PROC SGPLOT (Statistical Graphics Using ODS).
To generate the graph, you first take the posterior mean estimates from the ODS output table ds and the lower and upper HPD interval estimates from the ODS output table is, store them in the data set surv, and draw the figure by using PROC SGPLOT. It is important to remember that the standardizing term, which is a function of the data alone, is not taken into account in calculating the DIC.
Additionally, another variable counts the number of events occurring in each interval (either 0 or 1 in Cox regression, same as the censoring variable). Other nonparametric tests using other weighting schemes are available through the test= option on the strata statement.
Instead, we need only assume that whatever the baseline hazard function is, covariate effects multiplicatively shift the hazard function and these multiplicative shifts are constant over time. In this data set, the variable Group represents the patient's risk category, the variable T represents the disease-free survival time, and the variable Status is the censoring indicator, with the value 1 indicating an event time and the value 0 a censored time. In the following statements, the PLOTS= specification requests that the survivor functions be displayed along with their pointwise confidence limits (CL) and Hall-Wellner confidence bands (CB=HW). Patients will be accrued uniformly over two years and then followed for an additional three years past the accrual period. The data in this example range from about 0 to 10 years, and the treatment of interest is the use of interferon. The assignment statements preceding the MODEL statement calculate the log likelihood for the Weibull survival model.
First you need to take the posterior output data set weisurvout and stack variables that you want to plot. This term is irrelevant only if you compare two models that have the same likelihood function. As an example, imagine subject 1 in the table above, who died at 2,178 days, was in a treatment group of interest for the first 100 days after hospital admission.
The red curve representing the lowest BMI category is truncated on the right because the last person in that group died long before the end of followup time. The STRATA=PANEL specification requests that the survival curves be displayed in a panel of three plots, one for each risk group.
Some loss to follow-up is expected, with roughly exponential rates that would result in about 50% loss with the standard treatment within 10 years.
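Statements such as "50% loss within 10 years" or "median survival of five years" pin down an exponential curve completely, because an exponential distribution with hazard $\lambda$ has median $\log(2)/\lambda$. A small Python sketch of the conversion follows; the function names are illustrative assumptions.

```python
import math


def exponential_hazard_from_median(median_time):
    """Constant hazard whose exponential survival curve equals 0.5 at median_time."""
    return math.log(2.0) / median_time


def exponential_survival(t, hazard):
    """Probability of remaining event-free past time t under a constant hazard."""
    return math.exp(-hazard * t)
```

For example, a median loss time of 10 years corresponds to a loss hazard of about 0.069 per year, and a median survival time of 5 years to a hazard of about 0.139 per year.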
As in the previous exponential model example, there are two ways to fit this model: using the SAS functions LOGPDF and LOGSDF, or using the simplified log-likelihood function. For example, to plot all the survival times for patients who received interferon, you want to stack surv_inf1–surv_inf10.
If you do not have identical likelihood functions, using DIC for model selection purposes without taking this standardizing term into account can produce incorrect results. The loss to follow-up with the proposed treatment is more difficult to predict, but 50% loss would be expected to occur sometime between years 5 and 20. An examination of the trace plots for beta0, beta1, and $\alpha$ (not displayed here) reveals that the sampling has gone well, with no particular concerns about the convergence or mixing of the chains.
The macro %Stackdata takes an input data set dataset, stacks the wanted variables vars, and outputs them into the output data set. In addition, you want to be careful in interpreting the DIC whenever you use the GENERAL function to construct the log-likelihood, as the case in this example. Here we see the estimated pdf of survival times in the whas500 set, from which all censored observations were removed to aid presentation and explanation. In very large samples the Kaplan-Meier estimator and the transformed Nelson-Aalen (Breslow) estimator will converge. Using the GENERAL function, you can obtain identical posterior samples with two log-likelihood functions that differ only by a constant. This difference translates to a difference in the DIC calculation, which could be very misleading.
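The effect of a dropped constant on DIC can be seen with simple arithmetic. The Python sketch below assumes the usual definitions (deviance equal to $-2$ times the log likelihood, and DIC equal to the mean deviance plus the effective number of parameters); the function names are illustrative, not part of PROC MCMC.

```python
def dic(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, effective number of parameters) from posterior deviances."""
    d_bar = sum(deviance_samples) / len(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean
    return d_bar + p_d, p_d


def shift_deviance(deviance, constant):
    """Deviance after adding `constant` to the log-likelihood.

    Deviance is -2 * log-likelihood, so it moves by -2 * constant.
    """
    return deviance - 2.0 * constant
```

Adding a constant $c$ to the log likelihood leaves the posterior samples and the effective number of parameters unchanged but moves DIC by $-2c$. Two codings of the same model can therefore report different DIC values unless both include the same normalizing constants.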
PROC LIFETEST is used to compute the Kaplan-Meier (1958) curve. All examples use the 12.1 release of SAS software from 2012. However, if you do not want to work out the mathematical detail or you are uncertain of the equivalence, a better way of comparing the DICs is to run the Weibull model twice: once with $\alpha$ being a parameter and once with $\alpha = 1$. This ensures that the likelihood functions are the same, and the DIC comparison is meaningful. This example also illustrates the use of the CL option, which displays pointwise confidence limits for the survivor function.
