This ADaMIG (v1.2) is taken from the following link on the CDISC website:
https://www.cdisc.org/standards/foundational/adam/adam-implementation-guide-v1-2-release-package
Because of length limits, it is split into two parts.
4 Implementation Issues, Standard Solutions, and Examples
4 实施问题,标准解决方案和示例
The ADaM standard variables (columns) are described in Section 3, Standard ADaM Variables. However, there is more to ADaM than just using ADaM standard variables. The purpose of this section is to provide additional guidance on how to implement ADaM standard datasets correctly. Each example provided in this section illustrates one compliant solution to a common analysis issue.
第3节“ 标准ADaM变量” 中介绍了ADaM标准变量(列)。但是,ADaM不仅仅是使用ADaM标准变量。本部分的目的是提供有关如何正确实现ADaM标准数据集的其他指导。本节中提供的每个示例都说明了一个针对常见分析问题的兼容解决方案。
Section 4.1, Examples of Treatment Variables for Common Trial Designs, provides examples of treatment variables for common trial designs.
第4.1节“常见试验设计的治疗变量示例”提供了常见试验设计的治疗变量示例。
Sections 4.2-4.9 are concerned with the BDS. These sections provide standard solutions to BDS implementation issues, illustrated with examples. Section 4.2, Creation of Derived Columns Versus Creation of Derived Rows, focuses on assembling the rows and columns of the dataset. Sections 4.3, Inclusion of All Observed and Derived Records for a Parameter Versus the Subset of Records Used for Analysis, and 4.4, Inclusion of Input Data that Are Not Analyzed but that Support a Derivation in the ADaM Dataset, discuss issues around the inclusion/exclusion of rows not used in an analysis. Sections 4.5, Identification of Records Used for Analysis, 4.6, Identification of Population-Specific Analyzed Records, and 4.7, Identification of Records which Satisfy a Predefined Criterion for Analysis Purposes, discuss issues around identification of rows for analysis. Section 4.8, Examples of Timing Variables, contains an example of the use of the BDS variables for phase, period, and subperiod. Section 4.9, Examples of Bi-Directional Lab Toxicity Variables, contains an example using bi-directional toxicity grading to support the creation of shift tables. Section 4.10, Other Issues to Consider, provides comment on additional issues that may arise.
4.2-4.9节与BDS有关。这些部分提供了有关BDS实施问题的标准解决方案,并带有示例。第4.2节“派生列的创建与派生行的创建”专注于组装数据集的行和列。第4.3节,包含参数的所有观测记录和派生记录与用于分析的记录子集有关;第4.4节,包含未分析但支持ADaM数据集派生的输入数据,讨论了有关包含/排除的问题分析中未使用的行数。第4.5节,用于分析的记录的标识,4.6,特定于人群的分析记录的标识,以及4.7,满足用于分析目的的预定标准的记录的标识,讨论了有关标识要分析的行的问题。第4.8节“时序变量示例”包含一个在相位,周期和相位上使用BDS变量的示例。次时期。第4.9节“双向实验室毒性变量示例”包含一个使用双向毒性分级支持创建移动表的示例。第4.10节“其他要考虑的问题”对可能出现的其他问题进行了评论。
For examples of the OCCDS, refer to the separate document "ADaM Structure for Occurrence Data" (available at https://www.cdisc.org/standards/foundational/adam).
Due to space considerations, the examples do not show complete datasets with all of the required and permissible variables. Rather, only those variables needed to illustrate the point being discussed are shown.
有关OCCDS的示例,请参阅单独的文档“事件数据的ADaM结构”(可从 https://www.cdisc.org/standards/foundational/adam获取)。
出于篇幅考虑,这些示例并未显示包含所有必需变量和允许变量的完整数据集。相反,仅显示了说明所讨论问题所需的那些变量。
4.1 Examples of Treatment Variables for Common Trial Designs
4.1 常见试验设计的治疗变量示例
Examples 1-4 in this section illustrate the concepts related to treatment variables in ADSL for several different trial designs, including a parallel design, two crossover designs, and a parallel design with an open-label extension. Note that only selected variables are illustrated; these examples are not intended to imply that these are the only variables in ADSL. Examples 5 and 6 illustrate concepts related to treatment variables in BDS.
本节中的示例1-4说明了与ADSL中用于多个不同试验设计的处理变量有关的概念,这些设计包括并行设计,两个交叉设计以及带有开放标签扩展名的并行设计。请注意,仅显示了选定的变量。这些示例并非旨在暗示这些是ADSL中唯一的变量。示例5和6说明了与BDS中的治疗变量有关的概念。
Example 1
In Table 4.1.1, the treatment variables for three subjects in a parallel design study (one treatment period) are illustrated. Note that the third subject was randomized to active treatment yet received placebo instead. TR01SDT and TR01EDT are not required variables in trial designs that do not involve multiple treatment periods.
在表4.1.1中,说明了平行设计研究(一个治疗期)中三名受试者的治疗变量。注意,第三位受试者被随机分配接受积极治疗,但改为接受安慰剂。在不涉及多个治疗期的试验设计中,TR01SDT和TR01EDT不是必需变量。
Table 4.1.1 Randomized Parallel Design – ADSL Dataset
表4.1.1随机平行设计– ADSL数据集
| Row | USUBJID | ARM | ACTARM | TRT01P | TRT01A | TRTSDT | TRTEDT |
|---|---|---|---|---|---|---|---|
| 1 | 1001 | Drug X 5 mg | Drug X 5 mg | Drug X 5 mg | Drug X 5 mg | 23OCT2007 | 17DEC2007 |
| 2 | 1002 | Placebo | Placebo | Placebo | Placebo | 19JUL2006 | 20SEP2007 |
| 3 | 1003 | Drug X 5 mg | Placebo | Drug X 5 mg | Placebo | 01NOV2007 | 20NOV2007 |
Example 2
Table 4.1.2 illustrates the treatment variables for three subjects in a two-period crossover design. It should be noted that TRTSDT and TRTEDT are not displayed, but TRTSDT=TR01SDT and TRTEDT is the maximum of TR01EDT and TR02EDT as some subjects may have discontinued before receiving TRT02P. Note that Subjects 1002 and 1003 (in Rows 2 and 3) were each exposed to placebo for both trial periods.
表4.1.2说明了在两个时期的交叉设计中三个对象的治疗变量。应该注意的是,未显示TRTSDT和TRTEDT,但是TRTSDT = TR01SDT和TRTEDT是TR01EDT和TR02EDT的最大值,因为某些对象在接受TRT02P之前可能已经中断。请注意,受试者1002和1003(在第2行和第3行中)均在两个试验期内均接受了安慰剂。
Table 4.1.2 Two-Period Crossover Design – ADSL Dataset
| Row | USUBJID | TRTSEQP | TRT01P | TRT02P | TRTSEQA | TRT01A | TRT02A | TR01SDT | TR01EDT | TR02SDT | TR02EDT |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1001 | Placebo – Drug X | Placebo | Drug X | Placebo – Drug X | Placebo | Drug X | 15FEB2006 | 03MAY2006 | 10MAY2006 | 15AUG2006 |
| 2 | 1002 | Placebo – Drug X | Placebo | Drug X | Placebo – Placebo | Placebo | Placebo | 01MAR2006 | 12JUN2006 | 20JUN2006 | 23SEP2006 |
| 3 | 1003 | Drug X – Placebo | Drug X | Placebo | Placebo – Placebo | Placebo | Placebo | 03FEB2006 | 25APR2006 | 01MAY2006 | 04AUG2006 |
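The relationship between the overall and per-period treatment dates described in Example 2 (TRTSDT = TR01SDT, and TRTEDT is the latest available period end date) can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the dates are assumed to be DATE9-style strings with English month abbreviations, and the discontinued subject is invented to show why a maximum is taken instead of simply using TR02EDT.

```python
from datetime import datetime

def parse_date9(text):
    """Parse a DATE9-style string such as '15FEB2006' (English month names assumed)."""
    return datetime.strptime(text, "%d%b%Y").date()

def overall_treatment_dates(periods):
    """Return (TRTSDT, TRTEDT) from a list of (TRxxSDT, TRxxEDT) pairs.

    A period the subject never started is given as (None, None).
    """
    trtsdt = periods[0][0]  # TRTSDT = TR01SDT
    ends = [end for _, end in periods if end is not None]
    trtedt = max(ends) if ends else None  # latest available period end date
    return trtsdt, trtedt

# Subject 1001 from Table 4.1.2.
completed = [
    (parse_date9("15FEB2006"), parse_date9("03MAY2006")),
    (parse_date9("10MAY2006"), parse_date9("15AUG2006")),
]
# Hypothetical subject who discontinued before period 2 (not in the table).
discontinued = [
    (parse_date9("15FEB2006"), parse_date9("03MAY2006")),
    (None, None),
]

completed_dates = overall_treatment_dates(completed)
discontinued_dates = overall_treatment_dates(discontinued)
print(completed_dates, discontinued_dates)
```

The same function covers the three-period design by passing three pairs.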
Example 3
Table 4.1.3 illustrates the treatment variables for three subjects in a three-period crossover design. It should be noted that TRTSDT and TRTEDT are not displayed, but TRTSDT=TR01SDT and TRTEDT is the maximum of TR01EDT, TR02EDT, and TR03EDT as some subjects may have discontinued before receiving TRT03P. In this trial, all subjects received the planned treatment at each period so the TRTxxA variables are not needed.
表4.1.3说明了三期交叉设计中三名受试者的治疗变量。应该注意的是,未显示TRTSDT和TRTEDT,但是TRTSDT = TR01SDT和TRTEDT是TR01EDT,TR02EDT和TR03EDT的最大值,因为某些对象在接受TRT03P之前可能已经中断。在该试验中,所有受试者均在每个时期接受了计划的治疗,因此不需要TRTxxA变量。
Table 4.1.3 Three-Period Crossover Design – ADSL Dataset
| Row | USUBJID | TRTSEQP | TRT01P | TRT02P | TRT03P | TR01SDT | TR01EDT | TR02SDT | TR02EDT | TR03SDT | TR03EDT |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1001 | Placebo – Drug X – Drug Y | Placebo | Drug X | Drug Y | 15FEB2006 | 03MAY2006 | 10MAY2006 | 15AUG2006 | 23AUG2006 | 14NOV2006 |
| 2 | 1002 | Drug Y – Placebo – Drug X | Drug Y | Placebo | Drug X | 01MAR2006 | 12JUN2006 | 20JUN2006 | 23SEP2006 | 01OCT2006 | 05DEC2006 |
| 3 | 1003 | Drug X – Drug Y – Placebo | Drug X | Drug Y | Placebo | 03FEB2006 | 25APR2006 | 01MAY2006 | 04AUG2006 | 12AUG2006 | 15OCT2006 |
Example 4
Table 4.1.4 illustrates the treatment variables for two subjects from a double-blind, parallel design study with an open-label extension. The variable TRT01P was used for the planned treatment to which the subject was randomized in the double-blind portion, and TRT02P was used for the planned treatment in the open-label portion.
表4.1.4说明了来自双盲,平行设计研究(带有开放标签扩展名)的两名受试者的治疗变量。变量TRT01P用于计划的治疗,受试者在双盲部分中被随机分配,而TRT02P用于计划的治疗在开放标签部分。
Table 4.1.4 Parallel Design Study with an Open-label Extension – ADSL Dataset
| Row | USUBJID | TRTSEQP | TRT01P | TRT02P | TR01SDT | TR01EDT | TR02SDT | TR02EDT |
|---|---|---|---|---|---|---|---|---|
| 1 | 1001 | Drug X 5 mg - Drug X 5 mg | Drug X 5 mg | Drug X 5 mg | 14AUG2007 | 20SEP2007 | 21SEP2007 | 15MAR2008 |
| 2 | 1002 | Placebo - Drug X 5 mg | Placebo | Drug X 5 mg | 05JUL2007 | 15AUG2007 | 17AUG2007 | 04FEB2008 |
Examples 5 and 6 build on the ADSL dataset illustrated in Table 4.1.4.
示例5和6建立在表4.1.4中所示的ADSL数据集上。
As stated in Section 3.3.2, Record-Level Treatment and Dose Variables for BDS Datasets, at least one treatment variable is required in a BDS dataset. This requirement is satisfied by any of the subject-level or record-level treatment variables (e.g., TRTxxP, TRTP). The following two examples illustrate some possible approaches for BDS treatment variables. These examples are not meant to imply a standard or best practice; they are for illustration purposes only. Please refer to Section 3.3.2, Record-Level Treatment and Dose Variables for BDS Datasets, for important additional information.
如第3.3.2节“ BDS数据集的记录级处理和剂量变量”所述,BDS数据集中至少需要一个处理变量。任何受试者级别或记录级别的治疗变量(例如,TRTxxP,TRTP)都可以满足此要求。以下两个示例说明了BDS处理变量的一些可能方法。这些示例并不意味着暗示标准或最佳实践。它们仅用于说明目的。有关重要的附加信息,请参阅第3.3.2节“ BDS数据集的记录级处理和剂量变量”。
Example 5
In Table 4.1.5, the ADSL treatment variables have been included in the BDS dataset. In addition, TRTP contains the planned treatment associated with the assessment (e.g., at ADT). The inclusion of both the ADSL treatment variables and TRTP allows this dataset to support multiple analysis strategies. If the data are analyzed using the randomized treatment from the double-blinded trial, then TRT01P can be used as the treatment variable in the analysis. If the data are analyzed using the treatment assigned at the time of the assessment, then TRTP can be used as the treatment variable in the analysis. In this example, TRTP is blank for assessments that are not on-treatment.
在表4.1.5中,ADSL处理变量已包含在 BDS数据集中。此外,TRTP包含与评估相关的计划治疗(例如,在ADT)。同时包含ADSL处理变量和TRTP允许该数据集支持多种分析策略。如果使用双盲试验中的随机治疗方法分析数据,则可以将TRT01P用作分析中的治疗变量。如果数据是使用评估时指定的治疗方法进行分析,然后将TRTP用作分析中的治疗变量。在此示例中,对于未进行评估的评估,TRTP为空白。
Table 4.1.5 Parallel Design Study with an Open-label Extension – BDS Dataset, Illustration 1
表4.1.5具有开放标签扩展名的并行设计研究– BDS数据集,图1
| Row | USUBJID | APERIOD | ADT | TRTP | TRT01P | TRT02P |
|---|---|---|---|---|---|---|
| 1 | 1001 | | 10AUG2007 | | Drug X 5 mg | Drug X 5 mg |
| 2 | 1001 | 1 | 14AUG2007 | Drug X 5 mg | Drug X 5 mg | Drug X 5 mg |
| 3 | 1001 | 2 | 21SEP2007 | Drug X 5 mg | Drug X 5 mg | Drug X 5 mg |
| 4 | 1002 | | 01JUL2007 | | Placebo | Drug X 5 mg |
| 5 | 1002 | 1 | 05JUL2007 | Placebo | Placebo | Drug X 5 mg |
| 6 | 1002 | 2 | 17AUG2007 | Drug X 5 mg | Placebo | Drug X 5 mg |
Example 6
Table 4.1.6 demonstrates a different approach from the one illustrated in Example 5. In this approach, TRTP contains the treatment being used for the analysis of that record. Note that the assessments occurring prior to Period 1 have TRTP populated in order to support analysis of all records by planned treatment group, even though subjects had not yet been treated (see TR01SDT in Table 4.1.4).
表4.1.6演示了与示例5中所示方法不同的方法。在此方法中,TRTP包含用于分析该记录的处理。请注意,即使受试者尚未接受治疗,也要填充第1期之前进行的评估以支持计划治疗组对所有记录的分析(请参阅表4.1.4中的TR01SDT)。
Table 4.1.6 Parallel Design Study with an Open-label Extension – BDS Dataset, Illustration 2
表4.1.6 带有开放标签扩展名的平行设计研究– BDS数据集,图2
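| Row | USUBJID | APERIOD | ADT | TRTP |
|---|---|---|---|---|
| 1 | 1001 | | 10AUG2007 | Drug X 5 mg |
| 2 | 1001 | 1 | 14AUG2007 | Drug X 5 mg |
| 3 | 1001 | 2 | 21SEP2007 | Drug X 5 mg |
| 4 | 1002 | | 01JUL2007 | Placebo |
| 5 | 1002 | 1 | 05JUL2007 | Placebo |
| 6 | 1002 | 2 | 17AUG2007 | Drug X 5 mg |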
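The two TRTP strategies in Examples 5 and 6 can be contrasted with a small Python sketch. The function names are hypothetical, and the rule used for the second strategy (pre-treatment records take the Period 1 planned treatment) is an assumption inferred from Table 4.1.6, not a rule stated in the guide.

```python
def trtp_on_treatment_only(aperiod, planned_by_period):
    """Illustration 1: planned treatment of the record's period, blank if not on-treatment."""
    if aperiod is None:
        return ""
    return planned_by_period[aperiod]

def trtp_for_analysis(aperiod, planned_by_period):
    """Illustration 2: pre-treatment records carry the Period 1 planned treatment (assumed rule)."""
    return planned_by_period[aperiod if aperiod is not None else 1]

# Subject 1002: TRT01P and TRT02P from Table 4.1.4.
planned_1002 = {1: "Placebo", 2: "Drug X 5 mg"}
# (APERIOD, ADT) pairs for subject 1002; APERIOD is None before treatment starts.
records_1002 = [(None, "01JUL2007"), (1, "05JUL2007"), (2, "17AUG2007")]

illustration_1 = [trtp_on_treatment_only(p, planned_1002) for p, _ in records_1002]
illustration_2 = [trtp_for_analysis(p, planned_1002) for p, _ in records_1002]
print(illustration_1)
print(illustration_2)
```

The two lists differ only on the pre-treatment record, which is the point on which the two illustrations differ.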
4.2 Creation of Derived Columns Versus Creation of Derived Rows
4.2 派生列的创建与派生行的创建
This section provides specific rules to use in building a BDS dataset. These rules are essential, because they ensure the BDS dataset is analysis-focused, with all analysis-enabling variables and supportive variables included in a predictable structure, while preventing a "horizontalization" of the dataset.
本节提供了用于构建BDS数据集的特定规则。这些规则至关重要,因为它们确保BDS数据集以分析为重点,所有可启用分析的变量和支持变量都包含在可预测的结构中,同时防止数据集的“水平化”。
The rows (i.e., records) in the ADaM BDS represent subject data for analysis parameters and timepoints (as applicable). There may be multiple rows within a given combination of subject, parameter, and timepoint, depending on the number of observations collected or derived, baseline definition, etc.
ADaM BDS中的行(即记录)代表分析参数和时间点(如适用)的主题数据。给定的主题,参数和时间点组合中可能有多行,具体取决于收集或导出的观察数,基线定义等。
The ADaM BDS structure contains a central set of columns (i.e., variables) that represent the data being analyzed. These variables include the value being analyzed (e.g., AVAL) and the description of the value being analyzed (e.g., PARAM). Other columns in the dataset provide more information about the value being analyzed (e.g., the subject identification) or describe and trace its derivation (e.g., DTYPE) or support its analysis (e.g., treatment variables, covariates). Standard columns exist for a variety of purposes, such as SDTM record identifiers for traceability, population and other record selection flags, analysis values, and some standard functions of analysis values.
Permissible columns are not limited to those whose variable names are specified in Section 3, Standard ADaM Variables, and may include study-specific analysis model covariates, subgrouping variables, variables supportive of traceability, and other variables needed for analysis or useful for review.
ADaM BDS结构包含一组代表要分析的数据的中央列(即变量)。这些变量包括要分析的值(例如AVAL)和要分析的值的描述(例如PARAM)。数据集中的其他列提供有关正在分析的值(例如,受试者标识)的更多信息,或者描述和跟踪其推导(例如,DTYPE),或支持其分析(例如,治疗变量,协变量)。标准列存在多种用途,例如用于可追溯性的SDTM记录标识符,填充和其他记录选择标记,分析值以及分析值的某些标准功能。允许的列不限于其变量名称在第3节“标准ADaM 变量” 中指定的列,并且可以包括特定于研究的分析模型协变量,分组变量,支持可追溯性的变量以及其他分析所需或对审核有用的变量。
The BDS is flexible in that derived data can be added to the collected data as additional rows and columns that support the analyses and provide traceability. However, there are some constraints on how to incorporate derived data in the BDS dataset. Specifically, the subject of this section is to address when derived data that are functions of analysis values should be added as additional columns, and when they should be added as additional rows instead.
BDS的灵活性在于,可以将派生数据作为支持分析并提供可追溯性的其他行和列添加到收集的数据中。但是,在如何将派生数据合并到BDS数据集中存在一些限制。具体而言,本节的主题是要解决何时应将作为分析值函数的派生数据添加为附加列,以及何时应将其作为附加行添加。
The precise sequence of steps involved in creating a BDS ADaM dataset varies according to operational and study-specific needs. For the purposes of this discussion, it is useful to consider two fundamental steps.
创建BDS ADaM数据集所涉及的步骤的确切顺序会根据操作和研究的特定需求而变化。为了便于讨论,考虑两个基本步骤是很有用的。
Create an initial dataset from the source datasets. The first step is to create a set of rows and columns more or less directly derived from or loaded from input datasets (primarily SDTM datasets and other ADaM datasets) into their appropriate places. This step will include creation and population of columns containing analysis parameter (PARAM), analysis timepoint (e.g., AVISIT) and analysis values (e.g., AVAL, AVALC). It would also include adding columns containing identifiers (e.g., STUDYID, USUBJID, SUBJID, SITEID) and other SDTM variables for traceability (e.g., VISIT, --SEQ).
从源数据集创建初始数据集。第一步是创建一组行和列,它们或多或少直接从输入数据集(主要是SDTM数据集和其他ADaM数据集)派生或加载到它们的适当位置。此步骤将包括创建和填充包含分析参数(PARAM),分析时间点(例如AVISIT)和分析值(例如AVAL,AVALC)的列。它还将包括添加包含标识符(例如,STUDYID,USUBJID,SUBJID,SITEID)和其他可追溯性的SDTM变量(例如,VISIT,--SEQ)的列。
Add additional derived data as needed for the analysis. The second step consists of adding derived rows and columns based on the initial set of ADaM dataset records and columns. The rules below govern this step. These rules are further described and illustrated in the remaining subsections of this section.
根据分析需要添加其他派生数据。第二步包括根据ADaM数据集记录和列的初始集合添加派生的行和列。以下规则支配了此步骤。这些规则将在本节的其余小节中进一步描述和说明。
Rule 1: A parameter-invariant function of AVAL and BASE on the same row that does not involve a transform of BASE should be added as a new column.
规则1:应在不涉及BASE转换的同一行上添加AVAL和BASE的参数不变函数作为新列。
Rule 2: A transformation of AVAL that does not meet the conditions of Rule 1 should be added as a new parameter, and AVAL should contain the transformed value.
规则2:应添加不符合规则1条件的AVAL转换作为新参数,并且AVAL应该包含转换后的值。
Rule 3: A function of one or more rows within the same parameter for the purpose of creating an analysis timepoint should be added as a new row for the same parameter.
规则3:为了创建分析时间点,应将同一参数中一个或多个行的功能添加为同一参数的新行。
Rule 4: A function of multiple rows within a parameter should be added as a new parameter.
规则4:应将参数中具有多个行的功能添加为新参数。
Rule 5: A function of more than one parameter should be added as a new parameter.
规则5:应将一个以上参数的功能添加为新参数。
Rule 6: When there is more than one definition of baseline, each additional definition of baseline requires the creation of its own set of rows.
规则6:当存在多个基准定义时,每个附加的基准定义都需要创建自己的一组行。
It is important to understand that the rules outlined here are specific to rows and columns that are created based on data already present in the ADaM dataset. The rules do not apply to data that are copied or derived directly from other datasets (either SDTM or ADaM, or both). For example, how to include a transformation of AVAL within the same dataset is governed by the rules, but the inclusion of a covariate derived from another dataset (e.g., inclusion of a variable from ADSL) is not governed by these rules.
重要的是要了解,此处概述的规则特定于基于ADaM数据集中已经存在的数据创建的行和列。该规则不适用于直接从其他数据集(SDTM或ADaM或两者)复制或派生的数据。例如,如何在同一数据集中包含AVAL的转换受规则支配,但是从另一数据集派生的协变量的包含(例如,包含来自ADSL的变量)不受这些规则支配。
4.2.1 Rules for the Creation of Rows and Columns
4.2.1 创建行和列的规则
To preserve the BDS, it is necessary to place constraints on when one is allowed to create derived columns. Rule 1 describes when derived data belongs in columns. Rules 2-6 describe situations in which one should derive data in new rows, whether as entirely new parameters or as additional rows in existing parameters. In the subsections and examples that follow, there is some text that is in bold. The use of the bold font is to emphasize to the reader the importance of the concept or example that is being discussed.
为了保留BDS,必须对何时允许创建派生列设置约束。规则1说明了派生数据何时属于列。规则2-6描述了应该在新行中导出数据的情况,无论这些数据是全新的参数还是现有参数中的其他行。在随后的小节和示例中,一些文本以粗体显示。粗体字体的使用是要向读者强调所讨论的概念或示例的重要性。
4.2.1.1 Rule 1: A parameter-invariant function of AVAL and BASE on the same row that does not involve a transform of BASE should be added as a new column.
4.2.1.1 规则1:在同一行中添加一个不涉及BASE变换的AVAL和BASE的参数不变函数作为新列。
The three conditions of Rule 1 for when a function of AVAL and BASE should be added as a column (i.e., a function column) are:
规则1的三个条件,何时将AVAL和BASE函数添加为列(即,函数列):
1. the function is of AVAL and, optionally, BASE, on the same row; and
1.该函数在同一行中具有AVAL,并且可选地具有BASE;和
2. the function is parameter-invariant; and
2.该函数是参数不变的;和
3. the function does not involve a transform of BASE.
3.该函数不涉及BASE的转换。
The remainder of the discussion of this rule is devoted to explaining these conditions.
关于该规则的其余讨论专门用于解释这些条件。
PARAM uniquely describes the contents of AVAL or AVALC. Often, AVAL itself is not the value that is needed for analysis.
PARAM唯一描述AVAL或AVALC的内容。通常,AVAL本身不是分析所需的值。
For example, in a change from baseline analysis, it is the change from baseline CHG that is analyzed. The change from baseline column CHG should be created according to Rule 1 because it satisfies the three conditions:
例如,在基线分析的变化中,分析的是基线CHG的变化。从基线列CHG的更改应根据规则1创建,因为它满足以下三个条件:
1. CHG is derived from AVAL and BASE on the same row.
1.CHG源自同一行上的AVAL和BASE。
2. The same calculation applies on all rows in the dataset on which CHG is populated (the function CHG=AVAL-BASE does not vary according to PARAM). This second condition is known as the property of parameter-invariance; unless listed in Section 3, Standard ADaM Variables, a function of AVAL (and optionally BASE) may not be derived as a column if it is parameter-variant (i.e., is calculated differently for different parameters).
2.相同的计算适用于数据集中填充了CHG的所有行(函数CHG = AVAL-BASE不会根据PARAM改变)。第二个条件称为参数不变性;除非在第3节“标准ADaM变量”中列出,否则如果AVAL(以及可选的BASE)的函数是参数可变的(即,对于不同参数的计算方式不同),则不得将其派生为列。
3. In the function CHG=AVAL-BASE, BASE is not transformed.
3.在函数CHG = AVAL-BASE中,BASE不转换。
Table 4.2.1.1.1 illustrates the CHG column. Note that the producer elected not to populate CHG on the screening or run-in rows, as they are pre-baseline. The baseline flag column ABLFL identifies the row that was used to populate the BASE column.
表4.2.1.1.1说明了CHG列。请注意,生产者选择不填充CHG在筛选或磨合行中,因为它们是基线之前的值。基线标志列ABLFL标识用于填充BASE列的行。
Table 4.2.1.1.1 Illustration of Rule 1: Creation of a Column Containing a Same-Row Parameter-Invariant Function of AVAL and BASE
表4.2.1.1.1规则1的说明:创建包含AVAL和BASE的相同行参数不变函数的列
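| Row | PARAM | PARAMCD | AVISIT | ABLFL | AVAL | BASE | CHG |
|---|---|---|---|---|---|---|---|
| 1 | Weight (kg) | WEIGHT | Screening | | 99 | 100 | . |
| 2 | Weight (kg) | WEIGHT | Run-In | | 101 | 100 | . |
| 3 | Weight (kg) | WEIGHT | Baseline | Y | 100 | 100 | 0 |
| 4 | Weight (kg) | WEIGHT | Week 24 | | 94 | 100 | -6 |
| 5 | Weight (kg) | WEIGHT | Week 48 | | 92 | 100 | -8 |
| 6 | Weight (kg) | WEIGHT | Week 52 | | 95 | 100 | -5 |
| 7 | Pulse Rate (bpm) | PULSE | Screening | | 63 | 62 | . |
| 8 | Pulse Rate (bpm) | PULSE | Run-In | | 67 | 62 | . |
| 9 | Pulse Rate (bpm) | PULSE | Baseline | Y | 62 | 62 | 0 |
| 10 | Pulse Rate (bpm) | PULSE | Week 24 | | 66 | 62 | 4 |
| 11 | Pulse Rate (bpm) | PULSE | Week 48 | | 70 | 62 | 8 |
| 12 | Pulse Rate (bpm) | PULSE | Week 52 | | 64 | 62 | 2 |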
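As a minimal sketch of Rule 1, the Python below adds CHG as a column using the same-row, parameter-invariant function CHG = AVAL - BASE. The function name `add_chg` is hypothetical, and the sketch assumes the rows of each parameter are already in chronological order, so that "pre-baseline" simply means "before the row flagged by ABLFL". Leaving CHG empty on those rows mirrors the producer's choice in Table 4.2.1.1.1; it is not a requirement.

```python
def add_chg(rows):
    """Return copies of the rows with CHG = AVAL - BASE, left empty before baseline."""
    reached_baseline = set()  # parameters whose baseline row has been seen
    result = []
    for row in rows:
        paramcd = row["PARAMCD"]
        if row["ABLFL"] == "Y":
            reached_baseline.add(paramcd)
        # The same formula is applied whatever the parameter is (parameter-invariance).
        chg = row["AVAL"] - row["BASE"] if paramcd in reached_baseline else None
        result.append({**row, "CHG": chg})
    return result

columns = ("PARAMCD", "AVISIT", "ABLFL", "AVAL", "BASE")
data = [
    ("WEIGHT", "Screening", "", 99, 100),
    ("WEIGHT", "Run-In", "", 101, 100),
    ("WEIGHT", "Baseline", "Y", 100, 100),
    ("WEIGHT", "Week 24", "", 94, 100),
    ("WEIGHT", "Week 48", "", 92, 100),
    ("WEIGHT", "Week 52", "", 95, 100),
    ("PULSE", "Screening", "", 63, 62),
    ("PULSE", "Run-In", "", 67, 62),
    ("PULSE", "Baseline", "Y", 62, 62),
    ("PULSE", "Week 24", "", 66, 62),
    ("PULSE", "Week 48", "", 70, 62),
    ("PULSE", "Week 52", "", 64, 62),
]
rows = [dict(zip(columns, values)) for values in data]

result = add_chg(rows)
weight_chg = [r["CHG"] for r in result if r["PARAMCD"] == "WEIGHT"]
pulse_chg = [r["CHG"] for r in result if r["PARAMCD"] == "PULSE"]
print(weight_chg, pulse_chg)
```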
In contrast, consider the potential function column LOG10 = Log10(AVAL). This function satisfies all three conditions of Rule 1 and as such is allowed as a function column. However, LOG10BAS = Log10(BASE) and LOG10CHG = Log10(AVAL) – Log10(BASE) are not allowable columns as they involve a transform of BASE.
相反,考虑潜在功能列LOG10 = Log10(AVAL)。该函数满足规则1的所有三个条件,因此可以作为函数列使用。但是,不允许使用LOG10BAS = Log10(BASE)和LOG10CHG = Log10(AVAL)– Log10(BASE)列,因为它们涉及BASE的转换。
Therefore, if a change from baseline analysis of the Log10-transformed values is desired, so that columns for LOG10, baseline of LOG10, and change from baseline of LOG10 would all be needed for analysis and review, then the Log10 transformation should instead be created as a new parameter, so that the usual columns AVAL, BASE, and CHG can be used. This is because columns for baseline of LOG10 and change from baseline of LOG10 would not satisfy the conditions of Rule 1. Baseline of LOG10 violates the first condition, because it is not a function of AVAL on the same row (it does not vary with the AVAL of each row) and instead is a function only of AVAL on the baseline row. "Change from baseline of LOG10" = LOG10(AVAL) - LOG10(BASE) violates the third condition, because it contains the Log10 transform of BASE.
因此,如果需要对LOG10中的基线分析进行更改,则需要对LOG10的列,LOG10的基线以及对LOG10的基线进行更改以进行分析和检查,则应将Log10转换创建为新参数,因此可以使用常规列AVAL,BASE和CHG。这是因为用于LOG10基线和从LOG10基线变化的列将不满足规则1的条件。LOG10基线违反了第一个条件,因为它通常不是同一行上AVAL的函数(通常不会因AVAL),而仅是基线行上AVAL的函数。“从LOG10的基线更改” = LOG10(AVAL)-LOG10(BASE)违反了第三个条件,因为它包含BASE的Log10转换。
The intent is to use the standard columns as much as possible, to keep the structure as standard as possible, and avoid undue "horizontalization," while still permitting efficient use of function columns.
目的是尽可能使用标准列,以使结构尽可能保持标准,并避免过度的“水平化”,同时仍允许有效使用功能列。
Any function that satisfies the three conditions of Rule 1 is allowed as a column. If the function is listed in Section 3, Standard ADaM Variables, then the ADaM standard column name must be used just as CHG is used in Table 4.2.1.1.1.
满足规则1的三个条件的任何函数都可以作为一列。如果在第3节“标准ADaM变量”中列出了该函数,则必须使用ADaM标准列名,就像表4.2.1.1.1中使用CHG一样。
4.2.1.2 Rule 2: A transformation of AVAL that does not meet the conditions of Rule 1 should be added as a new parameter, and AVAL should contain the transformed value.
4.2.1.2 规则2:应添加不符合规则1条件的AVAL转换作为新参数,并且AVAL应包含转化值。
If the intention is to redefine AVAL, BASE, CHG, etc. in terms of a transform of AVAL, then a new parameter must be added in which PARAM describes the transform. The creation of a new parameter results, by definition, in the creation of a new set of rows.
如果要根据AVAL的转换重新定义AVAL,BASE,CHG等,则必须添加一个新参数,其中PARAM描述了转换。根据定义,创建新参数将导致创建一组新的行。
For example, as described in the discussion of Rule 1, in a change from baseline analysis of the logarithm of weight, AVAL should contain the log of weight, BASE should contain the baseline value of the log of weight, and CHG should contain the difference between the two. PARAM should contain a description of the transformed data contained in AVAL, e.g., "Log10 (Weight (kg))". In this way the ADaM standard accommodates an analysis of transformed data in the standard columns without creating a multiplicity of new special-purpose columns.
例如,如对规则1的讨论中所述,在对重量的对数进行基线分析的更改中,AVAL应该包含重量的对数,BASE应该包含重量的对数的基线值,CHG应该包含差值两者之间。PARAM应包含对AVAL中包含的转换数据的描述,例如“ Log10(重量(kg))”。这样,ADaM标准就可以在标准列中分析转换后的数据,而无需创建多个新的专用列。
In Table 4.2.1.2.1, the producer has chosen values of AVISITN that correspond to week number and which serve well for sorting and for plotting. VISITNUM is the SDTM visit number.
在表4.2.1.2.1中,生产者选择了与周数相对应的AVISITN值,这些值可以很好地用于排序和绘图。VISITNUM是SDTM访问号码。
Note that when source SDTM dataset variables, such as USUBJID, SUBJID, SITEID, VISIT, VISITNUM and --SEQ, are included in an ADaM dataset with their original SDTM variable names, their values must not be altered in any way.
请注意,当原始SDTM数据集变量(例如USUBJID,SUBJID,SITEID,VISIT,VISITNUM和--SEQ)以原始SDTM变量名称包含在ADaM数据集中时,不得以任何方式更改其值。
Table 4.2.1.2.1 Illustration of Rule 2: Creation of a New Parameter to Handle a Transformation
表4.2.1.2.1规则2的插图:创建用于处理转换的新参数
| Row | PARAM | PARAMCD | VISIT | AVISIT | AVISITN | VISITNUM | ABLFL | AVAL | BASE | CHG |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Weight (kg) | WEIGHT | Visit -1 | Screening | -4 | 1 | | 99 | 100 | . |
| 2 | Weight (kg) | WEIGHT | Visit 0 | Run-In | -2 | 2 | | 101 | 100 | . |
| 3 | Weight (kg) | WEIGHT | Visit 1 | Baseline | 0 | 3 | Y | 100 | 100 | 0 |
| 4 | Weight (kg) | WEIGHT | Visit 12 | Week 24 | 24 | 4 | | 94 | 100 | -6 |
| 5 | Weight (kg) | WEIGHT | Visit 24 | Week 48 | 48 | 5 | | 92 | 100 | -8 |
| 6 | Weight (kg) | WEIGHT | Visit 26 | Week 52 | 52 | 6 | | 95 | 100 | -5 |
| 7 | Log10(Weight (kg)) | L10WT | Visit -1 | Screening | -4 | 1 | | 1.9956 | 2 | . |
| 8 | Log10(Weight (kg)) | L10WT | Visit 0 | Run-In | -2 | 2 | | 2.0043 | 2 | . |
| 9 | Log10(Weight (kg)) | L10WT | Visit 1 | Baseline | 0 | 3 | Y | 2 | 2 | 0 |
| 10 | Log10(Weight (kg)) | L10WT | Visit 12 | Week 24 | 24 | 4 | | 1.9731 | 2 | -0.0269 |
| 11 | Log10(Weight (kg)) | L10WT | Visit 24 | Week 48 | 48 | 5 | | 1.9638 | 2 | -0.0362 |
| 12 | Log10(Weight (kg)) | L10WT | Visit 26 | Week 52 | 52 | 6 | | 1.9777 | 2 | -0.0223 |
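The following Python sketch shows one way the Log10 rows of Table 4.2.1.2.1 could be produced under Rule 2: the transformed values become a new parameter, and AVAL, BASE, and CHG are recomputed within it. The function name `log10_parameter` is hypothetical, and treating rows with AVISITN at or after the baseline AVISITN as eligible for CHG is an assumption matching the table. SDTM variables copied with their original names (here VISITNUM) are carried over unchanged.

```python
import math

def log10_parameter(rows, source_paramcd, new_param, new_paramcd):
    """Derive a new parameter whose AVAL is log10 of the source parameter's AVAL."""
    source = [r for r in rows if r["PARAMCD"] == source_paramcd]
    baseline = next(r for r in source if r["ABLFL"] == "Y")
    base = math.log10(baseline["AVAL"])  # BASE is the transformed baseline AVAL
    derived = []
    for r in source:
        aval = math.log10(r["AVAL"])
        on_or_after_baseline = r["AVISITN"] >= baseline["AVISITN"]
        derived.append({
            **r,  # VISITNUM and other copied variables stay as they are
            "PARAM": new_param,
            "PARAMCD": new_paramcd,
            "AVAL": aval,
            "BASE": base,
            "CHG": aval - base if on_or_after_baseline else None,
        })
    return derived

columns = ("PARAM", "PARAMCD", "AVISIT", "AVISITN", "VISITNUM", "ABLFL", "AVAL")
data = [
    ("Weight (kg)", "WEIGHT", "Screening", -4, 1, "", 99),
    ("Weight (kg)", "WEIGHT", "Run-In", -2, 2, "", 101),
    ("Weight (kg)", "WEIGHT", "Baseline", 0, 3, "Y", 100),
    ("Weight (kg)", "WEIGHT", "Week 24", 24, 4, "", 94),
    ("Weight (kg)", "WEIGHT", "Week 48", 48, 5, "", 92),
    ("Weight (kg)", "WEIGHT", "Week 52", 52, 6, "", 95),
]
weight_rows = [dict(zip(columns, values)) for values in data]

log_rows = log10_parameter(weight_rows, "WEIGHT", "Log10(Weight (kg))", "L10WT")
log_aval = [round(r["AVAL"], 4) for r in log_rows]
log_chg = [None if r["CHG"] is None else round(r["CHG"], 4) for r in log_rows]
print(log_aval)
print(log_chg)
```

Rounding to four decimals is done only for display; the sketch keeps full precision in the derived rows.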
A related application of Rule 2 is in the case where it is necessary to support analysis and reporting in two different systems of units. In SDTM Findings domains such as LB, QS, EG, and so on, the --STRESN column is the only numeric result column, and is also the only standardized numeric result column. The --ORRES column contains a character representation of the collected result, in the collected units specified in the --ORRESU column. The --ORRES column is not standardized. So for example, if data are typically collected in conventional units, SDTM cannot accommodate standardized data in both conventional units and the International System of Units (SI). In SDTM, for any given --TEST, a producer can standardize in one system of units but not two. If one wishes to be able to analyze standardized results in both conventional units and in SI units, a transform in an ADaM dataset is needed. In each such case, a new parameter must be created in order to accommodate standardized data in the other system of units.
如果需要在两个不同的单位系统中支持分析和报告,则规则2的相关适用。在LB,QS,EG等SDTM Findings域中,-STRESN列是唯一的数字结果列,也是唯一的标准化数字结果列。--ORRES列以--ORRESU列中指定的收集单位包含收集结果的字符表示。--ORRES列未标准化。因此,例如,如果通常以常规单位收集数据,则SDTM无法同时容纳常规单位和国际单位制(SI)中的标准化数据。在SDTM中,对于任何给定的--TEST,生产者都可以在一个单位制中进行标准化,但不能在两个单位制中进行标准化。如果希望能够以常规单位和SI单位分析标准化结果,则需要在ADaM数据集中进行转换。在每种情况下,都必须创建一个新参数,以便在其他单位系统中容纳标准化数据。
The description in the PARAM column must contain the units, as well as any other information such as location and specimen type that is needed to ensure that PARAM uniquely describes what is in AVAL, and differentiates between parameters as needed. PARAM cannot be the same for different units.
PARAM列中的描述必须包含单位,以及任何其他信息,如位置和样本类型,以确保PARAM唯一描述AVAL中的内容,并根据需要区分参数。不同单位的PARAM不能相同。
Table 4.2.1.2.2 shows an example of data supporting analyses of low-density lipoprotein (LDL) cholesterol in both conventional units (mg/dL) and SI units (mmol/L). In this study, SDTM cholesterol data were standardized in mg/dL. In the ADaM dataset, two records, one for each system of units, were generated from each original SDTM record. As described in Section 4.10.5, Copying Values onto a New Record, as a general rule, when a record is derived from a single record in the dataset, retain on the derived record any variable values from the original record that do not change and that make sense in the context of the new record (e.g., --SEQ, VISIT, VISITNUM, --TPT, covariates, etc.).
表4.2.1.2.2显示了一个数据示例,该数据支持以常规单位(mg / dL)和SI单位(mmol / L)的低密度脂蛋白(LDL)胆固醇分析。在这项研究中,SDTM胆固醇数据以mg / dL为标准。在ADaM数据集中,从每个原始SDTM记录中生成了两个记录,每个单位系统对应一个记录。如第4.10.5节“将值复制到新记录中”所述,作为一般规则,当从数据集中的单个记录派生一条记录时,将原始记录中任何不变的变量值保留在派生记录上,以及在新记录的上下文中有意义的(例如–SEQ,VISIT,VISITNUM,-TPT,协变量等)。
Table 4.2.1.2.2 Illustration of Rule 2: Creation of a New Parameter to Handle a Second System of Units
表4.2.1.2.2规则2的图示:创建新参数以处理第二个单位系统
| Row | PARAM | PARAMCD | AVISIT | AVISITN | VISITNUM | LBSEQ | ABLFL | AVAL | BASE | CHG | PCHG |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | LDL Cholesterol (mg/dL) | LDL | Screening | -2 | 1 | 2829 | | 206.3 | 213.4 | | |
| 2 | LDL Cholesterol (mg/dL) | LDL | Run-In | -1 | 2 | 2830 | | 202.1 | 213.4 | | |
| 3 | LDL Cholesterol (mg/dL) | LDL | Week 0 | 0 | 3 | 2831 | Y | 213.4 | 213.4 | 0.0 | 0.00 |
| 4 | LDL Cholesterol (mg/dL) | LDL | Week 5 | 5 | 4 | 2832 | | 107.4 | 213.4 | -106.0 | -49.67 |
| 5 | LDL Cholesterol (mg/dL) | LDL | Week 11 | 11 | 5 | 2833 | | 90.2 | 213.4 | -123.2 | -57.73 |
| 6 | LDL Cholesterol (mg/dL) | LDL | Week 17 | 17 | 6 | 2834 | | 96.8 | 213.4 | -116.6 | -54.64 |
| 7 | LDL Cholesterol (mg/dL) | LDL | Week 23 | 23 | 7 | 2835 | | 104.0 | 213.4 | -109.4 | -51.27 |
| 8 | LDL Cholesterol (mmol/L) | LDLT | Screening | -2 | 1 | 2829 | | 5.3349 | 5.5185 | | |
| 9 | LDL Cholesterol (mmol/L) | LDLT | Run-In | -1 | 2 | 2830 | | 5.2263 | 5.5185 | | |
| 10 | LDL Cholesterol (mmol/L) | LDLT | Week 0 | 0 | 3 | 2831 | Y | 5.5185 | 5.5185 | 0.0000 | 0.00 |
| 11 | LDL Cholesterol (mmol/L) | LDLT | Week 5 | 5 | 4 | 2832 | | 2.7773 | 5.5185 | -2.7412 | -49.67 |
| 12 | LDL Cholesterol (mmol/L) | LDLT | Week 11 | 11 | 5 | 2833 | | 2.3326 | 5.5185 | -3.1859 | -57.73 |
| 13 | LDL Cholesterol (mmol/L) | LDLT | Week 17 | 17 | 6 | 2834 | | 2.5032 | 5.5185 | -3.0153 | -54.64 |
| 14 | LDL Cholesterol (mmol/L) | LDLT | Week 23 | 23 | 7 | 2835 | | 2.6894 | 5.5185 | -2.8291 | -51.27 |
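A second-unit parameter such as the one in Table 4.2.1.2.2 could be derived along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, and the conversion factor 0.02586 (mg/dL to mmol/L for cholesterol) is a commonly used value that the guide itself does not state, so values computed here may differ from the table in the last displayed digit. The example also shows that PCHG, being a ratio, is the same in both systems of units, while AVAL, BASE, and CHG are not.

```python
MGDL_TO_MMOLL = 0.02586  # assumed cholesterol conversion factor, not taken from the guide

def derive_parameter(avals, baseline_avisitn, param, paramcd, factor=1.0):
    """Build rows for one parameter; avals maps AVISITN to AVAL in the collected unit."""
    base = avals[baseline_avisitn] * factor
    rows = []
    for avisitn in sorted(avals):
        aval = avals[avisitn] * factor
        on_or_after_baseline = avisitn >= baseline_avisitn
        chg = aval - base if on_or_after_baseline else None
        pchg = 100 * chg / base if on_or_after_baseline else None
        rows.append({
            "PARAM": param,  # PARAM carries the unit, so it differs between the two sets
            "PARAMCD": paramcd,
            "AVISITN": avisitn,
            "AVAL": aval,
            "BASE": base,
            "CHG": chg,
            "PCHG": pchg,
        })
    return rows

# AVISITN -> LDL cholesterol in mg/dL, from rows 1-7 of Table 4.2.1.2.2.
ldl_mgdl = {-2: 206.3, -1: 202.1, 0: 213.4, 5: 107.4, 11: 90.2, 17: 96.8, 23: 104.0}

conventional = derive_parameter(ldl_mgdl, 0, "LDL Cholesterol (mg/dL)", "LDL")
si = derive_parameter(ldl_mgdl, 0, "LDL Cholesterol (mmol/L)", "LDLT", MGDL_TO_MMOLL)

for conv_row, si_row in zip(conventional, si):
    print(conv_row["AVISITN"], round(conv_row["AVAL"], 1), round(si_row["AVAL"], 4))
```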
4.2.1.3 Rule 3: A function of one or more rows within the same parameter for the purpose of creating an analysis timepoint should be added as a new row for the same parameter.
4.2.1.3 规则3:为了创建分析时间点,应将同一参数中一个或多个行的功能添加为同一参数的新行。
For analysis purposes, there is often a need to impute missing data, or to create a derived conceptual timepoint. Such derivations should result in the creation of new derived records within the same parameter.
为了进行分析,通常需要估算缺失的数据或创建派生的概念性时间点。这样的派生应导致在同一参数内创建新的派生记录。
As described in Section 4.10.5, Copying Values onto a New Record, as a general rule, when a record is derived from a single record in the dataset, retain on the derived record any variable values from the original record that do not change and that make sense in the context of the new record (e.g., --SEQ, VISIT, VISITNUM, --TPT, covariates). When a record is derived from multiple records, retain on the derived record all variable values that are constant across the original records, that do not change, and which make sense in the context of the new record. Note that there are situations when retention of values from an original record or records would make no sense on the derived record; in such cases, do not retain those values.
如第4.10.5节“将值复制到新记录中”所述,作为一般规则,当从数据集中的一条记录派生一条记录时,将原始记录中任何不变的变量值保留在派生记录上,以及在新记录的上下文中有意义的代码(例如--SEQ,VISIT,VISITNUM,-TPT,协变量)。当从多个记录派生一个记录时,请在派生记录上保留所有在原始记录中不变的,不变的,在新记录的上下文中有意义的变量值。请注意,在某些情况下,保留原始记录或多个记录中的值在派生记录上毫无意义;在这种情况下,请勿保留这些值。
For example, suppose that the analysis endpoint value is defined as the average of the last two available post-baseline values. In this case, a new row should be added, with a corresponding description in AVISIT, and the DTYPE (derivation type) column should contain a description on that row such as "AVERAGE" to indicate both that the row was derived, and also the derivation method. The metadata associated with AVISIT=Endpoint should adequately describe which records are used in the definition of the average. Note that even though the set of records for the log transformation of weight are derived, DTYPE is not populated for every row. DTYPE should be used to indicate rows that are derived within a given value of PARAM and is not to be used as an indication of whether the record exists in SDTM.
例如,假设将分析终点值定义为最后两个可用的基线后值的平均值。在这种情况下,应添加新行,并在AVISIT中添加相应的描述,并且DTYPE(派生类型)列应包含该行的描述,例如“ AVERAGE”,以表明该行是派生的,并且推导方法。与AVISIT = Endpoint关联的元数据应充分描述在平均值定义中使用了哪些记录。请注意,即使导出了用于权重的对数转换的记录集,也不会为每一行填充DTYPE。DTYPE应该用于指示在给定的PARAM值内派生的行,而不能用作指示SDTM中是否存在记录的指示。
In Table 4.2.1.3.1, VISITNUM is not retained on the derived record because VISITNUM is not constant on the precursor records, and also makes no sense in the derived analysis timepoint, which is an average that in most cases will span multiple visits. Similarly VSSEQ is not constant across multiple original records, so VSSEQ is not populated on the derived record. PARAM and BASE should be retained because they are constant on the precursor records and make sense in the context of the new record. For the new record, AVAL and change are recalculated, and AVISIT, AVISITN, and DTYPE are populated appropriately. Note that the metadata will specify the algorithm used for the calculation (in this example, the rows being averaged).
在表4.2.1.3.1中,VISITNUM不会保留在派生记录上,因为VISITNUM在前体记录上不是恒定的,并且在派生分析时间点也没有意义,这是大多数情况下平均多次访问的平均值。同样,VSSEQ在多个原始记录中不是恒定的,因此VSSEQ不会填充在派生记录上。PARAM和BASE应该保留,因为它们在前体记录中是恒定的,并且在新记录的上下文。对于新记录,将重新计算AVAL和变更,并适当填充AVISIT,AVISITN和DTYPE。请注意,元数据将指定用于计算的算法(在此示例中,将对行进行平均)。
AVISIT and AVISITN are defined by the producer. AVISIT and AVISITN are not necessarily defined the same for the individual parameters within a dataset. The definition and derivation of the values of AVISIT, and any dependence on parameter, should be described in metadata. In this example, the producer decided to set AVISITN to 9999 on the derived AVISIT=Endpoint records.
AVISIT和AVISITN由生产者定义。对于数据集中的各个参数,AVISIT和AVISITN不一定定义相同。AVISIT值的定义和派生以及对参数的任何依赖都应在元数据中描述。在此示例中,生产者决定在派生的AVISIT = Endpoint记录上将AVISITN设置为9999。
Table 4.2.1.3.1 Illustration of Rule 3: Creation of a New Row to Handle a Derived Analysis Timepoint
表4.2.1.3.1规则3的插图:创建新行以处理派生的分析时间点
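| Row | PARAM | AVISIT | AVISITN | VISITNUM | VSSEQ | ABLFL | AVAL | BASE | CHG | DTYPE |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Weight (kg) | Screening | -4 | 1 | 1164 | | 99 | 100 | | |
| 2 | Weight (kg) | Run-In | -2 | 2 | 1165 | | 101 | 100 | | |
| 3 | Weight (kg) | Baseline | 0 | 3 | 1166 | Y | 100 | 100 | 0 | |
| 4 | Weight (kg) | Week 24 | 24 | 4 | 1167 | | 94 | 100 | -6 | |
| 5 | Weight (kg) | Week 48 | 48 | 5 | 1168 | | 92 | 100 | -8 | |
| 6 | Weight (kg) | Week 52 | 52 | 6 | 1169 | | 95 | 100 | -5 | |
| 7 | Weight (kg) | Endpoint | 9999 | | | | 93.5 | 100 | -6.5 | AVERAGE |
| 8 | Log10(Weight (kg)) | Screening | -4 | 1 | 1164 | | 1.9956 | 2 | | |
| 9 | Log10(Weight (kg)) | Run-In | -2 | 2 | 1165 | | 2.0043 | 2 | | |
| 10 | Log10(Weight (kg)) | Baseline | 0 | 3 | 1166 | Y | 2 | 2 | 0 | |
| 11 | Log10(Weight (kg)) | Week 24 | 24 | 4 | 1167 | | 1.9731 | 2 | -0.0269 | |
| 12 | Log10(Weight (kg)) | Week 48 | 48 | 5 | 1168 | | 1.9638 | 2 | -0.0362 | |
| 13 | Log10(Weight (kg)) | Week 52 | 52 | 6 | 1169 | | 1.9777 | 2 | -0.0223 | |
| 14 | Log10(Weight (kg)) | Endpoint | 9999 | | | | 1.9708 | 2 | -0.0292 | AVERAGE |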
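The derivation of the Endpoint row for Weight (kg) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `derive_endpoint_row` is hypothetical, rows are plain dictionaries, and post-baseline records are identified by AVISITN > 0, which holds for this example but is not a general rule.

```python
from statistics import mean

# Weight (kg) rows for one subject, copied from rows 1-6 of Table 4.2.1.3.1.
weight_rows = [
    {"AVISIT": "Screening", "AVISITN": -4, "VISITNUM": 1, "VSSEQ": 1164, "AVAL": 99.0, "BASE": 100.0},
    {"AVISIT": "Run-In", "AVISITN": -2, "VISITNUM": 2, "VSSEQ": 1165, "AVAL": 101.0, "BASE": 100.0},
    {"AVISIT": "Baseline", "AVISITN": 0, "VISITNUM": 3, "VSSEQ": 1166, "AVAL": 100.0, "BASE": 100.0},
    {"AVISIT": "Week 24", "AVISITN": 24, "VISITNUM": 4, "VSSEQ": 1167, "AVAL": 94.0, "BASE": 100.0},
    {"AVISIT": "Week 48", "AVISITN": 48, "VISITNUM": 5, "VSSEQ": 1168, "AVAL": 92.0, "BASE": 100.0},
    {"AVISIT": "Week 52", "AVISITN": 52, "VISITNUM": 6, "VSSEQ": 1169, "AVAL": 95.0, "BASE": 100.0},
]


def derive_endpoint_row(rows, param, n_last=2, endpoint_avisitn=9999):
    """Build an AVISIT=Endpoint row from the last n_last post-baseline rows."""
    # Assumption: post-baseline rows are those with AVISITN > 0.
    post = sorted((r for r in rows if r["AVISITN"] > 0), key=lambda r: r["AVISITN"])
    used = post[-n_last:]
    if not used:
        return None  # no post-baseline data, so no endpoint row is derived
    aval = mean(r["AVAL"] for r in used)
    base = used[0]["BASE"]  # BASE is constant on the precursor rows
    return {
        "PARAM": param,
        "AVISIT": "Endpoint",
        "AVISITN": endpoint_avisitn,  # producer-defined convention
        "VISITNUM": None,  # not constant across the averaged rows
        "VSSEQ": None,     # likewise not constant
        "AVAL": aval,
        "BASE": base,
        "CHG": aval - base,
        "DTYPE": "AVERAGE",
    }


endpoint = derive_endpoint_row(weight_rows, "Weight (kg)")
print(endpoint["AVAL"], endpoint["CHG"])
```

The same function applied to the Log10(Weight (kg)) rows would give the second derived row; in both cases the exact selection rule belongs in the metadata for AVISIT=Endpoint.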
An extension of Rule 3 is necessary in the case where there is record-level population flagging.
在存在记录级人口标记的情况下,必须扩展规则3。
For example, assume the SAP states that if the subject is off drug for 7 days prior to a visit, the measurement collected at that visit is not included in the per-protocol analysis. Then, for some subjects, the last two available values may be different for Intent-to-Treat and for Per-Protocol analyses, so that the calculated endpoint averages would be different. For such subjects, two distinct derived endpoint rows would be needed, with the appropriate row for each analysis indicated by the record-level population flags ITTRFL and PPROTRFL.
例如,假设SAP指出,如果受试者在就诊前7天禁药,则按协议分析中不包括该访视收集的测量值。然后,对于某些主题,对于Intent-to-Treat和Per-Protocol分析,最后两个可用值可能会有所不同,因此计算得出的端点平均值会有所不同。对于此类受试者,将需要两个不同的派生端点行,每个分析的适当行由记录级人口标志ITTRFL和PPROTRFL指示。
In Table 4.2.1.3.2, the analyzed endpoint value varies according to the population. For example, for PARAM=Weight (kg), the last two available ITT values are 92 and 95, whose average is 93.5, whereas the last two Per-Protocol values are 94 and 92, whose average is 93. That is why two derived Endpoint rows are required for this subject. For other subjects, the ITT and Per-Protocol data that are input to the Endpoint average may be the same; in that case, only one Endpoint record would be needed, on which ITTRFL and PPROTRFL would both be set to Y. Values of AVISIT and AVISITN are producer-controlled. As in the example in Table 4.2.1.3.1, the producer decided to set AVISITN to 9999 on the derived AVISIT=Endpoint records. Note that the metadata will specify the algorithm used for the calculation (in this example, which rows are averaged).
在表4.2.1.3.2中,分析的终点值根据总体而有所不同。例如,对于PARAM = Weight(kg),最后两个可用的ITT值是92和95,平均值是93.5;而最后两个Per-Protocol值分别是94和92,平均值是93。这就是为什么此主题需要两个派生的Endpoint行的原因。对于其他主题,输入到端点平均值的ITT和按协议数据可能相同;在这种情况下,只需要一个Endpoint记录,在该记录上ITTRFL和PPROTRFL都将设置为Y。AVISIT和AVISITN的值由生产者控制。如表4.2.1.4中的示例所示,生产者决定在派生的AVISIT = Endpoint记录上将AVISITN设置为9999。请注意,元数据将指定用于计算的算法(在此示例中,将对行进行平均)。
Table 4.2.1.3.2 Illustration of Rule 3: Creation of New Rows to Handle a Derived Analysis Timepoint When There is Record-Level Population Flagging
表4.2.1.3.2规则3的说明:存在记录级总体标记时,创建新行以处理派生的分析时间点
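| Row | PARAM | AVISIT | AVISITN | VISITNUM | VSSEQ | ABLFL | AVAL | BASE | CHG | DTYPE | ITTRFL | PPROTRFL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Weight (kg) | Screening | -4 | 1 | 1164 | | 99 | 100 | | | Y | Y |
| 2 | Weight (kg) | Run-In | -2 | 2 | 1165 | | 101 | 100 | | | Y | Y |
| 3 | Weight (kg) | Baseline | 0 | 3 | 1166 | Y | 100 | 100 | 0 | | Y | Y |
| 4 | Weight (kg) | Week 24 | 24 | 4 | 1167 | | 94 | 100 | -6 | | Y | Y |
| 5 | Weight (kg) | Week 48 | 48 | 5 | 1168 | | 92 | 100 | -8 | | Y | Y |
| 6 | Weight (kg) | Week 52 | 52 | 6 | 1169 | | 95 | 100 | -5 | | Y | |
| 7 | Weight (kg) | Endpoint | 9999 | | | | 93.5 | 100 | -6.5 | AVERAGE | Y | |
| 8 | Weight (kg) | Endpoint | 9999 | | | | 93 | 100 | -7 | AVERAGE | | Y |
| 9 | Log10 (Weight (kg)) | Screening | -4 | 1 | 1164 | | 1.9956 | 2 | | | Y | Y |
| 10 | Log10 (Weight (kg)) | Run-In | -2 | 2 | 1165 | | 2.0043 | 2 | | | Y | Y |
| 11 | Log10 (Weight (kg)) | Baseline | 0 | 3 | 1166 | Y | 2 | 2 | 0 | | Y | Y |
| 12 | Log10 (Weight (kg)) | Week 24 | 24 | 4 | 1167 | | 1.9731 | 2 | -0.0269 | | Y | Y |
| 13 | Log10 (Weight (kg)) | Week 48 | 48 | 5 | 1168 | | 1.9638 | 2 | -0.0362 | | Y | Y |
| 14 | Log10 (Weight (kg)) | Week 52 | 52 | 6 | 1169 | | 1.9777 | 2 | -0.0223 | | Y | |
| 15 | Log10 (Weight (kg)) | Endpoint | 9999 | | | | 1.9708 | 2 | -0.0292 | AVERAGE | Y | |
| 16 | Log10 (Weight (kg)) | Endpoint | 9999 | | | | 1.9685 | 2 | -0.0315 | AVERAGE | | Y |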
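The population-specific case can be sketched as follows. The helper names (`last_two`, `derive_endpoints`) are hypothetical, and the sketch assumes each population has at least one eligible post-baseline record; it emits a single Endpoint row flagged for both populations when the input records coincide, and two rows otherwise.

```python
from statistics import mean

# Post-baseline Weight (kg) records from rows 4-6 of Table 4.2.1.3.2.
post_baseline = [
    {"AVISITN": 24, "AVAL": 94.0, "ITTRFL": "Y", "PPROTRFL": "Y"},
    {"AVISITN": 48, "AVAL": 92.0, "ITTRFL": "Y", "PPROTRFL": "Y"},
    {"AVISITN": 52, "AVAL": 95.0, "ITTRFL": "Y", "PPROTRFL": ""},
]
BASE = 100.0


def last_two(rows, flag):
    """Last two records (by AVISITN) that carry the given population flag."""
    eligible = sorted((r for r in rows if r[flag] == "Y"), key=lambda r: r["AVISITN"])
    return eligible[-2:]


def derive_endpoints(rows, base):
    """Return one Endpoint row if ITT and PP inputs coincide, else two."""
    def make(used, ittrfl, pprotrfl):
        aval = mean(r["AVAL"] for r in used)
        return {"AVISIT": "Endpoint", "AVISITN": 9999, "AVAL": aval, "BASE": base,
                "CHG": aval - base, "DTYPE": "AVERAGE",
                "ITTRFL": ittrfl, "PPROTRFL": pprotrfl}

    itt = last_two(rows, "ITTRFL")
    pp = last_two(rows, "PPROTRFL")
    if itt == pp:
        return [make(itt, "Y", "Y")]
    return [make(itt, "Y", ""), make(pp, "", "Y")]


endpoints = derive_endpoints(post_baseline, BASE)
for row in endpoints:
    print(row["AVAL"], row["CHG"], row["ITTRFL"], row["PPROTRFL"])
```

An analysis then selects its Endpoint row by combining AVISIT with the relevant record-level flag, for example AVISIT=Endpoint and PPROTRFL=Y for the Per-Protocol analysis.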
In Table 4.2.1.3.3, missing post-baseline values were imputed by last observation carried forward, and also by worst observation carried forward.
在表4.2.1.3.3中,遗漏的基线后值由结转的最后一个观察值和结转的最差观察值估算。
In this study, at Week 8, there was a scheduled Visit 6 (VISITNUM 6). At that visit, blood pressure was collected. However, for this subject, either there was no visit 6, or there was a visit 6, but no data on blood pressure were collected. The SAP says that missing post-baseline data should be imputed (derived) by two methods: last observation carried forward (LOCF) and worst observation carried forward (WOCF).
在这项研究中,在第8周,安排了第6次访问(VISITNUM 6)。在那次访问中,收集了血压。但是,对于该受试者,要么没有访问6,要么没有访问6,但是没有收集到血压数据。SAP表示,应该通过两种方法来估算(推导)丢失的基线后数据:上次结转观测值(LOCF)和最差的结转观测值(WOCF)。
For LOCF analysis, the missing Week 8 result was imputed by carrying forward the most recent prior available post-baseline value, which is the VISITNUM 5 value. That the Week 8 value was imputed is indicated by LOCF in the derivation type (DTYPE) column.
对于LOCF分析,通过结转最新的先前可用的基线后值(即VISITNUM 5值)来估算缺失的第8周结果。LOCF在派生类型(DTYPE)列中指示估算了第8周的值。
For WOCF analysis, even though the unscheduled VISITNUM 4.1 value was not chosen to represent the Week 2 analysis timepoint, it was used to impute the missing Week 8 timepoint because it was the worst post-baseline result up to that point.
对于WOCF分析,即使未选择未计划的VISITNUM 4.1值来表示第2周的分析时间点,它也被用来估算缺少的第8周的时间点,因为这是迄今为止该点最差的基线后结果。
The exact algorithms employed in the record derivation methods (LOCF and WOCF in this case) must be indicated in the metadata for DTYPE.
记录派生方法(在这种情况下为LOCF和WOCF)中使用的确切算法必须在DTYPE的元数据中指出。
Traceability is enhanced by the addition of the SDTM VISITNUM and --SEQ columns. The combination of USUBJID and VSSEQ provides a link to the exact input record in the SDTM VS dataset. On the derived LOCF and WOCF rows, VISITNUM and VSSEQ provide clarity about where the value came from.
通过添加SDTM VISITNUM和--SEQ列,可追溯性得到增强。USUBJID和VSSEQ的组合提供了指向SDTM VS数据集中确切输入记录的链接。在派生的LOCF和WOCF行上,VISITNUM和VSSEQ可清楚说明值的来源。
There are several other concepts presented in this example. Analysis relative day (ADY) in this protocol is defined relative to date of first dose. In many but not all protocols, ADY would equal the value of the SDTM --DY variable (or --STDY for some kinds of data). The data presented here illustrate that this particular subject did not take drug until two days after randomization, so the value of ADY is -2 at the randomization visit, Visit 3 (VISITNUM 3). As is the case for SDTM study day, there is no day 0 for ADY.
此示例中还介绍了其他几个概念。该协议中的分析相对天数(ADY)是相对于首次给药日期定义的。在许多但不是全部协议中,ADY等于SDTM --DY变量的值(对于某些类型的数据为–STDY)。此处提供的数据表明,该特定受试者直到随机分组后两天才开始服药,因此在随机分组访视(访视3(VISITNUM 3))时ADY的值为-2。与SDTM学习日一样,ADY没有第0天。
In this protocol, if there are multiple datapoints within an analysis time window, the value that is observed closest to a pre-specified target planned relative day is the value that is chosen to represent the analysis timepoint. For this study and parameter, AWTARGET=VISITDY (Planned Study Day) from SDTM, and ADY=VSDY. AWTDIFF is the absolute value of ADY - AWTARGET, adjusted for the fact that there is no day 0 (so that if ADY and AWTARGET have different signs, then AWTDIFF=|ADY - AWTARGET| - 1).
在此协议中,如果分析时间窗口内有多个数据点,则最接近预先指定的目标计划相对日观察到的值就是选择的代表分析时间点的值。对于此研究和参数,来自SDTM的AWTARGET = VISITDY(计划学习日),而ADY = VSDY。AWTDIFF是ADY-AWTARGET的绝对值,已针对没有第0天的事实进行了调整(因此,如果ADY和AWTARGET具有不同的符号,则AWTDIFF = | ADY-AWTARGET |-1)。
For AVISIT=Week 2, there were two values observed, at study days 13 and 17 (rows 4 and 5). Day 13 is closer to the target, day 14. So the day 13 record (row 4) is chosen for analysis, as denoted by the analysis flag ANL01FL=Y.
对于AVISIT =第2周,在研究的第13天和第17天(第4行和第5行)观察到两个值。第13天更接近目标,即第14天。因此,选择第13天的记录(第4行)进行分析,如分析标记ANL01FL = Y所示。
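The AWTDIFF rule and the closest-to-target selection described above can be sketched as follows. The function names are hypothetical, and the tie-breaking rule (earlier ADY wins) is an assumption made for the sketch; the text does not specify one, so a real study would take it from the SAP.

```python
from collections import defaultdict


def awtdiff(ady, awtarget):
    """Absolute distance in days between ADY and AWTARGET when there is no day 0."""
    diff = abs(ady - awtarget)
    # Crossing from negative to positive days skips the nonexistent day 0.
    if (ady < 0) != (awtarget < 0):
        diff -= 1
    return diff


def flag_closest(rows):
    """Set AWTDIFF on every row and ANL01FL=Y on the closest row per AVISIT."""
    by_visit = defaultdict(list)
    for row in rows:
        row["AWTDIFF"] = awtdiff(row["ADY"], row["AWTARGET"])
        row["ANL01FL"] = ""
        by_visit[row["AVISIT"]].append(row)
    for group in by_visit.values():
        # Assumption: ties are broken in favour of the earlier ADY.
        best = min(group, key=lambda r: (r["AWTDIFF"], r["ADY"]))
        best["ANL01FL"] = "Y"
    return rows


# The two Week 2 candidates for this subject (rows 4 and 5 of Table 4.2.1.3.3).
week2 = flag_closest([
    {"AVISIT": "Week 2", "VISITNUM": 4, "ADY": 13, "AWTARGET": 14},
    {"AVISIT": "Week 2", "VISITNUM": 4.1, "ADY": 17, "AWTARGET": 14},
])
for row in week2:
    print(row["VISITNUM"], row["AWTDIFF"], row["ANL01FL"])
```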
AVISIT by itself functions as a description of an analysis time window. AVISIT, DTYPE, and ANL01FL are all needed to identify the records to be used in a given analysis.
AVISIT本身就是对分析时间窗口的描述。都需要AVISIT,DTYPE和ANL01FL来标识要在给定分析中使用的记录。
On the derived AVISIT=Week 8 records, AWTARGET was set to the target for Week 8, and AWTDIFF was calculated accordingly. It did not make sense to retain the values of AWTARGET and AWTDIFF from the original records.
在派生的AVISIT =第8周记录上,将AWTARGET设置为第8周的目标,并相应地计算了AWTDIFF。从原始记录中保留AWTARGET和AWTDIFF的值是没有意义的。
Table 4.2.1.3.3 Illustration of Rule 3: Creation of New Rows to Handle Imputation of Missing Values by LOCF and WOCF
表4.2.1.3.3规则3的插图:LOCF和WOCF创建新行以处理缺失值的估算
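| Row | PARAM | AVISIT | AVISITN | VISITNUM | VSSEQ | ABLFL | AVAL | BASE | CHG | DTYPE | ADY | AWTARGET | AWTDIFF | ANL01FL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Systolic BP (mm Hg) | Screening | -4 | 1 | 3821 | | 120 | 114 | . | | -30 | -28 | 2 | Y |
| 2 | Systolic BP (mm Hg) | Run-In | -2 | 2 | 3822 | | 116 | 114 | . | | -16 | -14 | 2 | Y |
| 3 | Systolic BP (mm Hg) | Week 0 | 0 | 3 | 3823 | Y | 114 | 114 | 0 | | -2 | 1 | 2 | Y |
| 4 | Systolic BP (mm Hg) | Week 2 | 2 | 4 | 3824 | | 118 | 114 | 4 | | 13 | 14 | 1 | Y |
| 5 | Systolic BP (mm Hg) | Week 2 | 2 | 4.1 | 3825 | | 126 | 114 | 12 | | 17 | 14 | 3 | |
| 6 | Systolic BP (mm Hg) | Week 4 | 4 | 5 | 3826 | | 122 | 114 | 8 | | 23 | 28 | 5 | Y |
| 7 | Systolic BP (mm Hg) | Week 8 | 8 | 5 | 3826 | | 122 | 114 | 8 | LOCF | 23 | 56 | 33 | Y |
| 8 | Systolic BP (mm Hg) | Week 8 | 8 | 4.1 | 3825 | | 126 | 114 | 12 | WOCF | 17 | 56 | 39 | Y |
| 9 | Systolic BP (mm Hg) | Week 12 | 12 | 7 | 3827 | | 134 | 114 | 20 | | 83 | 84 | 1 | Y |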
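A minimal sketch of how the two derived Week 8 rows could be produced is shown below. It rests on assumptions that would have to be confirmed against the SAP and the DTYPE metadata: "worst" is taken to mean the highest systolic blood pressure, LOCF carries the record with the latest ADY, and candidate records are the observed post-baseline records from analysis visits before the missing one. The function name `impute_missing_visit` is hypothetical.

```python
# Observed post-baseline records for this subject (rows 4, 5, 6 and 9 of Table 4.2.1.3.3).
observed = [
    {"AVISIT": "Week 2", "AVISITN": 2, "VISITNUM": 4, "VSSEQ": 3824, "AVAL": 118, "BASE": 114, "CHG": 4, "ADY": 13},
    {"AVISIT": "Week 2", "AVISITN": 2, "VISITNUM": 4.1, "VSSEQ": 3825, "AVAL": 126, "BASE": 114, "CHG": 12, "ADY": 17},
    {"AVISIT": "Week 4", "AVISITN": 4, "VISITNUM": 5, "VSSEQ": 3826, "AVAL": 122, "BASE": 114, "CHG": 8, "ADY": 23},
    {"AVISIT": "Week 12", "AVISITN": 12, "VISITNUM": 7, "VSSEQ": 3827, "AVAL": 134, "BASE": 114, "CHG": 20, "ADY": 83},
]


def impute_missing_visit(rows, avisit, avisitn, awtarget):
    """Derive LOCF and WOCF rows for a missing post-baseline analysis visit."""
    # Candidates: post-baseline records from analysis visits before the missing one.
    prior = [r for r in rows if 0 < r["AVISITN"] < avisitn]
    if not prior:
        return []
    sources = {
        "LOCF": max(prior, key=lambda r: r["ADY"]),   # most recent prior value
        "WOCF": max(prior, key=lambda r: r["AVAL"]),  # assumption: highest is worst
    }
    derived = []
    for dtype, src in sources.items():
        row = dict(src)  # keeps VISITNUM, VSSEQ and ADY of the source for traceability
        row.update(
            AVISIT=avisit,
            AVISITN=avisitn,
            DTYPE=dtype,
            AWTARGET=awtarget,
            # Both days are positive here, so no day-0 adjustment is needed.
            AWTDIFF=abs(src["ADY"] - awtarget),
            ANL01FL="Y",
        )
        derived.append(row)
    return derived


week8 = impute_missing_visit(observed, "Week 8", 8, 56)
for row in week8:
    print(row["DTYPE"], row["VISITNUM"], row["AVAL"], row["AWTDIFF"])
```

Because VISITNUM and VSSEQ are copied from the source record, each derived row still points back to the SDTM VS record it was taken from.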
Table 4.2.1.3.4 contains an example of data supporting change from baseline analyses of migraine pain. In this study, missing post-baseline data are imputed by the methods of Baseline Observation Carried Forward (BLOCF) and LOCF.
表4.2.1.3.4包含一个示例数据,该数据支持偏头痛疼痛基线分析的变化。在这项研究中,缺少基线后数据是通过“基线观察结转”(BLOCF)和LOCF的方法来估算的。
When a migraine headache occurs, subjects self-administer a single dose of blinded study treatment. Subjects assess migraine pain at planned timepoints Pre-Dose, 30 Minutes Post-Dose, 1 Hour Post-Dose, and 2 Hours Post-Dose. Collected data on migraine pain are tabulated in the SDTM Findings About domain.
当发生偏头痛时,受试者会自行服用单剂量的盲法研究治疗药物。受试者在给药前,给药后30分钟,给药后1小时和给药后2小时的计划时间点评估偏头痛。有关偏头痛疼痛的收集数据列在SDTM Finding About About域中。
ATPT is the analysis timepoint description. ATPTN is the analysis timepoint number. FATPTNUM is the collected timepoint number from SDTM. AVALC contains the pain assessment, and AVAL contains the numeric coded value of the assessment. AVAL is a one-to-one map to AVALC.
ATPT是分析时间点描述。ATPTN是分析时间点编号。FATPTNUM是从SDTM收集的时间点编号。AVALC包含疼痛评估,而AVAL包含评估的数字编码值。AVAL是AVALC的一对一映射。
Subject 000276 did not continue to provide data after 1 Hour Post-Dose. For this subject, the 2 Hours Post-Dose planned observation must be imputed. Therefore, subject 000276 is excluded from an observed case analysis of Migraine Pain at 2 Hours Post-Dose. Subject 001863 had complete data, so no imputation was necessary.
给药后1小时后,受试者000276未继续提供数据。对于此主题,必须估算给药后2小时的计划观察时间。因此,受试者000276被排除在给药后2小时的偏头痛观察病例分析中。受试者001863具有完整的数据,因此无需进行估算。
The data for both subjects are included in the BLOCF and LOCF analyses of Migraine Pain at 2 Hours Post-Dose.
给药后2小时的偏头痛的BLOCF和LOCF分析中都包括了这两个受试者的数据。
Table 4.2.1.3.4 Illustration of Rule 3: Creation of New Rows to Handle Imputation of Missing Values by BLOCF and LOCF
表4.2.1.3.4规则3的说明:创建新行以处理BLOCF和LOCF的缺失值估算
| Row | USUBJID | TRTP | PARAM | ATPT | ATPTN | FATPTNUM | FASEQ | ABLFL | AVAL | AVALC | BASE | CHG | DTYPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 000276 | Placebo | Migraine Pain | Pre-Dose | 0 | 1 | 14 | Y | 3 | Severe Pain | 3 | 0 | |
| 2 | 000276 | Placebo | Migraine Pain | 30 Minutes Post-Dose | 0.5 | 2 | 22 | | 2 | Moderate Pain | 3 | -1 | |
| 3 | 000276 | Placebo | Migraine Pain | 1 Hour Post-Dose | 1 | 3 | 27 | | 1 | Mild Pain | 3 | -2 | |
| 4 | 000276 | Placebo | Migraine Pain | 2 Hours Post-Dose | 2 | 1 | 14 | | 3 | Severe Pain | 3 | 0 | BLOCF |
| 5 | 000276 | Placebo | Migraine Pain | 2 Hours Post-Dose | 2 | 3 | 27 | | 1 | Mild Pain | 3 | -2 | LOCF |
| 6 | 001863 | Soma 30 mg | Migraine Pain | Pre-Dose | 0 | 1 | 638 | Y | 3 | Severe Pain | 3 | 0 | |
| 7 | 001863 | Soma 30 mg | Migraine Pain | 30 Minutes Post-Dose | 0.5 | 2 | 639 | | 1 | Mild Pain | 3 | -2 | |
| 8 | 001863 | Soma 30 mg | Migraine Pain | 1 Hour Post-Dose | 1 | 3 | 640 | | 1 | Mild Pain | 3 | -2 | |
| 9 | 001863 | Soma 30 mg | Migraine Pain | 2 Hours Post-Dose | 2 | 4 | 641 | | 1 | Mild Pain | 3 | -2 | |
Table 4.2.1.3.5 contains an example of some of the columns in a dataset supporting analysis of a 2-period crossover study.
表4.2.1.3.5包含了支持2周期交叉研究分析的数据集中某些列的示例。
In a crossover trial design, all subjects are planned to receive all of the study treatments. The sequence of treatments is randomized. If in a study there are two treatments in a crossover design, two treatment periods are necessary.
在交叉试验设计中,所有受试者均计划接受所有研究治疗。治疗顺序是随机的。如果研究中交叉设计中有两种治疗方法,则需要两个治疗期。
In this example, the planned visits are 1 (Screening and beginning of placebo run-in period), 2 (Week -2, halfway through placebo run-in period), 3 (Week 0, end of placebo run-in and randomization), 4 (Week 4, the end of the first treatment period), and 5 (Week 8, the end of the second treatment period). Baseline is defined in the SAP as the average of the Week -2 (VISIT 2) and Week 0 (VISIT 3) measurements. This baseline is used for the analysis of both the first and the second crossover periods. USUBJID 0987_4252 has no VISIT 2 measurement, so the average is just the Week 0 (VISIT 3) measurement.
在此示例中,计划的访问次数为1(安慰剂磨合期的筛选和开始),2(每周-2,安慰剂磨合期的中途),3(每周0,安慰剂磨合期和随机分组) ,4(第4周,第一个治疗期结束)和5(第8周,第二个治疗期结束)。在SAP中,基准定义为Week -2(VISIT 2)和Week 0(VISIT 3)测量的平均值。该基线用于分析第一和第二交叉时间段。USUBJID 0987_4252没有VISIT 2测量,因此平均值仅为Week 0(VISIT 3)测量。
Within any post-baseline week window, the last observation is used to characterize that week. For example, for USUBJID 0987_3984, the VISIT 5 (row 7) value is used to characterize AVISIT=Week 8, as opposed to the earlier VISIT 4.1 value (row 6), which was also observed during the Week 8 time window. The variable ANL01FL is used in this study to identify the record selected for analysis when there are multiple records for a given AVISIT, and must be used in conjunction with other selection variables in order to identify the exact set of records used in a given analysis or summary.
在任何基线后的一周窗口中,最后的观察值用于表征该周。例如,对于USUBJID 0987_3984,使用VISIT 5(第7行)值来表征AVISIT =第8周,而之前的VISIT 4.1值(第6行)则与之相对,后者在第8周的时间窗口中也观察到。在本研究中,变量ANL01FL用于在给定AVISIT有多个记录时识别选择进行分析的记录,并且必须与其他选择变量结合使用,以识别在给定分析中使用的确切记录集或概要。
APERIODC is the crossover period character description.
APERIODC是交叉时间段字符描述。
Note that in general, APERIODC is not the same as EPOCH. For example, it is possible in some cases that boundaries of APERIODs would not align exactly with boundaries of EPOCHs. A simple example is a post-discontinuation record that is associated with the most recent treatment period for analysis.
请注意,通常,APERIODC与EPOCH不同。例如,在某些情况下,APERIOD的边界可能与EPOCH的边界不完全一致。一个简单的例子是停产后记录,该记录与最近的治疗期相关联以进行分析。
TRTSEQP, from ADSL, is the planned ordering of crossover treatments. TRTP is the treatment variable that will be used in the analysis of this dataset. The two endpoint records are derived only for the subjects who have data for both periods.
来自ADSL的TRTSEQP是计划的交叉处理顺序。TRTP是将在此数据集的分析中使用的处理变量。仅针对同时拥有两个时期数据的受试者得出两个端点记录。
The conventions used in AVISITN are producer-defined. In this example, the producer has decided that AVISITN contains the values -0.5, 9998, and 9999 for derived baseline, period 1 endpoint, and period 2 endpoint records, respectively, and week number otherwise.
AVISITN中使用的约定是生产者定义的。在此示例中,生产者已决定AVISITN分别为导出的基线,期间1终结点和期间2终结点记录包含值-0.5、9998和9999,否则包含周数。
It should be noted that in this example, the producer elected to define APERIOD only for the on-treatment visits, therefore leaving TRTP, APERIOD, and APERIODC empty on other records. This is not meant to imply a standard or best practice.
应当注意,在此示例中,生产者选择仅为进行中的访问定义APERIOD,因此在其他记录上将TRTP,APERIOD和APERIODC留空。这并不意味着暗示标准或最佳实践。
Table 4.2.1.3.5 Illustration of Rule 3: Creation of Endpoint Rows to Facilitate Analysis of a Crossover Design
表4.2.1.3.5规则3的插图:创建端点行以促进对交叉设计的分析
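| Row | USUBJID | PARAMCD | AVISIT | AVISITN | VISITNUM | DTYPE | ANL01FL | TRTP | APERIOD | APERIODC | TRTSEQP | AVAL | ABLFL | BASE | CHG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0987_3984 | ALT | Screening | -4 | 1 | | Y | | | | Drug B, Drug A | 16 | | 17 | . |
| 2 | 0987_3984 | ALT | Week -2 | -2 | 2 | | Y | | | | Drug B, Drug A | 16 | | 17 | . |
| 3 | 0987_3984 | ALT | Baseline | -0.5 | | AVERAGE | Y | | | | Drug B, Drug A | 17 | Y | 17 | 0 |
| 4 | 0987_3984 | ALT | Week 0 | 0 | 3 | | Y | | | | Drug B, Drug A | 18 | | 17 | . |
| 5 | 0987_3984 | ALT | Week 4 | 4 | 4 | | Y | Drug B | 1 | Period 1 | Drug B, Drug A | 14 | | 17 | -3 |
| 6 | 0987_3984 | ALT | Week 8 | 8 | 4.1 | | | Drug A | 2 | Period 2 | Drug B, Drug A | 10 | | 17 | -7 |
| 7 | 0987_3984 | ALT | Week 8 | 8 | 5 | | Y | Drug A | 2 | Period 2 | Drug B, Drug A | 12 | | 17 | -5 |
| 8 | 0987_3984 | ALT | Period 1 Endpoint | 9998 | 4 | ENDPOINT | Y | Drug B | 1 | Period 1 | Drug B, Drug A | 14 | | 17 | -3 |
| 9 | 0987_3984 | ALT | Period 2 Endpoint | 9999 | 5 | ENDPOINT | Y | Drug A | 2 | Period 2 | Drug B, Drug A | 12 | | 17 | -5 |
| 10 | 0987_4252 | ALT | Screening | -4 | 1 | | Y | | | | Drug A, Drug B | 12 | | 11 | . |
| 11 | 0987_4252 | ALT | Baseline | -0.5 | | AVERAGE | Y | | | | Drug A, Drug B | 11 | Y | 11 | 0 |
| 12 | 0987_4252 | ALT | Week 0 | 0 | 3 | | Y | | | | Drug A, Drug B | 11 | | 11 | . |
| 13 | 0987_4252 | ALT | Week 4 | 4 | 4 | | Y | Drug A | 1 | Period 1 | Drug A, Drug B | 14 | | 11 | 3 |
| 14 | 0987_4252 | ALT | Week 8 | 8 | 5 | | Y | Drug B | 2 | Period 2 | Drug A, Drug B | 15 | | 11 | 4 |
| 15 | 0987_4252 | ALT | Period 1 Endpoint | 9998 | 4 | ENDPOINT | Y | Drug A | 1 | Period 1 | Drug A, Drug B | 14 | | 11 | 3 |
| 16 | 0987_4252 | ALT | Period 2 Endpoint | 9999 | 5 | ENDPOINT | Y | Drug B | 2 | Period 2 | Drug A, Drug B | 15 | | 11 | 4 |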
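The derived baseline row in this crossover example (the average of the available Week -2 and Week 0 measurements) can be sketched as below. The function name `derive_baseline_row` and the dictionary layout are hypothetical; the AVISITN value of -0.5 simply mirrors the producer's convention in the table.

```python
from statistics import mean


def derive_baseline_row(rows, baseline_visits=(2, 3)):
    """Average the available VISITNUM 2 and 3 values into a derived Baseline row."""
    used = [r["AVAL"] for r in rows if r["VISITNUM"] in baseline_visits]
    if not used:
        return None  # no baseline can be derived for this subject
    base = mean(used)
    return {
        "AVISIT": "Baseline",
        "AVISITN": -0.5,    # producer-defined value used in this example
        "VISITNUM": None,   # spans more than one visit
        "DTYPE": "AVERAGE",
        "ANL01FL": "Y",
        "AVAL": base,
        "ABLFL": "Y",
        "BASE": base,
        "CHG": 0,
    }


# Pre-treatment ALT values for the two subjects in Table 4.2.1.3.5.
subject_3984 = [{"VISITNUM": 1, "AVAL": 16.0}, {"VISITNUM": 2, "AVAL": 16.0}, {"VISITNUM": 3, "AVAL": 18.0}]
subject_4252 = [{"VISITNUM": 1, "AVAL": 12.0}, {"VISITNUM": 3, "AVAL": 11.0}]  # no VISIT 2 measurement

print(derive_baseline_row(subject_3984)["BASE"])
print(derive_baseline_row(subject_4252)["BASE"])
```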
4.2.1.4 Rule 4: A function of multiple rows within a parameter should be added as a new parameter.
4.2.1.4 规则4:应将参数中具有多个行的功能添加为新参数。
Rule 4 is a special case of Rule 2. The functions covered by this rule violate the second condition of Rule 1 (they are not same-row functions of AVAL), and may also violate the first and third conditions.
规则4是规则2的特例。此规则涵盖的功能违反了规则1的第二条件(它们不是AVAL的同一行功能),并且还可能违反了第一和第三条件。
Table 4.2.1.4.1 shows an example of a clinical trial of a human immunodeficiency virus (HIV) vaccine, where blood samples are drawn at each visit, and CD4 cell count is measured. To assess efficacy, it is important to look at the cumulative effect over time on CD4 cell count during follow-up after administration.
表4.2.1.4.1显示了一种人类免疫缺陷病毒(HIV)疫苗的临床试验示例,其中在每次访问时抽取血液样本,并测量CD4细胞计数。为了评估疗效,重要的是要观察给药后随访期间随时间对CD4细胞计数的累积影响。
Let AVAL(t) equal the value of CD4 cell count at post-baseline visit t, and let VISITDY(t) be the planned study day of visit t. CD4AUC (cumulative daily CD4 count over follow-up) is calculated at any given post-baseline visit as follows:
设AVAL(t)等于基线后t期的CD4细胞计数,将VISITDY(t)作为基线后t期的计划研究日。在基线后的任何一次随访中,计算CD4AUC(累计的每日CD4细胞计数)如下:
CD4AUC at baseline visit is set to 0.
基线访问时CD4AUC设置为0。
CD4AUC(t) = CD4AUC(t-1) + [ 0.5 * AVAL(t-1) + 0.5 * AVAL(t) ] * [ VISITDY(t) - VISITDY(t-1) ].
CD4AUC is not a simple same-row function of BASE and AVAL. It is calculated based on data from multiple observations (rows) of CD4 data, so it should be added as a new parameter rather than as a new column. CD4AUC is not defined pre-baseline, which is why there is no Week -1 for this parameter.
CD4AUC不是BASE和AVAL的简单同行功能。它是根据CD4数据的多个观测值(行)中的数据计算得出的,因此应将其添加为新参数而不是新列。未在基线前定义CD4AUC,这就是为什么此参数没有Week -1的原因。
CD4AUCMB (cumulative average change from baseline in daily CD4 count over follow-up) is calculated as
CD4AUCMB(每日平均CD4计数与随访值相对于基线的累计平均变化)计算为
CD4AUCMB(t) = CD4AUC(t) / [ VISITDY(t) - 1 ] - baseline value of CD4 cell count.
CD4AUCMB is a function of both CD4AUC and the baseline value of CD4, so it also must be its own parameter (see Rule 5 below). CD4AUCMB is not defined for pre-baseline and baseline records and therefore these records are not represented within this value of PARAM.
CD4AUCMB是CD4AUC和CD4基线值的函数,因此它也必须是其自己的参数(请参见下面的规则5)。CD4AUCMB未为基线前和基线记录定义,因此这些记录未在此PARAM值内表示。
Table 4.2.1.4.1 Illustration of Rule 4: Creation of a New Parameter to Handle a Function of More Than One Row of a Parameter
表4.2.1.4.1规则4的图示:创建新参数以处理超过一行参数的函数
| Row | PARAM | PARAMCD | AVISIT | VISITDY | ABLFL | AVAL | BASE |
|---|---|---|---|---|---|---|---|
| 1 | CD4 (cells/mm3) | CD4 | Week -1 | -7 | | 75 | 76 |
| 2 | CD4 (cells/mm3) | CD4 | Week 0 | 1 | Y | 76 | 76 |
| 3 | CD4 (cells/mm3) | CD4 | Week 2 | 15 | | 128 | 76 |
| 4 | CD4 (cells/mm3) | CD4 | Week 4 | 29 | | 125 | 76 |
| 5 | CD4 (cells/mm3) | CD4 | Week 8 | 57 | | 191 | 76 |
| 6 | CD4 (cells/mm3) | CD4 | Week 12 | 85 | | 167 | 76 |
| 7 | CD4 (cells/mm3) | CD4 | Week 16 | 113 | | 136 | 76 |
| 8 | CD4 Cumulative AUC | CD4AUC | Week 0 | 1 | Y | 0 | 0 |
| 9 | CD4 Cumulative AUC | CD4AUC | Week 2 | 15 | | 1428 | 0 |
| 10 | CD4 Cumulative AUC | CD4AUC | Week 4 | 29 | | 3199 | 0 |
| 11 | CD4 Cumulative AUC | CD4AUC | Week 8 | 57 | | 7623 | 0 |
| 12 | CD4 Cumulative AUC | CD4AUC | Week 12 | 85 | | 12635 | 0 |
| 13 | CD4 Cumulative AUC | CD4AUC | Week 16 | 113 | | 16877 | 0 |
| 14 | CD4 Cumulative AUCMB | CD4AUCMB | Week 2 | 15 | | 26 | |
| 15 | CD4 Cumulative AUCMB | CD4AUCMB | Week 4 | 29 | | 38.25 | |
| 16 | CD4 Cumulative AUCMB | CD4AUCMB | Week 8 | 57 | | 60.125 | |
| 17 | CD4 Cumulative AUCMB | CD4AUCMB | Week 12 | 85 | | 74.4167 | |
| 18 | CD4 Cumulative AUCMB | CD4AUCMB | Week 16 | 113 | | 74.6875 | |
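The two derived parameters can be reproduced from the CD4 rows with a short Python sketch of the trapezoidal recursion for CD4AUC and the formula for CD4AUCMB. The function names and the tuple layout are hypothetical; the sketch assumes the first visit passed in is the baseline visit.

```python
# (AVISIT, VISITDY, AVAL) for PARAMCD=CD4 from baseline onward (rows 2-7 of Table 4.2.1.4.1).
cd4_visits = [
    ("Week 0", 1, 76.0),
    ("Week 2", 15, 128.0),
    ("Week 4", 29, 125.0),
    ("Week 8", 57, 191.0),
    ("Week 12", 85, 167.0),
    ("Week 16", 113, 136.0),
]


def derive_cd4auc(visits):
    """Cumulative AUC by the trapezoidal rule; the baseline visit is set to 0."""
    rows = []
    auc = 0.0
    for i, (avisit, day, aval) in enumerate(visits):
        if i > 0:
            _, prev_day, prev_aval = visits[i - 1]
            auc += (0.5 * prev_aval + 0.5 * aval) * (day - prev_day)
        rows.append({"PARAMCD": "CD4AUC", "AVISIT": avisit, "VISITDY": day, "AVAL": auc})
    return rows


def derive_cd4aucmb(auc_rows, baseline_cd4):
    """Average daily CD4 over follow-up minus baseline; undefined at baseline."""
    return [
        {"PARAMCD": "CD4AUCMB", "AVISIT": r["AVISIT"], "VISITDY": r["VISITDY"],
         "AVAL": r["AVAL"] / (r["VISITDY"] - 1) - baseline_cd4}
        for r in auc_rows[1:]  # skip the baseline row, where the divisor would be 0
    ]


auc_rows = derive_cd4auc(cd4_visits)
aucmb_rows = derive_cd4aucmb(auc_rows, baseline_cd4=cd4_visits[0][2])
for row in auc_rows + aucmb_rows:
    print(row["PARAMCD"], row["AVISIT"], round(row["AVAL"], 4))
```

Each derived value depends on several CD4 rows, which is why the results are stored as rows of new parameters and not as extra columns on the CD4 rows.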
4.2.1.5 Rule 5: A function of more than one parameter should be added as a new parameter.
4.2.1.5 规则5:应将一个以上参数的功能添加为新参数。
There is often a need to derive for analysis a parameter that was not collected. Such parameters may be quite complex functions of data from multiple SDTM domains and domain classes. Rule 5 addresses the case where a parameter is derived from other parameters already present in the dataset.
通常需要导出未收集的参数进行分析。这些参数可能是来自多个SDTM域和域类的数据的非常复杂的功能。规则5解决了从数据集中已经存在的其他参数派生参数的情况。
For example, a questionnaire total domain score is calculated as a function of more than one observed question. The total domain score should be added as a new parameter, with its corresponding set of derived rows. For this derived parameter, the value of PARAM could be "Total Domain Score", and the value of the total domain score would be stored in the standard AVAL column, the baseline value would be stored in the standard BASE column, change from baseline would be stored in CHG, as usual.
例如,根据一个以上观察到的问题来计算问卷的总域得分。应将域总得分添加为新参数,并带有其对应的一组导出行。对于此派生的参数,PARAM的值可以是“总域得分”,总域得分的值将存储在标准AVAL列中,基线值将存储在标准BASE列中,与基线的变化将照常存储在CHG中。
In the example in Table 4.2.1.5.1, blood samples are drawn at every visit, and laboratory test measurements of total cholesterol and high-density lipoprotein cholesterol are found in the SDTM LB dataset. The protocol calls for analysis of each individual lab analyte, and also for an analysis of the ratio of total cholesterol (C) to high-density lipoprotein (HDL) cholesterol. The ADaM dataset contains parameters for each of the two measured lab tests, as well as a new set of derived rows where the description in PARAM is "Total Cholesterol:HDL-C ratio", and AVAL contains the calculated ratio at each timepoint.
在表4.2.1.5.1的示例中,每次访问都抽取血液样本,并且在SDTM LB数据集中可以找到总胆固醇和高密度脂蛋白胆固醇的实验室测试测量值。该协议要求分析每种实验室分析物,还需要分析总胆固醇(C)与高密度脂蛋白(HDL)胆固醇的比率。ADaM数据集包含两个测量的实验室测试中的每一个的参数,以及一组新的派生行,其中PARAM中的描述为“总胆固醇:HDL-C比率”,而AVAL包含每个时间点的计算比率。
The analysis of percent change from baseline (PCHG) is of interest for all three parameters and is therefore populated on all records. In general, however, if percent change is not analyzed for a particular value of PARAM, then it is not necessary to populate PCHG for those rows.
所有三个参数都需要分析相对于基线的变化百分比(PCHG),因此将其填充在所有记录中。但是,通常,如果未针对特定的PARAM值分析百分比变化,则无需为这些行填充PCHG。
Table 4.2.1.5.1 Illustration of Rule 5: Creation of New Parameter to Handle a Function of More Than One Parameter
表4.2.1.5.1规则5的图示:创建新参数以处理一个以上参数的功能
| Row | PARAM | PARAMCD | AVISIT | AVISITN | VISITNUM | LBSEQ | ABLFL | AVAL | BASE | CHG | PCHG |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Total Cholesterol (mg/dL) | CHOL | Screening | -2 | 1 | 39394 | | 265 | 266 | . | . |
| 2 | Total Cholesterol (mg/dL) | CHOL | Run-In | -1 | 2 | 25593 | | 278 | 266 | . | . |
| 3 | Total Cholesterol (mg/dL) | CHOL | Week 0 | 0 | 3 | 23213 | Y | 266 | 266 | 0 | 0.000 |
| 4 | Total Cholesterol (mg/dL) | CHOL | Week 2 | 2 | 4 | 32952 | | 259 | 266 | -7 | -2.632 |
| 5 | Total Cholesterol (mg/dL) | CHOL | Week 4 | 4 | 5 | 12768 | | 235 | 266 | -31 | -11.654 |
| 6 | Total Cholesterol (mg/dL) | CHOL | Week 8 | 8 | 6 | 18773 | | 242 | 266 | -24 | -9.023 |
| 7 | Total Cholesterol (mg/dL) | CHOL | Week 12 | 12 | 7 | 28829 | | 217 | 266 | -49 | -18.421 |
| 8 | High-Density Lipoprotein Chol (mg/dL) | HDL | Screening | -2 | 1 | 32437 | | 44 | 42 | . | . |
| 9 | High-Density Lipoprotein Chol (mg/dL) | HDL | Run-In | -1 | 2 | 26884 | | 40 | 42 | . | . |
| 10 | High-Density Lipoprotein Chol (mg/dL) | HDL | Week 0 | 0 | 3 | 52657 | Y | 42 | 42 | 0 | 0.000 |
| 11 | High-Density Lipoprotein Chol (mg/dL) | HDL | Week 2 | 2 | 4 | 38469 | | 43 | 42 | 1 | 2.381 |
| 12 | High-Density Lipoprotein Chol (mg/dL) | HDL | Week 4 | 4 | 5 | 12650 | | 47 | 42 | 5 | 11.905 |
| 13 | High-Density Lipoprotein Chol (mg/dL) | HDL | Week 8 | 8 | 6 | 24345 | | 46 | 42 | 4 | 9.524 |
| 14 | High-Density Lipoprotein Chol (mg/dL) | HDL | Week 12 | 12 | 7 | 23484 | | 47 | 42 | 5 | 11.905 |
| 15 | Total Cholesterol:HDL-C ratio | CHOLH | Screening | -2 | 1 | | | 6.023 | 6.333 | . | . |
| 16 | Total Cholesterol:HDL-C ratio | CHOLH | Run-In | -1 | 2 | | | 6.950 | 6.333 | . | . |
| 17 | Total Cholesterol:HDL-C ratio | CHOLH | Week 0 | 0 | 3 | | Y | 6.333 | 6.333 | 0.000 | 0.000 |
| 18 | Total Cholesterol:HDL-C ratio | CHOLH | Week 2 | 2 | 4 | | | 6.023 | 6.333 | -0.310 | -4.896 |
| 19 | Total Cholesterol:HDL-C ratio | CHOLH | Week 4 | 4 | 5 | | | 5.000 | 6.333 | -1.333 | -21.053 |
| 20 | Total Cholesterol:HDL-C ratio | CHOLH | Week 8 | 8 | 6 | | | 5.261 | 6.333 | -1.072 | -16.934 |
| 21 | Total Cholesterol:HDL-C ratio | CHOLH | Week 12 | 12 | 7 | | | 4.617 | 6.333 | -1.716 | -27.100 |
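The ratio parameter in this table can be derived from the two measured parameters as sketched below. The function name `derive_ratio_rows` and the dictionary layout are hypothetical, and the sketch assumes that Week 0 is the baseline visit and that CHG and PCHG are left missing before baseline, as the table shows. LBSEQ is not carried over because each derived value depends on two LB records.

```python
# AVAL by analysis visit for the two measured parameters (Table 4.2.1.5.1).
AVISITN = {"Screening": -2, "Run-In": -1, "Week 0": 0, "Week 2": 2, "Week 4": 4, "Week 8": 8, "Week 12": 12}
chol = {"Screening": 265, "Run-In": 278, "Week 0": 266, "Week 2": 259, "Week 4": 235, "Week 8": 242, "Week 12": 217}
hdl = {"Screening": 44, "Run-In": 40, "Week 0": 42, "Week 2": 43, "Week 4": 47, "Week 8": 46, "Week 12": 47}


def derive_ratio_rows(numerator, denominator, baseline_visit="Week 0"):
    """Create rows for the derived parameter CHOLH = CHOL / HDL at each visit."""
    base = numerator[baseline_visit] / denominator[baseline_visit]
    rows = []
    for visit, n in AVISITN.items():
        if visit not in numerator or visit not in denominator:
            continue  # the ratio needs both inputs at the same visit
        aval = numerator[visit] / denominator[visit]
        on_or_after_baseline = n >= AVISITN[baseline_visit]
        rows.append({
            "PARAM": "Total Cholesterol:HDL-C ratio",
            "PARAMCD": "CHOLH",
            "AVISIT": visit,
            "AVISITN": n,
            "ABLFL": "Y" if visit == baseline_visit else "",
            "AVAL": aval,
            "BASE": base,
            # Change variables are left missing on pre-baseline rows.
            "CHG": aval - base if on_or_after_baseline else None,
            "PCHG": 100 * (aval - base) / base if on_or_after_baseline else None,
        })
    return rows


ratio_rows = derive_ratio_rows(chol, hdl)
for row in ratio_rows:
    pchg = None if row["PCHG"] is None else round(row["PCHG"], 3)
    print(row["AVISIT"], round(row["AVAL"], 3), pchg)
```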
4.2.1.6 Rule 6: When there is more than one definition of baseline, each additional definition of baseline requires the creation of its own set of rows.
4.2.1.6 规则6:当存在多个基准定义时,每个附加的基准定义都需要创建自己的一组行。
In case there is more than one definition of baseline in an ADaM dataset, new rows must be created for each additional alternative definition of baseline. There will therefore be multiple sets of rows, where each set of rows corresponds to a particular definition of baseline. Whenever there is more than one definition of baseline, the BASETYPE column is required. BASETYPE identifies the definition of baseline that corresponds to the value of BASE in each row. There is only one BASE column, and only one column for each qualifying function of AVAL and BASE.
如果ADaM数据集中有多个基准定义,则必须为基准的每个其他替代定义创建新行。因此,将有多组行,其中每组行对应于基线的特定定义。只要有一个以上的基准定义,就需要BASETYPE列。BASETYPE标识基线的定义,该定义与每一行中BASE的值相对应。对于AVAL和BASE的每个限定功能,只有BASE列,只有一列。
Table 4.2.1.6.1 presents a dataset supporting shift analysis from three different baselines. Accordingly, it makes use of the BASETYPE variable described above. The ANRIND, BNRIND, and SHIFTy variables are also illustrated. In this example, the three baselines of interest characterize different portions of the study: run-in, double-blind, and open-label. For any datapoint, it is desired to have the ability to analyze shift from the most recent baseline or any prior baseline. Rows 1-12 are the initial set of rows representing all of the collected data. They permit analysis of the shift in normal range indicator from the run-in baseline to any value in the run-in, double-blind, or open-label portions of the study. Additional sets of rows are added to support analysis of shift from the double-blind and open-label baselines: rows 13-19 permit analysis of the shift from the double-blind baseline normal range indicator for data in either the double-blind or open-label portions of the study; and rows 20-22 support analysis of shift from open-label baseline for data in the open-label portion of the study.
表4.2.1.6.1给出了支持从三个不同基准进行班次分析的数据集。因此,它利用了上述的BASETYPE变量。还说明了ANRIND,BNRIND和SHIFTy变量。在此示例中,三个感兴趣的基线表征了研究的不同部分:磨合,双盲和开放标签。对于任何数据点,都希望能够分析从最新基线或任何先前基线的偏移。第1-12行是代表所有收集数据的初始行集。他们可以分析正常范围指标从磨合基线到研究的磨合,双盲或开放标签部分中任何值的变化。添加了其他行集以支持从双盲和开放标签基线的偏移分析:第13至19行允许分析双盲基线正常范围指标的偏移,以获取研究的双盲或开放标签部分中的数据;第20-22行支持对本研究的开放标签部分中的数据从开放标签基线的转变进行分析。
Note that only the rows needed for the analysis are included in the additional sets. For example, the set of rows for the shift from the double-blind baseline does not include the rows for EPOCH="RUN-IN" and EPOCH="STABILIZATION" as they are not analyzed using the double-blind baseline.
请注意,仅分析所需的行才包含在其他集中。例如,从双盲基线转移的行集不包括EPOCH =“ RUN-IN”和EPOCH =“ STABILIZATION”的行,因为它们没有使用双盲基线进行分析。
For space reasons, the ANLzzFL variable is not shown, although it would be needed to identify which record is selected in cases of multiple observed records within an analysis timepoint, as is the case for AVISIT=WEEK 12 (DB) for this subject and parameter.
出于空间原因,未显示ANLzzFL变量,尽管在分析时间点内有多个观察到的记录的情况下,将需要识别选择哪个记录,就像该主题和参数的AVISIT = WEEK 12(DB)一样。
Table 4.2.1.6.1 Illustration of Rule 6: Creation of New Rows to Handle Multiple Baseline Definitions - Supporting Comparisons to Any Prior Baseline
表4.2.1.6.1规则6的插图:创建新行以处理多个基准定义-支持与任何先前基准的比较
| Row | BASETYPE | EPOCH | AVISIT | LBSEQ | AVAL | ANRLO | ANRHI | ANRIND | ABLFL | BASE | BNRIND | SHIFT1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | RUN-IN | RUN-IN | BSLN (RUN-IN) | 111 | 34.5 | 15.4 | 48.5 | NORMAL | Y | 34.5 | NORMAL | |
| 2 | RUN-IN | RUN-IN | WK 8 (RUN-IN) | 168 | 11.6 | 15.4 | 48.5 | LOW | | 34.5 | NORMAL | NORMAL to LOW |
| 3 | RUN-IN | RUN-IN | END POINT (RUN-IN) | 168 | 11.6 | 15.4 | 48.5 | LOW | | 34.5 | NORMAL | NORMAL to LOW |
| 4 | RUN-IN | STABILIZATION | WK 14 (STAB.) | 200 | 13.1 | 15.4 | 48.5 | LOW | | 34.5 | NORMAL | NORMAL to LOW |
| 5 | RUN-IN | STABILIZATION | END POINT (STAB.) | 200 | 13.1 | 15.4 | 48.5 | LOW | | 34.5 | NORMAL | NORMAL to LOW |
| 6 | RUN-IN | DOUBLE-BLIND | BSLN (DB) | 200 | 13.1 | 15.4 | 48.5 | LOW | | 34.5 | NORMAL | NORMAL to LOW |
| 7 | RUN-IN | DOUBLE-BLIND | WK 12 (DB) | 295 | 13.7 | 15.4 | 48.5 | LOW | | 34.5 | NORMAL | NORMAL to LOW |
| 8 | RUN-IN | DOUBLE-BLIND | WK 12 (DB) | 300 | 19.7 | 15.4 | 48.5 | NORMAL | | 34.5 | NORMAL | NORMAL to NORMAL |
| 9 | RUN-IN | DOUBLE-BLIND | END POINT (DB) | 300 | 19.7 | 15.4 | 48.5 | NORMAL | | 34.5 | NORMAL | NORMAL to NORMAL |
| 10 | RUN-IN | OPEN-LABEL | BSLN (OPEN) | 300 | 19.7 | 15.4 | 48.5 | NORMAL | | 34.5 | NORMAL | NORMAL to NORMAL |
| 11 | RUN-IN | OPEN-LABEL | WK 24 (OPEN) | 350 | 28.1 | 15.4 | 48.5 | NORMAL | | 34.5 | NORMAL | NORMAL to NORMAL |
| 12 | RUN-IN | OPEN-LABEL | END POINT (OPEN) | 350 | 28.1 | 15.4 | 48.5 | NORMAL | | 34.5 | NORMAL | NORMAL to NORMAL |
| 13 | DBL-BLIND | DOUBLE-BLIND | BSLN (DB) | 200 | 13.1 | 15.4 | 48.5 | LOW | Y | 13.1 | LOW | |
| 14 | DBL-BLIND | DOUBLE-BLIND | WK 12 (DB) | 295 | 13.7 | 15.4 | 48.5 | LOW | | 13.1 | LOW | LOW to LOW |
| 15 | DBL-BLIND | DOUBLE-BLIND | WK 12 (DB) | 300 | 19.7 | 15.4 | 48.5 | NORMAL | | 13.1 | LOW | LOW to NORMAL |
| 16 | DBL-BLIND | DOUBLE-BLIND | END POINT (DB) | 300 | 19.7 | 15.4 | 48.5 | NORMAL | | 13.1 | LOW | LOW to NORMAL |
| 17 | DBL-BLIND | OPEN-LABEL | BSLN (OPEN) | 300 | 19.7 | 15.4 | 48.5 | NORMAL | | 13.1 | LOW | LOW to NORMAL |
| 18 | DBL-BLIND | OPEN-LABEL | WK 24 (OPEN) | 350 | 28.1 | 15.4 | 48.5 | NORMAL | | 13.1 | LOW | LOW to NORMAL |
| 19 | DBL-BLIND | OPEN-LABEL | END POINT (OPEN) | 350 | 28.1 | 15.4 | 48.5 | NORMAL | | 13.1 | LOW | LOW to NORMAL |
| 20 | OPEN-LABEL | OPEN-LABEL | BSLN (OPEN) | 300 | 19.7 | 15.4 | 48.5 | NORMAL | Y | 19.7 | NORMAL | |
| 21 | OPEN-LABEL | OPEN-LABEL | WK 24 (OPEN) | 350 | 28.1 | 15.4 | 48.5 | NORMAL | | 19.7 | NORMAL | NORMAL to NORMAL |
| 22 | OPEN-LABEL | OPEN-LABEL | END POINT (OPEN) | 350 | 28.1 | 15.4 | 48.5 | NORMAL | | 19.7 | NORMAL | NORMAL to NORMAL |
The example in Table 4.2.1.6.1 supports the ability to analyze shift from the most recent baseline or any prior baseline. In contrast, if it is needed only to have the ability to analyze shift from the most recent baseline, then the dataset does not need as many rows.
表4.2.1.6.1中的示例支持分析从最新基准或任何先前基准开始的偏移的功能。相反,如果仅需要具有分析来自最新基准线的偏移的能力,则数据集不需要那么多的行。
Table 4.2.1.6.2 illustrates an arrangement supporting analysis from the most recent baseline only. Because there is more than one definition of baseline, the BASETYPE variable is still needed.
表4.2.1.6.2说明了仅支持从最新基准进行分析的安排。因为基线定义不止一个,所以仍需要BASETYPE变量。
Table 4.2.1.6.2 Illustration of Rule 6: Creation of New Rows to Handle Multiple Baseline Definitions - Supporting Comparison to Most Recent Baseline
表4.2.1.6.2规则6的插图:创建用于处理多个基准定义的新行-支持与最新基准的比较
| Row | BASETYPE | EPOCH | AVISIT | LBSEQ | AVAL | ANRLO | ANRHI | ANRIND | ABLFL | BASE | BNRIND | SHIFT1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | RUN-IN | RUN-IN | BSLN (RUN-IN) | 111 | 34.5 | 15.4 | 48.5 | NORMAL | Y | 34.5 | NORMAL | |
| 2 | RUN-IN | RUN-IN | WK 8 (RUN-IN) | 168 | 11.6 | 15.4 | 48.5 | LOW | | 34.5 | NORMAL | NORMAL to LOW |
| 3 | RUN-IN | RUN-IN | END POINT (RUN-IN) | 168 | 11.6 | 15.4 | 48.5 | LOW | | 34.5 | NORMAL | NORMAL to LOW |
| 4 | RUN-IN | STABILIZATION | WK 14 (STAB.) | 200 | 13.1 | 15.4 | 48.5 | LOW | | 34.5 | NORMAL | NORMAL to LOW |
| 5 | RUN-IN | STABILIZATION | END POINT (STAB.) | 200 | 13.1 | 15.4 | 48.5 | LOW | | 34.5 | NORMAL | NORMAL to LOW |
| 6 | DBL-BLIND | DOUBLE-BLIND | BSLN (DB) | 200 | 13.1 | 15.4 | 48.5 | LOW | Y | 13.1 | LOW | |
| 7 | DBL-BLIND | DOUBLE-BLIND | WK 12 (DB) | 295 | 13.7 | 15.4 | 48.5 | LOW | | 13.1 | LOW | LOW to LOW |
| 8 | DBL-BLIND | DOUBLE-BLIND | WK 12 (DB) | 300 | 19.7 | 15.4 | 48.5 | NORMAL | | 13.1 | LOW | LOW to NORMAL |
| 9 | DBL-BLIND | DOUBLE-BLIND | END POINT (DB) | 300 | 19.7 | 15.4 | 48.5 | NORMAL | | 13.1 | LOW | LOW to NORMAL |
| 10 | OPEN-LABEL | OPEN-LABEL | BSLN (OPEN) | 300 | 19.7 | 15.4 | 48.5 | NORMAL | Y | 19.7 | NORMAL | |
| 11 | OPEN-LABEL | OPEN-LABEL | WK 24 (OPEN) | 350 | 28.1 | 15.4 | 48.5 | NORMAL | | 19.7 | NORMAL | NORMAL to NORMAL |
| 12 | OPEN-LABEL | OPEN-LABEL | END POINT (OPEN) | 350 | 28.1 | 15.4 | 48.5 | NORMAL | | 19.7 | NORMAL | NORMAL to NORMAL |
Table 4.2.1.6.1 and Table 4.2.1.6.2 illustrate example solutions in the case where different baselines are needed to characterize different portions of a study. In general, however, there might be other reasons that more than one definition of baseline might be needed. It could also be that there are multiple ways to construct a particular baseline value (e.g., last value prior to treatment, average value at the baseline visit, minimum value prior to treatment, etc.).
表4.2.1.6.1和表4.2.1.6.2说明了在需要不同基准来表征研究的不同部分时的示例解决方案。但是,总的来说,可能还有其他原因可能需要一个以上的基准定义。也可能有多种方法来构建特定的基线值(例如,治疗前的最后一个值,基线就诊时的平均值,治疗前的最小值等)。
For a given parameter, whenever there is more than one definition of baseline, BASETYPE is required and must be populated. For any given parameter and subject, whenever there is more than one definition of baseline, the number of records flagged with ABLFL=Y is equal to the number of values of BASETYPE.
对于给定的参数,只要有一个以上的基线定义,就需要BASETYPE,并且必须填充BASETYPE。对于任何给定的参数和主题,只要有一个以上的基准定义,用ABLFL = Y标记的记录数就等于BASETYPE的值数。
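The rule above lends itself to an automated check. The following is a minimal sketch, not a definitive validator: it counts ABLFL="Y" records and distinct BASETYPE values per subject and parameter and reports any mismatch. The function name `check_baseline_flags`, the subject ID "S001", and the parameter code "LBTEST" are hypothetical; only the BASETYPE and ABLFL values are taken from Table 4.2.1.6.2.

```python
from collections import defaultdict

# (BASETYPE, ABLFL) for the 12 rows of Table 4.2.1.6.2.
# USUBJID "S001" and PARAMCD "LBTEST" are hypothetical placeholders,
# because the table does not show those columns.
TABLE_ROWS = [
    ("RUN-IN", "Y"), ("RUN-IN", ""), ("RUN-IN", ""), ("RUN-IN", ""), ("RUN-IN", ""),
    ("DBL-BLIND", "Y"), ("DBL-BLIND", ""), ("DBL-BLIND", ""), ("DBL-BLIND", ""),
    ("OPEN-LABEL", "Y"), ("OPEN-LABEL", ""), ("OPEN-LABEL", ""),
]
records = [
    {"USUBJID": "S001", "PARAMCD": "LBTEST", "BASETYPE": basetype, "ABLFL": ablfl}
    for basetype, ablfl in TABLE_ROWS
]


def check_baseline_flags(rows):
    """Return {(USUBJID, PARAMCD): (n_flagged, n_basetypes)} for groups that break the rule."""
    basetypes = defaultdict(set)
    flagged = defaultdict(int)
    for row in rows:
        key = (row["USUBJID"], row["PARAMCD"])
        basetypes[key].add(row["BASETYPE"])
        if row["ABLFL"] == "Y":
            flagged[key] += 1
    return {
        key: (flagged[key], len(types))
        for key, types in basetypes.items()
        if flagged[key] != len(types)
    }


violations = check_baseline_flags(records)
```

For the rows of Table 4.2.1.6.2 the three baseline flags match the three BASETYPE values, so `violations` is empty.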
4.3 Inclusion of All Observed and Derived Records for a Parameter Versus the Subset of Records Used for Analysis
4.3 包括参数的所有观测记录和派生记录与用于分析的记录子集
This section discusses whether the ADaM dataset should include all rows of an analysis parameter, or only the subset of rows that are used for analysis. A value of AVAL or AVALC for an analysis parameter at a specific timepoint may be observed (i.e., collected on the case report form or in an electronic diary at that timepoint), it may be imputed because it was missing, or it may be derived from a combination of other values.
本节讨论ADaM数据集是应包含分析参数的所有行,还是仅包含用于分析的行的子集。可以观察到特定时间点的分析参数的AVAL或AVALC值(即,在该时间点收集在病例报告表或电子日记中),可能由于缺少而被推论,或者可以推导得出来自其他值的组合。
To illustrate the issue being presented, assume that the total scores for Questionnaire A (administered at Visits 1, 2, and 3) are in the SDTM QS dataset as illustrated below. Any missing total scores are imputed by carrying the last post-baseline (post-Visit 1) total score forward. The total score for Visit 3 will be analyzed.
为了说明所提出的问题,假设问卷A的总分(在访问1、2和3中进行管理)在SDTM QS数据集中,如下所示。通过将最后一个基准后(访问后1)总分向前推进,可以估算出所有缺失的总分。将分析访问3的总得分。
In the SDTM QS dataset data shown below, Subject 0001 has data for Visits 1, 2, and 3; Subject 0002 will not be included in the analysis, as there are no post-baseline data for the subject; Subject 0003 has data for Visits 1 and 2, but is missing data for Visit 3.
在下面显示的SDTM QS数据集数据中,主题0001具有访问1、2和3的数据;主题0002将不包含在分析中,因为没有该主题的基准后数据;主题0003具有访问1和2的数据,但缺少访问3的数据。
Table 4.3.1 Illustration of Issue, Data as Found in SDTM QS Dataset
表4.3.1问题说明,在SDTM QS数据集中找到的数据
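qs.xpt
| Row | DOMAIN | USUBJID | QSSEQ | QSTESTCD | QSCAT | QSSTRESN | VISITNUM |
|---|---|---|---|---|---|---|---|
| 1 | QS | 0001 | 101 | TOTSCORE | QUES-A | 7 | 1 |
| 2 | QS | 0001 | 201 | TOTSCORE | QUES-A | 12 | 2 |
| 3 | QS | 0001 | 555 | TOTSCORE | QUES-A | 14 | 3 |
| 4 | QS | 0002 | 91 | TOTSCORE | QUES-A | 4 | 1 |
| 5 | QS | 0003 | 156 | TOTSCORE | QUES-A | 2 | 1 |
| 6 | QS | 0003 | 300 | TOTSCORE | QUES-A | 6 | 2 |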
The questions that arise are whether or not the ADaM dataset should contain data for Subject 0002 even though the subject is not included in the analysis and if the ADaM dataset should contain totals for Visits 1 and 2 even though the data being analyzed are from Visit 3.
出现的问题是,即使分析中未包含主题,ADaM数据集是否也应包含主题0002的数据;即使所分析的数据来自访问3,ADaM数据集是否也应包含访问1和访问2的总数。
ADaM Methodology
ADaM方法论
The ADaM methodology is to include all observed and derived rows for a given analysis parameter. The inclusion of all the rows in the ADaM dataset, including those not used in the analysis, requires a way to identify the rows used in the specified analysis. The advantage of this approach is that the inclusion of all rows makes it easier to verify that the selection and derived timepoint processing was done correctly, thus providing useful traceability. In addition, the data are also then available to enable other analyses, including sensitivity analyses. However, this approach increases the size of the dataset; it also introduces a risk that the appropriate selection criteria will not be used, producing incorrect analysis results.
ADaM方法将包括给定分析参数的所有观察到的行和派生的行。要在ADaM数据集中包含所有行,包括分析中未使用的行,就需要一种方法来标识指定分析中使用的行。这种方法的优点是,所有行的包含使您更容易验证选择和派生的时间点处理是否正确完成,从而提供有用的可跟踪性。此外,这些数据还可用于启用其他分析,包括敏感性分析。但是,这种方法增加了数据集的大小。这还会带来不使用适当选择标准的风险,从而产生错误的分析结果。
Regulatory reviewers prefer that the path followed in creating and/or selecting analysis rows be clearly delineated and traceable all the way back to the originating rows in the SDTM dataset, if possible and within reason. Simply including the algorithm in the metadata is often not sufficient, as any complicated data manipulations may not be clearly identified (e.g., how missing pieces of the input data were handled). Retaining in one dataset all of the observed and derived rows for the analysis parameter provides the clearest traceability in the most flexible manner within the standard BDS. The resulting dataset also provides the most flexibility for testing the robustness of an analysis (e.g., using a different imputation method).
监管审核者更喜欢在可能的情况下,在创建和/或选择分析行时遵循的路径应清晰描绘,并一直追溯到SDTM数据集中的原始行。仅将算法包括在元数据中通常是不够的,因为可能无法清楚地识别出任何复杂的数据操作(例如,如何处理丢失的输入数据)。在标准BDS中,以最灵活的方式将所有观察到的和派生的分析参数行保留在一个数据集中,可以提供最清晰的可追溯性。所得数据集还为测试分析的鲁棒性提供了最大的灵活性(例如,使用不同的插补方法)。
Example 1
例子1
In the example discussed above (Table 4.3.1), the ADaM dataset would contain the following rows (Table 4.3.2) for the total score parameter:
在上面讨论的示例(表4.3.1)中,ADaM数据集将包含以下总得分参数行(表4.3.2):
Table 4.3.2 Example 1: ADaM Dataset
表4.3.2示例1:ADaM数据集
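| Row | PARAMCD | USUBJID | VISITNUM | AVISITN | AVISIT | AVAL | DTYPE | QSSEQ |
|---|---|---|---|---|---|---|---|---|
| 1 | TOTSCORE | 0001 | 1 | 1 | Visit 1 | 7 | | 101 |
| 2 | TOTSCORE | 0001 | 2 | 2 | Visit 2 | 12 | | 201 |
| 3 | TOTSCORE | 0001 | 3 | 3 | Visit 3 | 14 | | 555 |
| 4 | TOTSCORE | 0002 | 1 | 1 | Visit 1 | 4 | | 91 |
| 5 | TOTSCORE | 0003 | 1 | 1 | Visit 1 | 2 | | 156 |
| 6 | TOTSCORE | 0003 | 2 | 2 | Visit 2 | 6 | | 300 |
| 7 | TOTSCORE | 0003 | 2 | 3 | Visit 3 | 6 | LOCF | 300 |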
For the analysis discussed above, the data to be analyzed are selected by specifying that AVISITN=3 (or AVISIT=Visit 3).
对于上面讨论的分析,通过指定AVISITN = 3(或AVISIT = Visit 3)来选择要分析的数据。
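As an illustration only, the sketch below shows one way the rows of Table 4.3.2 could be built from the QS data in Table 4.3.1: observed rows are copied, a missing scheduled visit is filled by carrying the last post-baseline score forward (DTYPE="LOCF"), and the analysis rows are then selected with AVISITN=3. The function name `build_adam_rows` and the constants are hypothetical, and the code assumes at most one QS record per subject and visit.

```python
SCHEDULED_VISITS = [1, 2, 3]  # Questionnaire A is administered at Visits 1, 2, and 3
BASELINE_VISIT = 1            # only post-baseline scores are carried forward

# Subset of the columns of Table 4.3.1.
qs = [
    {"USUBJID": "0001", "QSSEQ": 101, "QSSTRESN": 7, "VISITNUM": 1},
    {"USUBJID": "0001", "QSSEQ": 201, "QSSTRESN": 12, "VISITNUM": 2},
    {"USUBJID": "0001", "QSSEQ": 555, "QSSTRESN": 14, "VISITNUM": 3},
    {"USUBJID": "0002", "QSSEQ": 91, "QSSTRESN": 4, "VISITNUM": 1},
    {"USUBJID": "0003", "QSSEQ": 156, "QSSTRESN": 2, "VISITNUM": 1},
    {"USUBJID": "0003", "QSSEQ": 300, "QSSTRESN": 6, "VISITNUM": 2},
]


def build_adam_rows(qs_rows):
    """Copy observed rows and add LOCF rows for missing scheduled visits."""
    by_subject = {}
    for row in qs_rows:
        by_subject.setdefault(row["USUBJID"], {})[row["VISITNUM"]] = row

    adam = []
    for usubjid in sorted(by_subject):
        observed = by_subject[usubjid]
        last_post_baseline = None
        for visit in SCHEDULED_VISITS:
            source, dtype = observed.get(visit), ""
            if source is None:
                if last_post_baseline is None:
                    continue  # nothing to carry forward, so no row is created
                source, dtype = last_post_baseline, "LOCF"
            elif visit > BASELINE_VISIT:
                last_post_baseline = source
            adam.append({
                "PARAMCD": "TOTSCORE",
                "USUBJID": usubjid,
                "VISITNUM": source["VISITNUM"],  # visit the value was collected at
                "AVISITN": visit,                # visit the value is analyzed at
                "AVISIT": f"Visit {visit}",
                "AVAL": source["QSSTRESN"],
                "DTYPE": dtype,
                "QSSEQ": source["QSSEQ"],        # pointer back to the SDTM record
            })
    return adam


adam_rows = build_adam_rows(qs)
analysis_rows = [r for r in adam_rows if r["AVISITN"] == 3]
```

Subject 0002 keeps its single Visit 1 row but contributes nothing to `analysis_rows`, because it has no post-baseline score to carry forward.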
It should be noted that this approach does not require the inclusion of all rows from the input dataset. For example, if the SDTM QS dataset contains data for several different questionnaires, then data from questionnaires other than the one being analyzed do not have to be included in the ADaM dataset.
应该注意的是,这种方法不需要包含输入数据集中的所有行。例如,如果SDTM QS数据集包含几个不同调查表的数据,则ADaM数据集中不必包含来自被调查表之外的其他调查表的数据。
Example 2
In the following example (Table 4.3.3 and Table 4.3.4), the Q01 assessment is scheduled to be performed at visits 1, 3, 5, and 7, and results are to be summarized at those visits. Subject 1099 has data for the assessment at visits 1, 2, and 7. (Note that though the assessment was not scheduled to be performed at Visit 2, the data show the assessment was performed at that time for that subject.) Subject 2001 is not in the Full Analysis Set. Subject 3023 has two assessments at Visit 5, and the study's analysis plan specifies that only the first occurrence within a visit will be analyzed; however, as this subject does not have a Visit 7 row in the data, the later of the Visit 5 rows is carried forward into Visit 7. The SDTM dataset that is the basis for the ADaM dataset has the following rows:
例子2
在以下示例中(表4.3.3和表4.3.4),Q01评估计划在访视1、3、5和7进行,并在这些访视汇总结果。受试者1099在访视1、2和7有评估数据。(请注意,尽管未计划在访视2进行评估,但数据显示当时对该受试者进行了评估。)受试者2001不在完整分析集(Full Analysis Set)中。受试者3023在访视5有两次评估,研究的分析计划规定仅分析同一访视中的第一次评估;但是,由于该受试者的数据中没有访视7的行,因此访视5中较晚的一行被结转到访视7。作为ADaM数据集基础的SDTM数据集具有以下行:
Table 4.3.3 Example 2: Data as Found in SDTM QS Dataset
表4.3.3示例2:在SDTM QS数据集中找到的数据
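| Row | USUBJID | QSSEQ | QSTESTCD | QSSTRESN | VISITNUM | VISIT | QSDTC |
|---|---|---|---|---|---|---|---|
| 1 | 1099 | 111 | Q01 | 25 | 1 | BASELINE | 2005-04-04 |
| 2 | 1099 | 121 | Q01 | 24 | 2 | VISIT 2 | 2005-05-02 |
| 3 | 1099 | 132 | Q01 | 15 | 7 | VISIT 7 | 2005-08-22 |
| 4 | 2001 | 150 | Q01 | 27 | 1 | BASELINE | 2005-02-05 |
| 5 | 3023 | 117 | Q01 | 31 | 1 | BASELINE | 2005-06-30 |
| 6 | 3023 | 123 | Q01 | 29 | 3 | VISIT 3 | 2005-07-25 |
| 7 | 3023 | 134 | Q01 | 28 | 5 | VISIT 5 | 2005-08-20 |
| 8 | 3023 | 135 | Q01 | 25 | 5 | VISIT 5 | 2005-08-21 |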
The ADaM dataset contains rows corresponding to those found in SDTM as well as rows created by LOCF for the missing visit assessments, together with the flags and other columns needed to identify the rows to be included in a given analysis:
ADaM数据集包含与SDTM中找到的行相对应的行,以及LOCF为遗失访问评估创建的行,以及标识要包含在给定分析中的行所需的标志和其他列:
Table 4.3.4 Example 2: ADaM Dataset
表4.3.4示例2:ADaM数据集
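| Row | PARAMCD | USUBJID | VISITNUM | VISIT | AVISITN | AVISIT | AVAL | DTYPE | ANL01FL | FASFL | QSSEQ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Q01 | 1099 | 1 | BASELINE | 1 | BASELINE | 25 | | Y | Y | 111 |
| 2 | Q01 | 1099 | 2 | VISIT 2 | | | 24 | | | Y | 121 |
| 3 | Q01 | 1099 | 2 | VISIT 2 | 3 | VISIT 3 | 24 | LOCF | Y | Y | 121 |
| 4 | Q01 | 1099 | 2 | VISIT 2 | 5 | VISIT 5 | 24 | LOCF | Y | Y | 121 |
| 5 | Q01 | 1099 | 7 | VISIT 7 | 7 | VISIT 7 | 15 | | Y | Y | 132 |
| 6 | Q01 | 2001 | 1 | BASELINE | 1 | BASELINE | 27 | | Y | N | 150 |
| 7 | Q01 | 3023 | 1 | BASELINE | 1 | BASELINE | 31 | | Y | Y | 117 |
| 8 | Q01 | 3023 | 3 | VISIT 3 | 3 | VISIT 3 | 29 | | Y | Y | 123 |
| 9 | Q01 | 3023 | 5 | VISIT 5 | 5 | VISIT 5 | 28 | | Y | Y | 134 |
| 10 | Q01 | 3023 | 5 | VISIT 5 | 5 | VISIT 5 | 25 | | | Y | 135 |
| 11 | Q01 | 3023 | 5 | VISIT 5 | 7 | VISIT 7 | 25 | LOCF | Y | Y | 135 |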
Selection criteria applicable to this example include:
适用于此示例的选择标准包括:
A null DTYPE identifies rows containing data as found in the SDTM dataset.
DTYPE为空表示该行数据与SDTM数据集中的数据一致。
DTYPE="LOCF" specifies the method used to derive the added rows, and indicates that those rows were derived.
DTYPE="LOCF"指定了派生新增行所用的方法,并表明这些行是派生得到的。
The subject-level flag FASFL="Y" identifies the subjects who are members of the Full Analysis Set.
受试者级别标志FASFL="Y"标识属于完整分析集的受试者。
ANL01FL="Y" identifies the rows chosen to represent each AVISIT. There were multiple observations for subject 3023 at AVISITN=5; in this example, the rows with ANL01FL="Y" are the ones chosen to represent their respective analysis timepoints.
ANL01FL="Y"标识被选中代表每个AVISIT的行。受试者3023在AVISITN=5处有多个观察值;在此示例中,ANL01FL="Y"的行即被选中代表各自分析时间点的行。
ANL01FL is null for subject 1099 for VISIT="VISIT 2" (row 2) because Visit 2 is an unscheduled visit for this questionnaire and will not be presented in the analyses; AVISITN and AVISIT are also null because the visit does not map to any visit used for the analyses described in the study's analysis plan.
受试者1099在VISIT="VISIT 2"(第2行)的ANL01FL为空,因为访视2对此问卷而言是计划外访视,不会出现在分析中;AVISITN和AVISIT也为空,因为该访视未映射到研究分析计划所述分析中使用的任何访视。
The combination "ANL01FL="Y" and FASFL="Y" and AVISITN=5" identifies the rows used in a FAS analysis of Visit 5 data.
组合条件"ANL01FL="Y" and FASFL="Y" and AVISITN=5"标识了访视5数据的FAS分析中使用的行。
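The selection criteria above can be expressed directly as a filter. The sketch below is illustrative only: it loads a subset of the columns of Table 4.3.4 into plain Python dictionaries and applies the combined clause for the FAS analysis of Visit 5. The names `adqs` and `select_fas_visit` are hypothetical, and null values are represented as `None` or an empty string.

```python
COLUMNS = ["USUBJID", "VISITNUM", "AVISITN", "AVAL", "DTYPE", "ANL01FL", "FASFL", "QSSEQ"]

# Rows 1-11 of Table 4.3.4 (PARAMCD, VISIT, and AVISIT omitted for brevity).
ROWS = [
    ("1099", 1, 1, 25, "", "Y", "Y", 111),
    ("1099", 2, None, 24, "", "", "Y", 121),      # unscheduled visit, not analyzed
    ("1099", 2, 3, 24, "LOCF", "Y", "Y", 121),
    ("1099", 2, 5, 24, "LOCF", "Y", "Y", 121),
    ("1099", 7, 7, 15, "", "Y", "Y", 132),
    ("2001", 1, 1, 27, "", "Y", "N", 150),        # subject not in the Full Analysis Set
    ("3023", 1, 1, 31, "", "Y", "Y", 117),
    ("3023", 3, 3, 29, "", "Y", "Y", 123),
    ("3023", 5, 5, 28, "", "Y", "Y", 134),
    ("3023", 5, 5, 25, "", "", "Y", 135),         # second Visit 5 assessment, not chosen
    ("3023", 5, 7, 25, "LOCF", "Y", "Y", 135),
]
adqs = [dict(zip(COLUMNS, row)) for row in ROWS]


def select_fas_visit(records, avisitn):
    """Rows used in a FAS analysis of one analysis visit."""
    return [
        r for r in records
        if r["ANL01FL"] == "Y" and r["FASFL"] == "Y" and r["AVISITN"] == avisitn
    ]


visit5_rows = select_fas_visit(adqs, 5)
observed_rows = [r for r in adqs if r["DTYPE"] == ""]  # data as found in SDTM
```

Here `visit5_rows` holds the LOCF row for subject 1099 (QSSEQ 121) and the first Visit 5 assessment for subject 3023 (QSSEQ 134), while `observed_rows` recovers the eight records of Table 4.3.3.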
The other approach considered was to include in the ADaM dataset only the rows that are actually used in the analysis of the analysis parameter. In Example 1 above, only Visit 3 rows that were either observed or derived by LOCF would be included in the ADaM dataset. The main advantage of this approach would be to simplify the analysis, as no selection clause would need to be used to identify the appropriate rows for inclusion in the analysis. However, the primary disadvantages would be the loss of traceability and the loss of flexibility for testing the robustness of the analysis. Because of these disadvantages, this approach was not chosen.
考虑的另一种方法是在ADaM数据集中仅包含在分析参数分析中实际使用的行。在上面的示例1中,在ADaM数据集中仅包含LOCF观察到或派生的“访问3”行。这种方法的主要优点是简化了分析,因为不需要使用选择子句来标识要包含在分析中的适当行。但是,主要缺点是可追溯性的丧失和测试分析的稳健性的灵活性的丧失。由于这些缺点,因此未选择此方法。
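4.4 Inclusion of Input Data that Are Not Analyzed but that Support a Derivation in the ADaM Dataset
4.4 包含未分析但支持ADaM数据集派生的输入数据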
Section 4.3, Inclusion of All Observed and Derived Records for a Parameter Versus the Subset of Records Used for Analysis, states that for a given analysis parameter, all observed and derived rows of that parameter should be included in the dataset, not just the rows that are used in the analysis. Section 4.3 is a simple case of a more general topic addressed in this section.
第4.3节“包含参数的所有观察和派生记录与用于分析的记录子集”指出,对于给定的分析参数,该参数的所有观察到的和派生的行都应包括在数据集中,而不仅是那些用于分析。第4.3节是本节中讨论的更一般主题的简单案例。
This section addresses the broader issue of whether an ADaM dataset should contain the input data used in the derivation of the analysis data as well as the actual data being analyzed. This includes:
本节解决了一个更广泛的问题,即ADaM数据集是否应包含在导出分析数据以及所分析的实际数据时所使用的输入数据。这包括:
Input data rows and columns to support traceability of the derivation of analyzed rows and columns
输入数据行和列,以支持对所分析的行和列的派生过程进行追溯
Raw or derived predecessor parameters that are not analyzed themselves but are used to derive an analyzed parameter
原始或派生的前代参数,这些参数本身不被分析,但用于派生被分析的参数
These input data rows and columns could come from a single dataset or multiple datasets as necessary to derive the analysis data captured in AVAL or AVALC, as described by the analysis parameter.
如需要,这些输入数据行和列可以来自单个数据集或多个数据集,以导出在AVAL或AVALC中捕获的分析数据,如分析参数所述。
ADaM datasets are developed to facilitate intended analyses. In the ADaM Model document, it is assumed that the original data sources for ADaM datasets are SDTM datasets, even when ADaM datasets are derived from other ADaM datasets. ADaM has features that enable traceability from analysis results to ADaM datasets and from ADaM datasets to SDTM datasets.
开发ADaM数据集有助于进行预期的分析。在ADaM模型文档中,即使ADaM数据集是从其他ADaM数据集派生的,也假定ADaM数据集的原始数据源是SDTM数据集。ADaM具有使从分析结果到ADaM数据集以及从ADaM数据集到SDTM数据集的可追溯性的功能。
The ADaM methodology to achieve the expected traceability is to describe the derivation algorithms in the metadata and, if practical and feasible, to include supportive rows as appropriate for traceability. To include the input data as rows in the ADaM dataset, columns should be added where feasible to indicate the source of the input data. While this methodology increases both the size of the dataset and the complexity of selecting the appropriate rows for analysis, it also provides input data in an immediately accessible manner. In addition, intermediate values can be retained if appropriate flags are used to distinguish them.
实现预期可追溯性的ADaM方法是描述元数据中的派生算法,并且在可行和可行的情况下,包括适用于可追溯性的支持行。为了将输入数据作为行包含在ADaM数据集中,应在可行的地方添加列以指示输入数据的来源。虽然此方法既增加了数据集的大小,又增加了选择适当的行进行分析的复杂性,但它也以立即可访问的方式提供了输入数据。另外,如果使用适当的标志来区分中间值,则可以保留中间值。
In general, it is strongly recommended to include as much supporting data as is needed for traceability. However, there are situations in which this may not be practical. For example, if an analyzed parameter is a summary derived from a very large number of raw e-diary input records, it may be neither useful nor practical to include all of the raw e-diary records as rows in the ADaM dataset.
通常,强烈建议包含尽可能多的支持数据,以实现可追溯性。但是,在某些情况下这可能不切实际。例如,如果分析的参数是从大量原始电子日记输入记录中得出的摘要,则将所有原始电子日记记录作为行包含在ADaM数据集中可能既无用,也不实用。
The remainder of this section addresses cases where the ADaM datasets contain not only the analysis data but also input data that are necessary to provide clearer traceability of the algorithms used to derive the analysis data. In addition to the actual values used in the analysis, the dataset may include rows not used in the analysis, rows containing input data, and rows containing intermediate values computed during the derivation of the analysis data. Flags or other columns are used to distinguish the various data types as well as to provide a traceable path from the input data to the value used in the analysis. The analysis results metadata specify how the appropriate rows are identified (by a specific selection clause). The identification of rows used in an analysis is addressed in Sections 4.5, Identification of Records Used for Analysis, and 4.6, Identification of Population-Specific Analyzed Records.
本节的其余部分介绍了以下情况:ADaM数据集不仅包含分析数据,而且还包含输入数据,这些输入数据对于提供更清晰的用于导出分析数据的算法具有可追溯性。除了分析中使用的实际值外,数据集还可以包括分析中未使用的行,包含输入数据的行以及包含在分析数据推导期间计算出的中间值的行。标志或其他列用于区分各种数据类型,并提供从输入数据到分析所用值的可追溯路径。分析结果元数据指定如何(通过特定的选择子句)识别适当的行。在第4.5节中介绍了分析中使用的行的标识,识别用于分析的记录,以及4.6识别特定于人口的分析记录。
Unless the input data are already present as column(s) on the row (e.g., as covariate(s) or supportive variable(s)), the input data will be retained as rows in the ADaM dataset. The analysis value column (AVAL and/or AVALC) on the retained input data row will contain a value for the analysis parameter. Not all columns from the input dataset are carried into the ADaM dataset; instead, additional variables will be included indicating the source of the input data – domain, variable name, and sequence number. This approach allows the inclusion of input data from multiple domains. If the input data are already included in columns on the analysis parameter row (e.g., as covariates or supportive information), there is no need to include additional rows for those input data. The decision regarding keeping the input data as rows or columns will therefore be dictated by the types of input data and whether they are used for other purposes in the ADaM dataset.
除非输入数据已经作为行的列(例如,协变量或支持变量)存在,否则输入数据将作为行保留在ADaM数据集中。保留的输入数据行上的分析值列(AVAL和/或AVALC)将包含分析参数的值。并非来自输入数据集的所有列都被带入ADaM数据集。相反,将包括其他变量来指示输入数据的来源:域,变量名和序列号。这种方法允许包含来自多个域的输入数据。如果输入数据已经包含在分析参数行的列中(例如,作为协变量或支持信息),则无需为这些输入数据包括其他行。因此,将输入数据保留为行还是列,取决于输入数据的类型,以及它们在ADaM数据集中是否还用于其他目的。
Retaining in one dataset all data used in the determination of the analysis parameter value will provide the clearest traceability in the most flexible manner within the standard ADaM BDS. This large dataset also provides the most flexibility for testing the robustness of an analysis.
将用于确定分析参数值的所有数据保留在一个数据集中,将以最灵活的方式在标准ADaM BDS中提供最清晰的可追溯性。这个庞大的数据集还为测试分析的鲁棒性提供了最大的灵活性。
If it is determined that this large dataset is too cumbersome, the producer can choose to provide two datasets, one that contains all rows and another that is a subset of the first, containing only the rows used in the specified analysis. To ensure traceability, the metadata for the subset ADaM dataset will refer back to the full ADaM dataset as the immediate predecessor. This approach provides the needed traceability along with a dataset that can be used in an analysis without specifying a selection clause. The producer will need to ensure consistency is maintained between the two datasets. There is also potential confusion about which dataset supported an analysis, if analysis results metadata is not provided for that analysis.
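如果认为这个大型数据集过于繁琐,生产者可以选择提供两个数据集:一个包含所有行,另一个是前者的子集,仅包含指定分析中使用的行。为确保可追溯性,子集ADaM数据集的元数据将引用完整的ADaM数据集作为其直接前身。这种方法既提供了所需的可追溯性,又提供了一个无需指定选择子句即可用于分析的数据集。生产者需要确保两个数据集之间保持一致。如果未提供某项分析的分析结果元数据,还可能对哪个数据集支持该分析产生混淆。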
Example 1
例子1
An ADaM dataset was created to support time-to-event analysis of a hypertension event. The analysis parameter was the study day of a hypertension event, defined to be the earliest study day among those of the following events: hospital admission, diastolic blood pressure exceeded 90, and systolic blood pressure exceeded 140. If a subject did not experience any of these events, the subject would be analyzed as censored on the day he/she exited the study.
创建了一个ADaM数据集,以支持对高血压事件进行至事件发生时间(time-to-event)分析。分析参数是高血压事件的研究日,定义为以下事件中最早的研究日:入院、舒张压超过90、收缩压超过140。如果受试者未经历任何这些事件,则该受试者将在其退出研究之日作为删失进行分析。
Table 4.4.1 Example 1: Data as Found in SDTM VS Dataset
表4.4.1示例1:在SDTM VS数据集中找到的数据
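| Row | USUBJID | VSSEQ | VSTESTCD | VSSTRESN | VISITNUM | VSDTC | VSDY |
|---|---|---|---|---|---|---|---|
| 1 | 2010 | 22 | SYSBP | 115 | 1 | 2004-08-05 | 1 |
| 2 | 2010 | 23 | DIABP | 75 | 1 | 2004-08-05 | 1 |
| 3 | 2010 | 101 | SYSBP | 120 | 2 | 2004-08-12 | 8 |
| 4 | 2010 | 102 | DIABP | 90 | 2 | 2004-08-12 | 8 |
| 5 | 2010 | 207 | SYSBP | 135 | 3 | 2004-08-19 | 15 |
| 6 | 2010 | 208 | DIABP | 92 | 3 | 2004-08-19 | 15 |
| 7 | 2010 | 238 | SYSBP | 138 | 4 | 2004-08-25 | 21 |
| 8 | 2010 | 239 | DIABP | 95 | 4 | 2004-08-25 | 21 |
| 9 | 3082 | 27 | SYSBP | 120 | 1 | 2004-09-08 | 1 |
| 10 | 3082 | 28 | DIABP | 80 | 1 | 2004-09-08 | 1 |
| 11 | 3082 | 119 | SYSBP | 125 | 2 | 2004-09-15 | 8 |
| 12 | 3082 | 120 | DIABP | 84 | 2 | 2004-09-15 | 8 |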
Table 4.4.2 Example 1: Data as Found in SDTM DS Dataset
表4.4.2示例1:在SDTM DS数据集中找到的数据
| Row | USUBJID | DSSEQ | DSTERM | DSDECOD | DSSTDTC | DSSTDY |
|---|---|---|---|---|---|---|
| 1 | 2010 | 25 | Subject Randomized | RANDOMIZED | 2004-08-05 | 1 |
| 2 | 2010 | 301 | Subject Completed | COMPLETED | 2004-08-26 | 22 |
| 3 | 3082 | 20 | Subject Randomized | RANDOMIZED | 2004-09-08 | 1 |
| 4 | 3082 | 130 | Subject Completed | COMPLETED | 2004-09-17 | 10 |
Table 4.4.3 Example 1: Data as Found in SDTM HO Dataset
表4.4.3示例1:在SDTM HO数据集中找到的数据
| Row | USUBJID | HOSEQ | HOTERM | HODECOD | HOSTDTC | HOENDTC | HOSTDY | HOENDY |
|---|---|---|---|---|---|---|---|---|
| 1 | 2010 | 99 | HOSPITAL | HOSPITAL | 2004-08-13 | 2004-08-15 | 9 | 11 |
| 2 | 2010 | 199 | HOSPITAL | HOSPITAL | 2004-08-20 | 2004-08-22 | 16 | 18 |
The ADaM methodology is illustrated in Table 4.4.4. Using this methodology, one would include all of the sub-events used to derive the analysis parameter "HYPEREVT" as analysis parameters (i.e., rows), and create the input domain, input variable, and input sequence columns (SRC* columns) to identify where the input rows came from. AVAL for PARAMCD="HOSPADM" is the earliest relative day of hospitalization. AVAL for PARAMCD="DBP" is the earliest relative day that diastolic blood pressure exceeded 90. AVAL for PARAMCD="SBP" is the earliest relative day that systolic blood pressure exceeded 140. If a subject did not experience a particular sub-event, a row is still created for that sub-event indicating the subject was censored (CNSR=1) on the day the subject exited the study, and the SRC* columns reference the DS dataset. AVAL for PARAMCD="HYPEREVT" is derived as the earliest event of the three: HOSPADM, DBP, and SBP (the minimum AVAL of those three that have CNSR=0 will be the earliest relative day of the three types of events); a subject who meets one of these three conditions has CNSR=0 for PARAMCD="HYPEREVT" to indicate the subject had an event. If a subject does not meet any of the three conditions (i.e., all three records have CNSR=1), then the subject is censored; that is, AVAL for PARAMCD="HYPEREVT" is derived as the relative day that the subject exited the study and CNSR=1 is used to indicate the subject is censored. The analysis will focus on HYPEREVT, but HOSPADM, DBP, and SBP are included to support traceability, and also to enable future analysis of the sub-events should it be desired. In this example, the SRC* variables were populated for the derived event (PARAMCD="HYPEREVT"), as described in Section 3.3.9, Datapoint Traceability Variables.
表4.4.4说明了ADaM方法。使用这种方法,可以将用于导出分析参数"HYPEREVT"的所有子事件作为分析参数(即行)包含进来,并创建输入域、输入变量和输入序列列(SRC*列)以标识输入行的来源。PARAMCD="HOSPADM"的AVAL是最早的住院相对日。PARAMCD="DBP"的AVAL是舒张压超过90的最早相对日。PARAMCD="SBP"的AVAL是收缩压超过140的最早相对日。如果受试者未经历某个子事件,仍会为该子事件创建一行,表明该受试者在退出研究之日被删失(CNSR=1),并且SRC*列引用DS数据集。PARAMCD="HYPEREVT"的AVAL取HOSPADM、DBP和SBP三者中最早的事件(三者中CNSR=0的记录的最小AVAL即为三类事件中最早的相对日);满足这三个条件之一的受试者在PARAMCD="HYPEREVT"上的CNSR=0,表明该受试者发生了事件。如果受试者不满足这三个条件中的任何一个(即三条记录的CNSR均为1),则该受试者被删失;也就是说,PARAMCD="HYPEREVT"的AVAL派生为受试者退出研究的相对日,并用CNSR=1表明该受试者被删失。分析将集中在HYPEREVT上,但也包含HOSPADM、DBP和SBP以支持可追溯性,并在需要时支持将来对子事件的分析。在此示例中,已为派生事件(PARAMCD="HYPEREVT")填充了SRC*变量,如第3.3.9节"数据点可追溯性变量"中所述。
The main advantage of this structure is that it can handle sub-event input rows from many domains in only 3 standard supportive columns (i.e., SRCDOM, SRCVAR, and SRCSEQ). This approach is preferred because it is standardized, scalable, and supports analysis of sub-events.
这种结构的主要优点是,它仅用3个标准支持列(即SRCDOM、SRCVAR和SRCSEQ)即可处理来自多个域的子事件输入行。首选此方法,因为它是标准化的、可扩展的,并且支持子事件的分析。
Table 4.4.4 Example 1: ADaM Dataset
表4.4.4示例1:ADaM数据集
| Row | USUBJID | PARAM | PARAMCD | AVAL | CNSR | EVNTDESC | SRCDOM | SRCVAR | SRCSEQ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2010 | Time to First Hospital Admission (day) | HOSPADM | 9 | 0 | FIRST HOSPITAL ADMISSION | HO | HOSTDY | 99 |
| 2 | 2010 | Time to First DBP>90 (day) | DBP | 15 | 0 | FIRST DBP>90 | VS | VSDY | 208 |
| 3 | 2010 | Time to First SBP>140 (day) | SBP | 22 | 1 | COMPLETED THE STUDY | DS | DSSTDY | 301 |
| 4 | 2010 | Time to Hypertension Event (day) | HYPEREVT | 9 | 0 | HYPERTEN. EVENT | HO | HOSTDY | 99 |
| 5 | 3082 | Time to First Hospital Admission (day) | HOSPADM | 10 | 1 | COMPLETED THE STUDY | DS | DSSTDY | 130 |
| 6 | 3082 | Time to First DBP>90 (day) | DBP | 10 | 1 | COMPLETED THE STUDY | DS | DSSTDY | 130 |
| 7 | 3082 | Time to First SBP>140 (day) | SBP | 10 | 1 | COMPLETED THE STUDY | DS | DSSTDY | 130 |
| 8 | 3082 | Time to Hypertension Event (day) | HYPEREVT | 10 | 1 | COMPLETED THE STUDY | DS | DSSTDY | 130 |
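The derivation of the composite HYPEREVT row from the three sub-event rows in Table 4.4.4 can be sketched as follows. This is a hedged illustration rather than the guide's own algorithm: the function name `derive_hyperevt` is hypothetical, and for a fully censored subject the code assumes all three sub-event rows carry the same study-exit information, so it copies the first one.

```python
def derive_hyperevt(sub_events):
    """Build the HYPEREVT row for one subject from its HOSPADM, DBP, and SBP rows."""
    events = [row for row in sub_events if row["CNSR"] == 0]
    if events:
        # Earliest sub-event that actually occurred.
        source = min(events, key=lambda row: row["AVAL"])
        cnsr, evntdesc = 0, "HYPERTEN. EVENT"
    else:
        # Assumption: every censored sub-event row points at the same DS record.
        source = sub_events[0]
        cnsr, evntdesc = 1, source["EVNTDESC"]
    return {
        "USUBJID": source["USUBJID"],
        "PARAMCD": "HYPEREVT",
        "AVAL": source["AVAL"],
        "CNSR": cnsr,
        "EVNTDESC": evntdesc,
        # SRC* columns point at the record the composite value came from.
        "SRCDOM": source["SRCDOM"],
        "SRCVAR": source["SRCVAR"],
        "SRCSEQ": source["SRCSEQ"],
    }


def sub_event(usubjid, paramcd, aval, cnsr, evntdesc, srcdom, srcvar, srcseq):
    """Small helper to keep the example data readable."""
    return {"USUBJID": usubjid, "PARAMCD": paramcd, "AVAL": aval, "CNSR": cnsr,
            "EVNTDESC": evntdesc, "SRCDOM": srcdom, "SRCVAR": srcvar, "SRCSEQ": srcseq}


# Sub-event rows 1-3 and 5-7 of Table 4.4.4.
subject_2010 = [
    sub_event("2010", "HOSPADM", 9, 0, "FIRST HOSPITAL ADMISSION", "HO", "HOSTDY", 99),
    sub_event("2010", "DBP", 15, 0, "FIRST DBP>90", "VS", "VSDY", 208),
    sub_event("2010", "SBP", 22, 1, "COMPLETED THE STUDY", "DS", "DSSTDY", 301),
]
subject_3082 = [
    sub_event("3082", code, 10, 1, "COMPLETED THE STUDY", "DS", "DSSTDY", 130)
    for code in ("HOSPADM", "DBP", "SBP")
]

hyperevt_2010 = derive_hyperevt(subject_2010)
hyperevt_3082 = derive_hyperevt(subject_3082)
```

With these inputs the two results correspond to rows 4 and 8 of Table 4.4.4: subject 2010 has an event on day 9 traced to HO record 99, and subject 3082 is censored on day 10 with the SRC* columns pointing at DS record 130.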
Example 2
例子2
In this example, the analysis parameter is glomerular filtration rate (GFR). The analysis value for this parameter is derived from plasma creatinine, BUN, and albumin values from the LB dataset, as well as age, race, and sex.
在此示例中,分析参数是肾小球滤过率(GFR)。该参数的分析值来自LB数据集的血浆肌酐,BUN和白蛋白值,以及年龄,种族和性别。
Table 4.4.5 Example 2: Data as Found in SDTM LB Dataset
表4.4.5示例2:在SDTM LB数据集中找到的数据
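| Row | USUBJID | VISITNUM | LBSEQ | LBTEST | LBTESTCD | LBSTRESN | LBSTRESU |
|---|---|---|---|---|---|---|---|
| 1 | 3000 | 3 | 98 | Creatinine | CREAT | 78.2 | micromol/L |
| 2 | 3000 | 3 | 115 | Blood Urea Nitrogen | BUN | 9.1 | mmol/L |
| 3 | 3000 | 3 | 120 | Albumin | ALB | 40 | g/L |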
Additional rows are not created for the input data age, race, and sex, as they are functioning as covariates in the ADaM dataset. The analysis records in Table 4.4.6 are identified by PARAMCD=MDRD_GFR, the parameter code for PARAM="Glomerular Filtration Rate (GFR) (ml/min/1.73m**2)". In this example, because all of the data come from a single source dataset (in this example, the LB dataset), the LBSEQ variable is retained for traceability, though it would also be valid to instead use the ADaM SRC variables.
不会为输入数据的年龄,种族和性别创建其他行,因为它们在ADaM数据集中充当协变量。表4.4.6中的分析记录通过PARAMCD = MDRD_GFR进行标识,PARAMCD = MDRD_GFR是PARAM =“球滤率(GFR)(ml / min / 1.73m ** 2)”的参数代码。在此示例中,因为所有数据都来自单个源数据集(在此示例中为LB数据集),所以保留LBSEQ变量以实现可追溯性,尽管代替使用ADaM SRC变量也是有效的。
Table 4.4.6 Example 2: ADaM Dataset
表4.4.6示例2:ADaM数据集
| Row | USUBJID | AGE | SEX | RACE | PARAM | PARAMCD | VISITNUM | AVAL | LBSEQ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 3000 | 52 | F | ASIAN | Creatinine (micromol/L) | CREAT | 3 | 78.2 | 98 |
| 2 | 3000 | 52 | F | ASIAN | Blood Urea Nitrogen (mmol/L) | BUN | 3 | 9.1 | 115 |
| 3 | 3000 | 52 | F | ASIAN | Albumin (g/L) | ALB | 3 | 40 | 120 |
| 4 | 3000 | 52 | F | ASIAN | Glomerular Filtration Rate (GFR) (ml/min/1.73m**2) | MDRD_GFR | 3 | 76.77 | |
Example 3
例子3
An ADaM dataset is created to contain the time to pain relief (ADTTPRLF), based on data in another ADaM dataset (ADPAIN). Pain relief is defined as a reduction in pain from moderate or severe at baseline (i.e., pain severity of at least 2) to mild or no pain (i.e., pain severity of no more than 1), with no use of rescue medication from baseline to that timepoint (i.e., RESCUEFL null at that timepoint and for the subject's records prior to that timepoint). Subjects who do not achieve pain relief are censored at their last pain severity assessment. Missing data are imputed in ADPAIN using LOCF. Because the source dataset is an ADaM dataset, the SRCDOM, SRCVAR, and SRCSEQ variables are used for datapoint traceability.
基于另一个ADaM数据集(ADPAIN)中的数据,创建一个ADaM数据集以包含缓解疼痛的时间(ADTTPRLF)。缓解疼痛的定义是,从基线开始不使用急救药物,将疼痛从基线的中度或重度(即,疼痛严重度至少为2)减轻至轻度或无疼痛(即,疼痛严重度不超过1)。到该时间点(即,该时间点的RESCUEFL为空,并且该时间点之前受试者的记录为空)。未能减轻疼痛的受试者将在最后一次疼痛严重程度评估时接受检查。使用LOCF在ADPAIN中估算丢失的数据。因为源数据集是ADaM数据集,所以SRCDOM,SRCVAR和SRCSEQ变量用于数据点可追溯性。
Table 4.4.7 Example 3: Data as Found in ADPAIN (Source ADaM Dataset)
表4.4.7示例3:在ADPAIN中找到的数据(源ADaM数据集)
| Row | USUBJID | ASEQ | PARAM | PARAMCD | ATPT | ATPTN | AVAL | AVALC | BASEC | CRIT1 | CRIT1FL | DTYPE | RESCUEFL | QSSEQ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 101-001 | 1 | Pain Severity | SEVERITY | BSLN | 0 | 3 | Severe | Severe | Pain relief | N | | | 100 |
| 2 | 101-001 | 2 | Pain Severity | SEVERITY | 30 MIN | 30 | 2 | Moderate | Severe | Pain relief | N | | | 101 |
| 3 | 101-001 | 3 | Pain Severity | SEVERITY | 1 HR | 60 | 1 | Mild | Severe | Pain relief | Y | | | 102 |
| 4 | 101-001 | 4 | Pain Severity | SEVERITY | 90 MIN | 90 | 1 | Mild | Severe | Pain relief | Y | | | 103 |
| 5 | 101-001 | 5 | Pain Severity | SEVERITY | 2 HR | 120 | 0 | None | Severe | Pain relief | Y | | | 104 |
| 6 | 101-002 | 1 | Pain Severity | SEVERITY | BSLN | 0 | 3 | Severe | Severe | Pain relief | N | | | 111 |
| 7 | 101-002 | 2 | Pain Severity | SEVERITY | 30 MIN | 30 | 3 | Severe | Severe | Pain relief | N | | | 112 |
| 8 | 101-002 | 3 | Pain Severity | SEVERITY | 1 HR | 60 | 2 | Moderate | Severe | Pain relief | N | | Y | 113 |
| 9 | 101-002 | 4 | Pain Severity | SEVERITY | 90 MIN | 90 | 2 | Moderate | Severe | Pain relief | N | | Y | 114 |
| 10 | 101-002 | 5 | Pain Severity | SEVERITY | 2 HR | 120 | 1 | Mild | Severe | Pain relief | N | | Y | 115 |
| 11 | 101-003 | 1 | Pain Severity | SEVERITY | BSLN | 0 | 3 | Severe | Severe | Pain relief | N | | | 276 |
| 12 | 101-003 | 2 | Pain Severity | SEVERITY | 30 MIN | 30 | 2 | Moderate | Severe | Pain relief | N | | | 277 |
| 13 | 101-003 | 3 | Pain Severity | SEVERITY | 1 HR | 60 | 1 | Mild | Severe | Pain relief | Y | | | 278 |
| 14 | 101-003 | 4 | Pain Severity | SEVERITY | 90 MIN | 90 | 1 | Mild | Severe | Pain relief | Y | LOCF | | 278 |
| 15 | 101-003 | 5 | Pain Severity | SEVERITY | 2 HR | 120 | 1 | Mild | Severe | Pain relief | Y | LOCF | | 278 |
Table 4.4.8 Example 3: ADaM Dataset ADTTPRLF
表4.4.8示例3:ADaM数据集ADTTPRLF
| Row | USUBJID | PARAM | PARAMCD | AVAL | CNSR | SRCDOM | SRCVAR | SRCSEQ |
|---|---|---|---|---|---|---|---|---|
| 1 | 101-001 | Time to First Pain Relief (minutes) | TTPRLF | 60 | 0 | ADPAIN | ATPTN | 3 |
| 2 | 101-002 | Time to First Pain Relief (minutes) | TTPRLF | 120 | 1 | ADPAIN | ATPTN | 5 |
| 3 | 101-003 | Time to First Pain Relief (minutes) | TTPRLF | 60 | 0 | ADPAIN | ATPTN | 3 |
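To make the Example 3 derivation concrete, the sketch below derives the ADTTPRLF rows from the ADPAIN records under the stated definition of pain relief. It is an illustration under assumptions, not the guide's implementation: the names `ADPAIN` and `derive_ttprlf` are hypothetical, the baseline record is assumed to be the one with the smallest ATPTN, and censoring is assumed to use the last observed (non-LOCF) record.

```python
# (USUBJID, ASEQ, ATPTN, AVAL, DTYPE, RESCUEFL) from Table 4.4.7.
ADPAIN = [
    ("101-001", 1, 0, 3, "", ""), ("101-001", 2, 30, 2, "", ""),
    ("101-001", 3, 60, 1, "", ""), ("101-001", 4, 90, 1, "", ""),
    ("101-001", 5, 120, 0, "", ""),
    ("101-002", 1, 0, 3, "", ""), ("101-002", 2, 30, 3, "", ""),
    ("101-002", 3, 60, 2, "", "Y"), ("101-002", 4, 90, 2, "", "Y"),
    ("101-002", 5, 120, 1, "", "Y"),
    ("101-003", 1, 0, 3, "", ""), ("101-003", 2, 30, 2, "", ""),
    ("101-003", 3, 60, 1, "", ""), ("101-003", 4, 90, 1, "LOCF", ""),
    ("101-003", 5, 120, 1, "LOCF", ""),
]


def make_row(usubjid, aval, cnsr, srcseq):
    """One ADTTPRLF record; SRCSEQ holds the ASEQ of the ADPAIN source record."""
    return {"USUBJID": usubjid, "PARAMCD": "TTPRLF", "AVAL": aval, "CNSR": cnsr,
            "SRCDOM": "ADPAIN", "SRCVAR": "ATPTN", "SRCSEQ": srcseq}


def derive_ttprlf(rows):
    """Time to first pain relief for one subject's ADPAIN records."""
    rows = sorted(rows, key=lambda r: r[2])   # order by ATPTN
    baseline_aval = rows[0][3]                # assumption: first record is baseline
    rescued = False
    for usubjid, aseq, atptn, aval, dtype, rescuefl in rows:
        rescued = rescued or rescuefl == "Y"  # rescue use at or before this timepoint
        if atptn > 0 and baseline_aval >= 2 and aval <= 1 and not rescued:
            return make_row(usubjid, atptn, 0, aseq)
    # No relief: censor at the last observed (non-imputed) assessment.
    last = [r for r in rows if r[4] == ""][-1]
    return make_row(last[0], last[2], 1, last[1])


by_subject = {}
for record in ADPAIN:
    by_subject.setdefault(record[0], []).append(record)
adttprlf = [derive_ttprlf(rows) for rows in by_subject.values()]
```

Subject 101-002 reaches mild pain at 2 hours but had already used rescue medication, so the subject is censored at 120 minutes, which matches row 2 of Table 4.4.8.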
A second approach that was considered was to describe the derivation algorithms in metadata and include the input data as columns in the ADaM dataset. Pointer columns would be added to indicate the source of the input data (e.g., variable name, sequence number). This option would allow all pertinent input data to be retained on the relevant analyzed row (i.e., all sub-events would be shown on the same row as a compound event), which might help simplify verification of the calculation of the analysis parameter. However, this approach would clearly increase the number of columns in the ADaM dataset and would require naming the variables in a clear and concise manner. The approach also assumes that the only data to be retained are the original input values. Another drawback of this approach is that if there were a need in the future to analyze the sub-events, sub-event parameters would have to be added to have an ADaM-conformant structure supporting the analysis of sub-events. For these reasons, this approach was not chosen.
考虑的第二种方法是在元数据中描述派生算法,并将输入数据作为列包含在ADaM数据集中。将添加指针列以指示输入数据的来源(例如,变量名,序列号)。此选项将允许所有相关的输入数据保留在相关的分析行上(即,所有子事件将与复合事件显示在同一行上),这可能有助于简化对分析参数计算的验证。但是,此方法将明显增加ADaM数据集中的列数,并且需要以简洁明了的方式命名变量。该方法还假定要保留的唯一数据是原始输入值。这种方法的另一个缺点是,如果将来需要分析子事件,则必须添加子事件参数以具有符合ADaM的结构,以支持子事件分析。由于这些原因,未选择此方法。
A third approach that was considered was to describe the derivation algorithms in metadata and include no input data or identification of the input data in the ADaM dataset. The advantage of this approach would be simplification of the ADaM dataset. However, due to this simplified structure, there would be a loss of traceability between the data collected in the study (i.e., SDTM dataset) and the data analyzed (i.e., ADaM dataset). Unless the derivation algorithms described in the metadata were straightforward, verification of the analysis data computation could be very challenging or even impossible. This approach should not be used.
考虑的第三种方法是在元数据中描述派生算法,并且不包含输入数据或ADaM数据集中的输入数据标识。这种方法的优点是可以简化ADaM数据集。但是,由于这种简化的结构,在研究中收集的数据(即SDTM数据集)与分析的数据(即ADaM数据集)之间将失去可追溯性。除非元数据中描述的推导算法简单明了,否则分析数据计算的验证可能非常具有挑战性,甚至是不可能的。不应使用此方法。
第四章(上)完