
PTSD
a. Brief introduction of the topic and explain why it is important to mental health nursing.
b. Summarize the article; include key points of the article.
c. Discuss how you could use the information for your practice in caring for patients; give specific examples.
d. Conclusion slide to synthesize overall findings related to mental health nursing.
PowerPoint presentation format

RESEARCH ARTICLE Open Access


Utilization of machine learning to test the
impact of cognitive processing and
emotion recognition on the development
of PTSD following trauma exposure
Mareike Augsburger1* and Isaac R. Galatzer-Levy2,3

Abstract

Background: Though lifetime exposure to traumatic events is significant, only a minority of individuals develops
symptoms of posttraumatic stress disorder (PTSD). Post-trauma alterations in neurocognitive and affective
functioning are likely to reflect changes in underlying brain networks that are predictive of PTSD. These constructs
are assumed to interact in a highly complex way. The aim of this exploratory study was to apply machine learning
models to investigate the contribution of these interactions on PTSD symptom development and identify measures
indicative of circuit related dysfunction.

Methods: N = 94 participants admitted to the emergency room of an inner-city hospital after trauma exposure
completed a battery of neurocognitive and emotional tests 1 month after the incident. Different machine learning
algorithms were applied to predict PTSD symptom severity and symptom clusters after 3 months, based on the 1-month assessments.

Results: Overall, model accuracy did not differ between PTSD clusters, though the importance of cognitive and
emotional domains demonstrated both key differences and overlap. Alterations in higher-order executive functioning,
speed of information processing, and processing of emotionally incongruent cues were the most important predictors.

Conclusions: Data-driven approaches are a powerful tool to investigate complex interactions and can enhance the
mechanistic understanding of PTSD. The study identifies important relationships between cognitive processing and
emotion recognition that may be valuable to predict and understand mechanisms of risk and resilience responses to
trauma prospectively.

Keywords: PTSD, Machine learning, Neuro-cognitive functioning, Emotion recognition, Symptom development

Background
The majority of individuals will experience a life-
threatening or potentially traumatic event across their
life course that puts them at risk for post-traumatic
psychopathology [1]. According to the Diagnostic and
Statistical Manual of Mental Disorders (DSM-5),

posttraumatic stress disorder (PTSD) is characterized by
four symptom clusters: Constant re-experiencing (cluster
B), avoidance of stimuli associated with the traumatic
event (cluster C), increased physiological arousal (cluster
E), along with negative alterations in mood and cogni-
tion (cluster D), thus resulting in significant impairment in daily life [2]. However, the proportion of those suffering from chronic PTSD symptoms is relatively small compared with the high incidence of trauma exposure (cf. [3]). Yet, the early identification of individuals at

© The Author(s) 2020. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License,
which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if
changes were made. The images or other third party material in this article are included in the article’s Creative Commons
licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons
licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain
permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the
data made available in this article, unless otherwise stated in a credit line to the data.

* Correspondence: [emailprotected]
1Department of Psychology, University of Zurich, Binzmuehlestrasse 14, 8050
Zurich, Switzerland
Full list of author information is available at the end of the article

Augsburger and Galatzer-Levy BMC Psychiatry (2020) 20:325
https://doi.org/10.1186/s12888-020-02728-4


risk for later pathologic development has remained chal-
lenging [4, 5].
The relationship between cognitive and emotional

information-processing of stimuli in the brain presents a
hallmark characteristic of PTSD and is reflected in cognitive and affective dysregulations [6–8]. In their review,
Aupperle et al. point to the importance of executive dysfunction: disturbances of attention and working memory,
sustained attention, inhibition, and flexibility in switching attention. In contrast, dimensions of planning and
strategy seem to be less affected [9]. Consistently, information processing speed, verbal learning,
verbal memory, attention, and working memory have
demonstrated the strongest effects in differentiating indi-
viduals with PTSD from their healthy counterparts in a
meta-analysis [10].
Emotional information processing has also been shown

to be reduced in PTSD. More specifically, patients with
PTSD consistently demonstrate deficits in recognition of
correct emotions in facial stimuli compared to healthy
controls [11]. Researchers argue that an increased re-
activity to body sensations after trauma exposure along
with increased aversion may lead to avoidance [12].
Taken together, understanding the impact and relation
between dysregulations in cognitive and emotion infor-
mation processing following trauma can inform progno-
sis, diagnosis, and treatment selection as it relates to
PTSD. However, these dysregulations cannot be consid-
ered as distinct processes but might reflect overlapping
constructs of underlying neural mechanisms [13, 14].
For instance, it has been shown that impaired inhibition
affects the controllability of emotional cues [8, 15], and
thus might lead to symptoms of re-experiencing [9].
Moreover, a study with veterans demonstrated that only
interactions between emotional reactivity and impairment
in executive functioning, but none of the variables alone,
were associated with more severe PTSD symptoms
[6]. Finally, treatment response in PTSD has been shown
to be associated with distinct predictors of cognitive and
emotional processes [7]. This indicates that characterizing
the nature of this relationship, as it predicts distinct
domains, is necessary to identify risk and understand the
underlying mechanisms.
In light of these findings combining information from

cognitive and emotion processing might provide a better
understanding of how PTSD develops after trauma exposure.
Since studies suggest that interactions between
specific facets at various stages facilitate the exacerbation
of PTSD symptoms (cf. [6, 9, 13, 15]), an analysis
approach is required that can accommodate complex
interactions in high-dimensional data. Classical statistical
testing methods quickly reach their limits here, due to
inflated error probability under multiple testing and to
reduced power.

Furthermore, only a limited number of predictors can be
included in traditional models at the same time. A
promising approach is offered by machine learning algo-
rithms. Such models can be utilized to determine shared
predictive accuracy of a variable set and can be used to
gain insights into interactions between variables. They
can accommodate relatively large variable-to-sample ra-
tios to identify interactions across multiple variables [16,
17]. For instance, unbiased predictions could be derived
even with a small sample of N = 40 [18] and sample size
does not affect model robustness when applying nested
cross-validation procedures [19]. Finally, machine learn-
ing models have been increasingly applied for investigat-
ing predictors for outcomes of health-related behavior
(e.g. [16, 20]), and particularly in the area of traumatic
stress (e.g., [21, 22]). In light of the evidence that cogni-
tive and emotional information processing are important
in the exacerbation of PTSD, the aim of the current
study was to characterize and test the relationship and
predictive accuracy of multiple relevant domains of cog-
nitive processing and emotion recognition as they im-
pact PTSD and distinct symptom cluster severity. More
specifically, the study sought to investigate the potential
of different machine learning models and their predictive
capabilities with respect to identifying individuals with
elevated PTSD symptom severity, by simultaneously
combining a number of variables that might have been
altered post-exposure. Thus, we deliberately focused on
predictors from the cognitive processing and emotion
recognition domains, irrespective of other variables known
to serve as risk factors for PTSD. The selection of specific
variables was based on the most significant associations
and cognitive tests reported in the reviews by Aupperle
et al. [9] and Scott et al. [10]. More specifically, tests
associated with attention/working memory, sustained
attention and inhibition, flexibility, verbal memory, and
processing speed were chosen. Regarding emotional
processing, recognition of emotions in facial stimuli was
investigated, following previous investigations [11].
Since this study was of an exploratory nature, testing the
applicability of machine learning models within this
setting, no further hypotheses were specified.

Methods
This study was part of a larger research project assessing
trajectories of mental health after exposure to a trau-
matic event (NYU/Bellevue Stress and Resilience longi-
tudinal study).

Participants
English-speaking adults between 18 and 70 who were
admitted to the General Emergency Department (ED) of
the Bellevue Hospital Center, New York City after ex-
posure to a potentially traumatic event were asked for


study participation. An event was considered traumatic
as defined in the diagnostic classification for PTSD in
DSM-IV (criterion A) [23]. Cases of domestic violence
were not included. Further inclusion criteria were: no
symptoms of past or present psychosis; no admission to
the Psychiatric ED; and not currently being in custody of
the police or the Department of Corrections. In order to
enrich the sample for risk of pathologic development
following traumatic experiences, individuals were asked
at the initial screening in the ED about their current
level of distress on a Subjective Units of Distress Scale
(SUDS) ranging from 0 to 100. A score of 60 or above,
or an intense emotional reaction during the interview,
was considered eligible for study participation.
Out of all persons admitted to the ED due to a trau-

matic event, n = 338 individuals were eligible for study
participation. Reasons for admission were falls (21%),
bike accidents (16%), hits as a pedestrian (19%), motor-
vehicle accidents (17%), assaults (10%) and other event
types (17%) such as gun-shots, lacerations or seizures.
N = 111 individuals took part in the 1-month follow-up,
and n = 105 completed the 3-month follow-up.
In the current analyses, all participants with complete
neuro-cognitive and emotional assessment were in-
cluded (N = 94). Of these, n = 58 were men and n = 36
were women. Mean age of the final sample was 36.95
years (SD = 13.83, range 19–67 years) at baseline, and
participants had on average 14.99 years of education
(SD = 3.35, range 4–18). The majority (50%) were Caucasian,
followed by African Americans (19%) and Asians or
Hispanics (3% each); 9% preferred not to specify, and 16%
indicated another ethnic group. There was no significant
difference between those having completed the assessment
and non-completers regarding initial SUDS rating, age,
ethnic group, or level of education. However, completers
reported significantly more bike accidents and fewer
assaults than expected (both p < .05).

Procedures
All new admissions to the hospital after trauma exposure
were checked for study eligibility, starting in fall 2014. If
eligible, and after having provided informed consent,
participants were initially assessed within the emergency
room setting and followed up within the first week after
discharge from the hospital for a phone screening of
30-min mean duration (not further reported here).
Participants were invited again after 1 month for an
in-person visit. A follow-up appointment lasting about
20 min was scheduled 3 months post-incident. The current
analyses include this 1- and 3-month follow-up
information. Data collection for later follow-ups was still
in progress. Participants were reimbursed $100 for the
1-month assessment and $30 for the 3-month follow-up.
The research team was composed of an experienced
research coordinator and several research assistants
(Master's or PhD students) working under close
supervision. They had received intensive training in
working with trauma populations.

Measures
Predictors at 1-month follow-up: neuro-cognitive functioning
and emotion recognition
The computer-assisted and widely applied test battery
WebNeuro provides neuropsychological tests of cognitive
performance. Conformity to its touch-screen equivalent
(IntegNeuro) has been demonstrated [24], and the latter
presents comparable validity and reliability to
paper-and-pencil test versions [25, 26].
Relevant constructs and tests were selected according to
the previously reported findings with the exception of
the Choice Reaction Time Test. Outcome measures
within each test were selected according to the Web-
Neuro manual [27]. In addition, variables with high col-
linearity (> .80) were removed by inspecting pair-wise
correlations and removing the variable with largest mean
absolute correlation. The labeling of constructs for tests
also follows the WebNeuro manual [27] and can differ
from classifications of domains used by other authors.
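The collinearity filter described above (pairwise correlations > .80, dropping the member of each pair with the largest mean absolute correlation, as implemented for example by caret's findCorrelation in R) can be sketched as follows. This is an illustrative Python analogue on synthetic data; the function name and data are not from the study.

```python
import numpy as np
import pandas as pd

def drop_collinear(df, threshold=0.80):
    """Iteratively drop one variable from each pair whose absolute pairwise
    correlation exceeds `threshold`, removing the member with the larger
    mean absolute correlation (analogous to caret::findCorrelation)."""
    data = df.copy()
    while True:
        corr = data.corr().abs().to_numpy()
        np.fill_diagonal(corr, 0.0)
        if corr.max() <= threshold:
            return data
        # most strongly correlated remaining pair
        i, j = np.unravel_index(corr.argmax(), corr.shape)
        cols = data.columns
        # drop whichever member is, on average, more correlated with the rest
        drop = cols[i] if corr[i].mean() >= corr[j].mean() else cols[j]
        data = data.drop(columns=[drop])

# toy example: x2 nearly duplicates x1, so one of the pair is removed
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
toy = pd.DataFrame({
    "x1": x1,
    "x2": x1 + rng.normal(scale=0.05, size=200),
    "x3": rng.normal(size=200),
})
pruned = drop_collinear(toy)
```

After pruning, only one of the two near-duplicate columns remains alongside the independent column.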
Speed of information processing was measured with the

Choice Reaction Time test. Participants had to identify
the correct position of a green illuminated target appear-
ing at one of four target positions (black-filled circles) by
pressing the matching button as quickly as possible. In
total, there were 20 trials and targets appeared in a
pseudo-random order at one of the four positions. Reac-
tion time (RT) was used. This test was chosen because it
is less affected by mild traumatic brain injury [28].

Sustained attention In the Continuous Performance
Task, a series with one of four letters (D, B, G, or C) was
presented. Participants pressed a button when two iden-
tical letters consecutively appeared. In total, 125 letters
were presented (85 non-target and 20 target letters). Er-
rors of commission (false identification of non-targets),
errors of omission (non-identification of targets) and RT
were used.

Attentional flexibility Similar to the Trail Making Test
Version B [29], 13 digits (1–13) and 12 letters (A–L)
were presented. Participants were asked to touch digits
and letters in an alternating and ascending sequence
(1-A-2-B, …). Time to completion was used.

Executive functioning/inhibition These domains were
measured with a Go/No-Go and a Verbal Interference


Task. For the Go/No-Go task, participants were asked to
hit the space bar as quickly as possible if the word
"press" was shown in green letters, and to inhibit the
movement accordingly if the word was shown in red letters.
Errors of commission, errors of omission, and RT were
chosen as relevant variables. The Verbal Interference Task
is similar to the Word-Color Stroop task [30]. Participants
were asked to identify the name and color of words
presented in congruent or incongruent color-word
combinations. The number of errors during incongruent trials was used.

Attention and working memory In the Digit Span test,
a series of digits, gradually increasing from 3 to 9 digits,
was recalled. Maximum recall span was measured.

Verbal learning capacity Equivalent to the California
Verbal Learning Test, a list comprised of 12 words was
presented in 3 consecutive trials. Participants were asked
for immediate recall after each trial. Mean number of
correctly recalled words was chosen.

Implicit emotion recognition The standardized stimuli
set [31] includes faces of 12 persons (6 women and 6
men) with a total of 72 facial expressions. In the first
part of the task (explicit emotion recognition), a pseudo-
random order of 48 faces from 8 different persons was
presented. Participants were asked to select emotion la-
bels corresponding to six facial expressions (happiness,
fear, sadness, anger, disgust, and neutral). After a series
of filter tasks of about 20 min, the implicit emotion rec-
ognition task was applied. A random selection of 24 fa-
cial expressions with six emotions (two male and two
female sets of emotions each) from the first task were
presented in a pseudo-random order together with 24
completely new stimuli with otherwise identical proper-
ties. In each trial, participants had to select the previ-
ously presented face.
Since the study aim was to measure an emotional bias,
that is, the influence of previous exposure to emotions
on later emotion recognition capabilities and thus the
tendency to automatically avoid particular emotions,
only the implicit emotion recognition task was included
in the analyses. Due to negligible differences in recog-
nition of specific emotions, scores were averaged
across all emotions. Accuracy for both incongruent
(different primer and distracting emotions) trials and
congruent (same primer and distracting emotions) tri-
als was used as variables. For further information
about the task, see [32].

Outcome at 3-months: PTSD overall symptom severity and
cluster-specific symptoms
The PTSD Checklist for DSM-5 (PCL-5) was used [33].
It comprises 20 items, and each

corresponds to a DSM-5 diagnostic criterion for PTSD.
Participants indicate the severity of symptoms during
the past month on a 5-point Likert scale ranging from 0
(not at all) to 4 (extremely), resulting in total sum score
range from 0 to 80 reflecting overall symptom severity.
Additionally, severity of diagnostic clusters of PTSD can
be computed. Intrusion symptoms (cluster B) can range
from 0 to 20, symptoms of avoidance (cluster C) be-
tween 0 and 8, negative alterations in cognition and
mood (cluster D) between 0 and 28, and alterations in
arousal and activity (cluster E) between 0 and 24, re-
spectively. The PCL-5 shows excellent psychometric
properties and is one of the most-used self-report mea-
sures for PTSD [34]. A cut-off value > 33 was considered
indicative of a provisional diagnosis for PTSD. For the
analysis, sum scores were used to be able to include as
much information as possible for learning associations.
Cronbach alpha was .95 in the current sample for the
total sum score.
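As a minimal illustration of the scoring scheme described above, the following sketch computes PCL-5 total and cluster sum scores and applies the > 33 cut-off. The item-to-cluster mapping used here (items 1-5 = B, 6-7 = C, 8-14 = D, 15-20 = E) follows the standard instrument layout and matches the cluster score ranges above; the function name is illustrative.

```python
# PCL-5 scoring: 20 items rated 0-4, summed overall and per DSM-5 cluster.
# Cluster ranges implied: B 0-20, C 0-8, D 0-28, E 0-24, total 0-80.
CLUSTERS = {"B": range(0, 5), "C": range(5, 7), "D": range(7, 14), "E": range(14, 20)}

def score_pcl5(items):
    assert len(items) == 20 and all(0 <= x <= 4 for x in items)
    scores = {c: sum(items[i] for i in idx) for c, idx in CLUSTERS.items()}
    scores["total"] = sum(items)
    # cut-off > 33 considered indicative of a provisional PTSD diagnosis
    scores["provisional_ptsd"] = scores["total"] > 33
    return scores

example = score_pcl5([2] * 20)  # every symptom rated "moderately"
```

With every item rated 2, the total is 40, which exceeds the provisional-diagnosis cut-off.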

Data analysis
Missing values and data pre-processing
In total, 10% of the dataset had missing values. These
were imputed using a recursive partitioning approach by
means of random forest, which is suitable for mixed-
data. Due to its non-parametric fashion, random forests
do not require a-priori specification of variable distribu-
tions. The algorithm outperforms other common imputation
techniques in terms of imputation error, such
as k-nearest neighbor or multiple imputation by chained
equations [35, 36]. Regarding data pre-processing, vari-
ables were scaled and centered if required for a specific
algorithm (e.g. for Support Vector Machines or Neural
Networks).
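The study used the R package missForest for this step. A rough scikit-learn analogue of random-forest-based iterative imputation (not the authors' exact implementation) might look like the following, on synthetic data with roughly 10% missing values as in the study:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

# Iterative imputation with a random-forest estimator: each variable with
# missing values is regressed on the others, and predicted values fill the
# gaps; being tree-based, no distributional assumptions are required.
imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=30, random_state=0),
    max_iter=5,
    random_state=0,
)

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
X[:, 1] = 0.8 * X[:, 0] + 0.2 * rng.normal(size=100)  # correlated column
mask = rng.random(X.shape) < 0.10                     # ~10% missing
X_missing = X.copy()
X_missing[mask] = np.nan

X_imputed = imputer.fit_transform(X_missing)
```

The correlated column is what makes model-based imputation worthwhile here: the forest can recover missing entries from the related predictors rather than falling back to column means.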
In order to explore dysfunctions, all test scores were

compared to a normative cohort using peer regression
modeling for age, gender and education [27]. For emo-
tion recognition, only norms for emotion identification
were reported. Z-scores within ±1 SD were considered
average, and scores between 1 and 2 SD below the mean
were considered borderline below-average performance [37].

Machine learning algorithms
The implementation followed recommended procedures
(see [38]). Since predictive performance within a given
dataset is unknown in advance, it is recommended to
test a range of models (cf. [38]). In the current study,
supervised algorithms that have been frequently applied
in mental health studies were compared. More specific-
ally, support vector machines (SVMs), random forests,
boosted models and neural networks were tested (cf.
[39, 40]). In addition, two other models (basic decision
trees and bagged trees) were applied that have been
shown to be robust towards noisy data and applicable in


a broad range of settings whilst being somewhat interpretable (cf. [38]).
SVMs are characterized by finding a linear separation

(hyperplane) that best differentiates the outcome based
on the predictor values [41]. Furthermore, basic classifi-
cation and regression trees (CART), random forests
(RF), boosted and bagged models all belong to the cat-
egory of decision trees. In these models, the outcome is
predicted by partitioning each predictor based on a
series of if-then statements. During model bagging sev-
eral decision trees are averaged by repeated resampling
of the data [42]. Furthermore, RF is also a tree-based en-
semble learning technique. It works similar to bagged
trees, but at each step of the tree-building procedure
only a random subset of the predictors is included (see
[43]). Boosted regression trees are also a tree-ensemble
method but in contrast to other techniques, each step
during model building is based on the residuals that
could not be explained in the step before (see [44]). Fi-
nally, neural networks combine predictors into multiple
hidden units. In a second step, the outcome is modeled
by these hidden units (see [45]). Model specification de-
tails are reported in the online supplementary material.
As stated above, machine learning models are charac-

terized by model-inherent parameters that are systematically
tuned to achieve the best prediction performance. In
order to avoid over-fitting (models perform well in the
current sample but can be poorly generalized), more
complex models are penalized during the model building
process. Furthermore, at each step during tuning, only a
small portion of data is used. A left-out set is subse-
quently used for evaluating predictive performance. In
the current study, 10-fold cross-validation with five repe-
titions was applied with model building at each step of
the cross-validation process. This nested procedure was
chosen instead of a separate training and test set in light
of the limited sample size [38]. It has been shown to result in
robust estimates in small samples [18, 19].
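A minimal sketch of the repeated 10-fold cross-validation scheme for comparing regression models, using scikit-learn on synthetic data shaped like the study sample (94 participants, continuous outcome). Hyperparameter tuning within each fold, as performed in the study, is omitted for brevity; the model choices mirror those named above.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold, cross_val_score
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the study data: 94 cases, 12 predictors,
# continuous symptom-severity-like outcome.
rng = np.random.default_rng(42)
X = rng.normal(size=(94, 12))
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=4, size=94)

# 10-fold cross-validation with five repetitions, as in the study
cv = RepeatedKFold(n_splits=10, n_repeats=5, random_state=42)
models = {
    "CART": DecisionTreeRegressor(max_depth=3, random_state=0),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "boosted trees": GradientBoostingRegressor(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVR()),  # scaled, as noted above
}

rmse = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv,
                             scoring="neg_root_mean_squared_error")
    rmse[name] = -scores.mean()  # average across all 50 folds
```

Each model's RMSE is the mean over the 50 held-out folds; lower is better, and these per-fold scores are what the paper's pairwise t-tests compare.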
In order to evaluate predictive performance, two indi-

ces were used for quantifying prediction error. Root
mean squared error (RMSE) indicates the magnitude of
residuals left in the model, derived from observed minus
predicted values [38]. Thus, lower values of RMSE were
preferred. Since this measure is scale-dependent, it cannot
be used to compare model performance across dif-
ferent outcomes. For this reason, R-squared (squared
observed versus fitted values) was used. R-squared is
interpreted as the proportion of variation in the outcome
that can be explained by the predictors. Thus, higher
values were preferred. Both RMSE and R-squared are
recommended when testing the predictive capability of
machine learning models with continuous outcome [38].
Indices derived from each step were averaged to derive
one single final estimate. Since there is no recommended

cutoff for optimal values of RMSE and R-squared, differ-
ences in model performance were compared based on
pairwise t-tests with Bonferroni adjustment for multiple
testing. In addition, variable importance scores were
computed. Values were scaled from 0 to 100 with larger
values indicating higher contribution in the model.
Whilst no statistical test is available for drawing conclu-
sions about the relevance of specific predictors, we chose
to explore patterns. For this reason, the three most im-
portant predictors for each model were considered. Fi-
nally, in order to quantify interactions between features,
Friedman's H was calculated (see [46]). Friedman's H
can be interpreted as the portion of variance that is ex-
plained by the interaction when controlling for other ef-
fects. The index can take values between 0 and 1.
Currently there is no statistical significance test available
so again we described patterns of the five most fre-
quently occurring two-way interactions across models.
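The two accuracy indices and the 0-100-scaled variable importance scores can be illustrated as follows. This is a sketch on synthetic data: the study computed these within cross-validation and used caret's model-specific importance measures rather than the permutation importance shown here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.metrics import mean_squared_error, r2_score

rng = np.random.default_rng(7)
X = rng.normal(size=(94, 5))
y = 2.0 * X[:, 0] + rng.normal(scale=1.0, size=94)  # only feature 0 informative

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
pred = model.predict(X)

# RMSE: magnitude of residuals (observed minus predicted); lower is better
rmse = np.sqrt(mean_squared_error(y, pred))
# R-squared: proportion of outcome variance explained; higher is better
r2 = r2_score(y, pred)

# variable importance scaled to 0-100, largest contributor = 100
imp = permutation_importance(model, X, y, n_repeats=10,
                             random_state=0).importances_mean
imp_scaled = 100 * imp / imp.max()
```

On this toy problem the single informative feature receives the maximum importance score of 100, while the noise features score near zero.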
R [47] with packages missForest [48] and caret [49] as

well as respective dependencies were used for statistical
analyses.

Results
Descriptive statistics
Table 1 displays descriptive statistics of all predictor var-
iables at the 1-month assessment. A comparison with
the normative cohort revealed only minor deviations
(|z-scores| < .5). Only in the Continuous Performance test
was participants' mean reaction time in a borderline range
(for details see Table 1A in the online supplementary
material). Thus, there is no evidence for severe
neurocognitive impairment in this sample.
Regarding the outcome, mean PTSD symptom severity
as measured by the PCL-5 was 23.38 (SD = 15.74),
range 0–62. Of the sample, 24% scored above the cut-off
of 33 indicative of a provisional diagnosis. For
disorder-specific sub-clusters, mean scores were 5.11
(SD = 4.12) for intrusion symptoms, 2.68 (SD = 2.10) for
avoidance, 8.29 (SD = 6.29) for changes in mood and
cognition, and 7.45 (SD = 5.04) for hyperarousal, respectively.

Data-driven predictions within outcomes
Final model parameters and associated fit values are
reported in Tables 2A-6A in the online supplementary
material. Comparing accuracy within the same outcome
for overall PTSD symptom severity, values for RMSE were
between 14.32 (boosted trees) and 15.53 (CART model).
Pairwise t-tests indicated that the SVM and CART models
were significantly worse than the bagged tree and random
forest models (all p < .001). In addition, the random forest
model was also superior to the neural network model
(p = .04). No other significant differences emerged
(all p > .05). Thus, the random forest model was


considered the optimal model for overall PTSD scores
(see Figure S1 in the online supportive material for
details).
For symptoms of re-experiencing (PTSD cluster B),

RMSE values of the tree-based ensemble methods
(bagged tree, boosted tree, and random forest; RMSE
between 3.53 and 3.58) were significantly lower compared to
the SVM (RMSE = 3.87, all p < .004) and CART models
(RMSE = 3.88, all p < .05), but not compared to the neural
network model (RMSE = 3.82; p = .51 for bagged trees and
p = .18 for random forests). The boosted tree model came
close to significance (p = .08) and was therefore chosen as
the best model (see Figure S2 in the online supplementary
material for details).
Concerning symptoms of avoidance (PTSD cluster C),
the random forest (RMSE = 1.92) and boosted tree
(RMSE = 1.958) models had significantly lower values than
the SVM (RMSE = 3.87, both p < .001), CART
(RMSE = 2.03, both p < .04) and neural network
(RMSE = 2.11, both p < .009) models. The random forest
and boosted models did not differ from each other
(p = .46) or from the bagged tree model (RMSE = 1.963,
both p > .5). Yet,
the bagged tree model was only superior to the SVM
model (p < .001), but not to the CART and neural network
models (both p > .17). Consequently, random forest
and boosted tree models were chosen as final models
for PTSD cluster C (see Figure S3 in the online sup-
portive material for details).
For alterations in cognition and mood (PTSD cluster

D), a very similar pattern occurred. Both the random
forest (RMSE = 5.82) and the boosted tree (RMSE = 5.81)
models had significantly lower values than the SVM
(RMSE = 6.08), CART (RMSE = 6.46) and neural network
(RMSE = 6.63) models (all p < .05), but did not differ from
the bagged tree model (RMSE = 5.88, p > .9). The bagged
tree model was superior to the CART and neural

network models (both p < .001), but not to the SVM model
(p = .06). Again, random forest and boosted models were
considered optimal for PTSD cluster D (see Figure S4 in
the online supplementary material for details).
For symptoms of hyper-arousal (PTSD cluster E), the two
models with the lowest scores, the CART (RMSE = 4.64)
and boosted tree (RMSE = 4.62) models, were significantly
better than the SVM (RMSE = 4.99, both p < .001) and
neural network (RMSE = 5.10, both p < .05) models. Only
the boosted tree model was also superior to the bagged
tree model (RMSE = 4.76, p = .03) and close to significance
versus the random forest model (RMSE = 4.75, p = .08).
Accordingly, it was chosen as the optimal model for the
prediction of PTSD cluster E (see Figure S5 in the online
supplementary material for details).

Model performance across outcomes
In a next step, the best performing models per outcome
were compared based on maximized R-squared. That is,
the random forest model was chosen for overall PTSD
symptom severity (R-squared = .28), cluster C
(R-squared = .25) and cluster D (R-squared = .20).
Furthermore, boosted models were selected for PTSD
cluster B (R-squared = .36), cluster C (R-squared = .22),
cluster D (R-squared = .20) and cluster E
(R-squared = .23). Despite these descriptive differences,
no model performed exclusively better (all p > .19).
There was only a non-
significant trend for the boosted model for cluster B out-
performing the random forest model for cluster C
(p = .06) as well as the boosted model for cluster D
(p = .05). See Fig. 1 for details.

Importance of predictor variables
Figure 2 visualizes variable importance for all selected
models. Of note, some variables had almost equal scores
and were therefore all considered, thus including four

Table 1 Mean and standard deviation (SD) of predictor variables

Test                     Variable [unit]             Mean (SD)
Choice Reaction Time     RT [ms]                     451.73 (177.66)
Continuous Performance   # of errors                 7.49 (18.19)
                         RT [ms]                     593.05 (122.55)
Go/No-Go                 RT [ms]                     321.64 (64.04)
                         # of errors                 6.00 (6.03)
Verbal Interference      # of errors                 1.85 (3.91)
Digit Span               # of maximum digits         6.52 (1.77)
Verbal Learning          # of errors                 4.46 (7.61)
Digit-Letter Test        completion time [ms]        62,241.62 (54,964.60)
                         # of errors                 1.80 (3.78)
Face Recognition         accuracy incongruent [%]    85.61 (19.90)
                         accuracy congruent [%]      95.04 (9.16)

RT = reaction time; ms = milliseconds; # = number


predictors for overall symptom severity and the boosted
model for cluster C.
Accuracy during incongruent trials in the Face Recog-

nition task was the most or second most important pre-
dictor in all models. This was followed by errors in the
Verbal Interference task for PTSD symptom severity (to-
gether with RT in the Go/No-Go task), cluster B and E,
and errors of commission in the Continuous

Performance test for overall PTSD, cluster B, C (together
with RT in the Choice Reaction Time test) and E. Fur-
thermore, regarding the prediction of cluster D, RTs in
the Go/No-Go and Continuous Performance tests were
relevant. The latter test was also the second most im-
portant predictor for the boosted cluster C model.
In addition, accuracy in the Verbal Learning memory

task (for overall PTSD symptoms, boosted model cluster

Fig. 1 Differences in mean R-squared and associated confidence intervals for pairwise comparisons of random forest (RF) and boosted models.
The letters refer to PTSD clusters B-E; Total refers to the model with overall symptom severity

[Fig. 2 Variable importance of each predictor across the selected models. Predictors shown: Choice Reaction Time (RT); Continuous Performance (errors of commission, errors of omission, RT); Digit Span (digits); Digit-Letter (time); Face Recognition congruent and incongruent (accuracy); Go/No-Go (errors of commission, errors of omission, RT); Verbal Interference (errors); Verbal Learning]