AVD Raj A
Please go through instructions carefully
In the 1st document, I need the outline covering all the Professor’s requirements.
In the 2nd document, I need the research paper covering all the Professor’s requirements.
Outline for Assignment 1
Using the research guide and the assignment 1 instructions, develop your outline. Submit the outline in an MS Word document file type. Utilize the standards in APA 7 for all citations or references in the outline. Ensure that the document includes your name. Do not include your student identification number. You may use the cover page from the student paper template, but it is not required. The assignment 1 instructions are at the bottom of this content folder.
Submit your outline on or before the due date.
By submitting this paper, you agree:
(1) that you are submitting your paper to be used and stored as part of the SafeAssign services in accordance with the Blackboard Privacy Policy;
(2) that your institution may use your paper in accordance with UC’s policies; and
(3) that the use of SafeAssign will be without recourse against Blackboard Inc. and its affiliates.
4/24/20 Assignment 1 AZ UT.docx P a g e | 1
Research Assignment 1
The Outline for Research Assignment 1 and Research Assignment 1 will use this document.
Use the Documenting Research Guide to understand how to use the information in this document for
either of these submissions.
Ask questions if needed!
Topic: The Centers for Disease Control and Prevention (CDC) uses the social vulnerability
index (SVI) to evaluate the impact of disasters on communities, weighting the damage
with social factors in the states of Arizona and Utah.
Problem: The data consolidated by the CDC is used to determine the most vulnerable areas
should a disaster occur. In a perfect world, the indicators of vulnerability would
represent the people correctly. Currently, this far-from-perfect method is the best that
has been developed. There may be indicators that are not adequately predictive of
social vulnerability.
Question 1: What relationships exist in the states of Arizona and Utah between the socioeconomic
indicators, household and composition indicators, disability indicators, and social
vulnerability when using the data consolidated by the CDC (2018a)?
Question 2: Which indicators in the states of Arizona and Utah, among the socioeconomic
indicators, household and composition indicators, and disability indicators, have the most
influence in predicting social vulnerability when using the data consolidated by the CDC
(2018a)?
Data:
The data and data dictionaries are online.
o Centers for Disease Control and Prevention. (2018a). Social Vulnerability Index [data
set]. https://svi.cdc.gov/Documents/Data/2018_SVI_Data/CSV/SVI2018_US.csv
o Centers for Disease Control and Prevention. (2018b). Social Vulnerability Index [code
book]. https://svi.cdc.gov/Documents/Data/2018_SVI_Data/SVI2018Documentation.pdf
o Note: The raw data must be in its original form when it enters the R script
file. Use the data dictionary to understand the data.
Create a subset of the data to represent the sample of secondary data in this analysis.
o The SVI index's variable name is RPL_THEMES, in column 99
o Socioeconomic
    Persons below the poverty estimate
    Civilian unemployed estimate
    Per capita income estimate
    Persons with no high school diploma
o Household and composition disability features
    Ages 65 and older
    Ages 17 and under
    Persons with a disability, over the age of 5
    Single-parent households
o The state field
Note: Do not use more than one indicator for each measure defined in this section.
Variable names preceded with E_ are actual measures, while M_ represents the
margin of error estimates.
Other prefixes are follow-on calculations or qualitative information. Do not include variables
that are not identified in the research questions, as listed in the data section.
Do not include the margin of error estimates at this time.
Considering the research questions, after subsetting, there will be 10 variables used in this
analysis.
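The subsetting step described above can be sketched as follows. The assignment itself calls for an R script; this sketch uses Python only to illustrate the logic. The column names in KEEP are assumptions and must be confirmed against the CDC (2018b) data dictionary, and the sample rows are made up for illustration.

```python
import csv
import io

# Assumed column names; confirm each one against the CDC (2018b) data dictionary.
KEEP = ["STATE", "RPL_THEMES", "E_POV", "E_UNEMP", "E_PCI", "E_NOHSDP",
        "E_AGE65", "E_AGE17", "E_DISABL", "E_SNGPNT"]
STATES = {"ARIZONA", "UTAH"}


def subset_rows(reader, keep=KEEP, states=STATES):
    """Keep only the assigned states and the ten analysis variables."""
    return [{name: row[name] for name in keep}
            for row in reader
            if row["STATE"].upper() in states]


# Tiny made-up sample standing in for the raw file; values are illustrative only.
SAMPLE = """STATE,RPL_THEMES,E_POV,M_POV,E_UNEMP,E_PCI,E_NOHSDP,E_AGE65,E_AGE17,E_DISABL,E_SNGPNT
ARIZONA,0.52,310,45,120,24000,200,400,650,300,90
UTAH,0.18,150,30,60,28000,80,250,900,180,40
NEVADA,0.61,280,40,110,23000,190,380,600,260,85
"""

rows = subset_rows(csv.DictReader(io.StringIO(SAMPLE)))
print(len(rows), len(rows[0]))  # number of rows kept, number of variables kept
```

The margin of error column (M_POV in the sample) is dropped because it is not in KEEP, which matches the instruction to exclude margin of error estimates.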
Data Cleaning:
Do not remove missing values during cleaning. If missing values need to be removed for
an analysis method, do it during the preparation for analysis. A code represents missing values.
Use the data dictionary to understand the data sample and how missing values are
represented.
When changing an object or part of an object, validate that the change occurred as expected.
The steps that are taken in cleaning are not discussed in the research paper.
There is a code that represents missing values; ensure this is found in the data dictionary!
These values will have to be recoded as NA. Not figuring it out? Please email me.
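A minimal sketch of recoding the missing-value code and validating the change, written in Python for illustration only (the assignment requires R, where the recoded value would be NA). MISSING_CODE is a placeholder assumed here; the data dictionary is the authority on the actual code. The values are made up.

```python
MISSING_CODE = -999  # assumption: confirm the actual code in the data dictionary


def recode_missing(values, missing_code=MISSING_CODE):
    """Return a copy of values with the missing-value code replaced by None."""
    return [None if v == missing_code else v for v in values]


poverty = [310, -999, 150, -999, 95]  # made-up values for illustration
recoded = recode_missing(poverty)

# Validate that the change occurred as expected.
n_coded = sum(1 for v in poverty if v == MISSING_CODE)
n_missing = sum(1 for v in recoded if v is None)
assert n_missing == n_coded           # every coded value became missing
assert MISSING_CODE not in recoded    # no coded values remain
assert len(recoded) == len(poverty)   # no rows were removed during cleaning
print(n_missing)
```

The three checks mirror the cleaning rules above: values are recoded, not removed, and each change is validated.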
Analyze:
Conduct two types of analysis: visual analysis to identify relationships and a random forest
model to identify influential indicators in predicting the social vulnerability.
The sub-stages of Analyze (profile, prepare, and apply) are necessary at least two times. This
method is for programming, not documenting research.
During the visual analysis, only present meaningful visuals to understand what
relationships exist between the indicators for the social vulnerability index.
Ensure you establish that the model is valid and reliable before discussing the influential
indicators.
Documenting research:
Results, Impact of the Results:
Ensure that assertions and assessments in the results and discussion sections are derived
from the analysis in R.
Do not speculate. Use evidence. When documenting the results, consider the generalizability.
Future Recommendations:
Include recommendations for future analysis, based on the research in R.
An example might look something like this:
o An opportunity for further research, based on gaps found in the random forest modeling,
is to look at the ability to tune the parameters further, to improve the performance in
predicting the
o Additionally, an opportunity for future research is exploration modeling to determine
what other variables, when eliminated, have little or no impact on the ability to predict
the SVI based on the supporting characteristics in the data.
Bonus challenge:
Create a random forest model for each state that is assigned. Ensure that this analysis is within the
scope of the research.
Tip: An additional research question that meets the five criteria from the first lecture will bring
this analysis within the scope. Make sure the question is structured to encompass the additional
research. The challenge does not replace the original research requirements for this assignment.
Required files to submit:
1) Research paper in APA 7 format; MS Word document file type
2) R Script; final version
Important Information:
You will receive an email confirming the submission. Should you receive that email, your
submission is received.
o An error message results from the use of SafeAssign. SafeAssign does not recognize R file
types. The warning does not impact the submission.
The research paper will be written in a professional writing style, following APA 7 student
paper format. Use the student paper template.
o The document shall be 3-5 pages and at least 1000 words. The page count does not
include the cover page, tables, figures, or the reference page.
o Ensure that every reference in the reference list is also cited in the text.
o Do not forget to cite and reference the source of the data.
It is ill-advised to modify the problem statement and research questions provided.
If the research problem or research questions are modified, the requirements of the analysis
will not change.
There are several different versions of this assignment. If the submitted work is in line with a
different version than assigned, the submitted work is a demonstration of academic
dishonesty. Do not share the work with peers. Do not accept work that you did not do.
Take a look at the rubric to get the best grade possible.
Documenting Research Guide Last Revised: 8/30/2020
Documenting Research Guide
Contents
Outline Structure and Content
Outline Example coinciding with Unit 3
Writing Tips
Example Research Paper coinciding with Unit 3, annotated
Example Research Paper coinciding with Unit 3
Outline Structure and Content
The outline is an organization document to provide structure for the research paper. Use the outline to document
research.
Section 1, Level 1 Section Heading: This heading is the title of the paper.
Background, topic, introduction
Describe the broader context in which the problem exists, the topic
Lead the reader to the problem statement
Do not explicitly state the problem, research questions, or methodology
This section introduces the research topic and provides a high-level summary of what the reader can expect
to find in the rest of the paper.
Section 1, Level 2 Section Heading: Statement of the Problem
This section may come straight from an assignment’s instructions
Provide the ideal, current, and intent of the problem for research
Section 2, Level 1 Section Heading: Research Methodology
Begins with an introduction to all the content in the research methodology section
Section 2, Level 2 Section Heading: Research Questions
This may come straight from the assignment’s instructions
Ensure that developed questions conform to the standards defined in the first lecture
Section 2, Level 2 Section Heading: Sample Data
Review the sample data. Variable names do not identify what the content represents, so do not use
variable names!
Explain and describe what each of the variables represents, connecting the sample to the
background, problem, and question so the reader can understand what the data represents and why it
is suitable data to answer the research questions
Section 2, Level 2 Section Heading: Analysis Method and Limitations
A plan, defining what type of analysis will address each research question.
The plan will include statistical assumptions, limitations to the analysis method, and mitigating steps
taken for the limitations.
This section is not a programming plan! This section does not include the programming procedure or
steps. Define this section before conducting any programming or analysis.
This section finishes with a summary of the content of the section
Develop everything above this statement before analysis. Work on everything below after the analysis.
EXCEPTION: Develop the reference section before and after analysis.
Some of the elements above this statement could change after analysis.
Section 3, Level 1 Section Heading: Results and Discussion
Begins with an introduction to all the content in the results and discussion section
The objective of the research is to answer the research questions, which is the purpose of this section.
o If there is more than one research question, address them individually
o Pay attention to the generalizability!
Provide interpretations of all results!
Inclusion of figures or tables must conform to APA 7 standards
Include all findings, even if they do not support the desired outcome
o If the analysis method for the findings has statistical assumptions, address the statistical
assumptions before presenting the findings
There is no programming code in this section.
Finishes with a summary of the content in this section
Section 3, Level 2 Section Heading: Recommendations for Future Research
After the analysis, how can it be improved?
Different analysis methods?
Different sample data?
Different data structures?
The recommendations must come from the research; do not recommend different data collection methods that
are not part of the research! (This course only uses secondary data, data consolidated by someone else.)
Section 4, Level 1 Section Heading: Conclusion
The conclusion is a summary of everything in the entire paper
Do not introduce new ideas in the conclusion
Highlight key points of the research or findings
Section 5, Level 1 Section Heading: References
Reference section and references per APA 7
Some of the standards for this section per APA 7
o References always begin on a new page
use insert new page to ensure this section starts at the top of a separate page from the rest
of the document
o References are in alphabetical order
o Annotated with a hanging indent
The reference begins flush with the left one-inch margin
Wrapped text is indented one-half inch
Outline Example: based on the analysis in Unit 3
The 2016 Presidential Campaign Polling
The 2016 election was tumultuous
o Distinct perception Trump would not win
o Bias may have played a part
o Polling samples
o shy voters
The research includes analysis of the polls’ results and how the
results relate to the outcome of the election.
Statement of the Problem
Neutral polling, collected from a sample genuinely representative of
the voters, will provide an accurate prediction of the winner of an
election. Polling seemed to indicate that Clinton was going to win,
but the electoral vote significantly favored the Trump campaign.
Exploration of the polling results throughout the campaign and a
particularly close look at the ratings at the end of the campaign may
provide insight into the source of the significantly different outcome
than the media portrayed with the election of President Trump.
Research Methodology
Research Question
Considering the 2016 presidential campaign, using the polling data
consolidated by Silver et al. (2016) and the election results
consolidated by Ballotpedia (n.d.), what relationships exist between
the polling and the 2016 election results that indicate that President
Trump would win the election?
Sample Data
Note: Keep in mind that if the data used in an assignment has
variables not used in the analysis, those variables are not part of the
sample! Take note of this in the data. There are several fields not
discussed here, because the fields were not part of the analysis
The secondary sample data from Silver et al. (2016) includes
polling data that represents
o Location: fifty states, national polls, and Washington
DC
o Dates: November 2015 to November 2016, the ending
date for each poll
o Size: the sample size of each poll
The title is capitalized in title case. This is the first section heading and the title of the paper in the final document.
For most of the course this is provided. In the outline and research paper, the entire statement is provided.
Cite the source(s) of the sample data.
Provide a summary of the document in the introduction.
While the outline has sentence fragments and bullets throughout, the research paper will not. The
organizational statements in the outline are written as well-developed paragraphs in the research paper.
In APA 7, a level 1 section heading is in bold, centered between the one-inch left and right margins.
In APA 7, a level 2 section heading is in bold, flush to the one-inch left margin.
All research questions belong in the outline.
Explain the sample in words.
Explain how the data is represented, such as parts per million or percentage of votes.
o Vote: the percentage of votes for President Trump
and for Clinton in each poll in the data
The secondary sample data used from Ballotpedia (n.d.)
represents:
o fifty states and Washington DC
o electoral votes available in each state
o 2016 election vote percentage of each state for
President Trump and Clinton
Analysis Method and Limitations
What relationships exist between the pre-election polling attributes, the
2016 election, and each state’s allocated electoral votes that indicate that
President Trump would win the election?
assessed via visual analysis
o not parametric, therefore no statistical assumptions
o limitations of visual analysis
high dimensionality is challenging to assess
possibility of inadequate assessment leading to
incorrect conclusions
the more comparisons, the higher likelihood of false
discoveries (Zhao et al., 2017)
o mitigation for inadequate assessment
explore interesting findings via multiple facets, to
ensure adequate assessment
o mitigation for false discoveries
Attempt to view any key finding from multiple
perspectives, to validate the finding
Develop everything above this statement for the outline, along with the
reference section.
Develop everything below after the analysis, along with the reference
section. There may be updates to the other sections.
Type of analysis for each research question; list each question!
Declare how this method can address each of the research questions.
Declare any statistical assumptions for this method of analysis with a credible reference.
Provide limitations to the method of analysis and methods to mitigate each limitation if it impacts
the validity or reliability of the research.
In other words, if the limitation can lead to incorrect conclusions, how will correct conclusions be
determined?
Declare the headings for the remaining fields
The design for analysis
Results and Discussion
Recommendations for Future Research
Conclusion
References
Include the reference(s) of the data, in APA 7.
Include a citation for every reference
Include a reference for every citation
The reference section begins on a separate page.
Writing Tips
When writing a paper or developing a presentation, always include a summary of the document within
the introduction and the conclusion.
Focus the writing on the purpose: solve the problem, answer the question, or prove the expected
outcome. In this course, the assignments will all have research questions. Focus on the questions.
Write concisely. This is not a persuasive paper. Writing superfluously devalues your work.
When you finish writing:
o Read the document aloud.
This is the single, most effective method to identify elements of the document that
require editing.
Think about the problem, research questions or the expected outcome:
Did you focus on it throughout the document?
Did you provide answers to the research question(s)?
o If you are not particularly confident in your writing:
Take time to identify the topic sentence in every paragraph, in every section, and within
the introduction and conclusion.
There should be transition sentences between the ideas in the document. Does the writing
jump from one idea to the next?
The writing center is an excellent resource, as well.
Use the outline to organize your graduate-level writing.
Do not concern yourself with your SafeAssign score.
o Ensure that quoted words, paraphrasing, and direct references to external sources have citations
and references to the original source of the information. Still not sure? Email me.
o Think about it! What do you think the average SafeAssign percentage is for the outline?
A significant portion of the outline will come from the assignment instructions.
SafeAssign's matching criteria typically assign scores of 60-80% to submissions
that are correctly written.
Cite every reference. Include all references in the reference section.
All writing assignments are evaluated by APA 7 criteria.
o Student papers do not include an abstract.
o Vertical spacing is uniform between lines of text
Microsoft Word automatically adds paragraph padding; remove it or use the template.
o The text alignment throughout the document is left-align, not justify.
o Do not solely rely on citation and reference generators. These tools are fallible.
Example Research Paper: with notations
The 2016 Presidential Campaign Polling
Dr. Kathy A. McClure
University of the Cumberlands
ITS-530: Data Analysis and Visualization
Dr. Kathy A. McClure
July 23, 2020
One of two places in the document correctly documented with non-uniform vertical spacing.
The top name is the author. When you see my name again, it is for the professor of the course.
The only element in the header is the page number in the same font as the document, starting at 1. (As this is part of
an example document, the numbering is different.)
There is no footer in a student research paper, per APA 7. This footer is for document control.
The 2016 Presidential Campaign Polling
The 2016 presidential campaign was tumultuous. It had seemed impossible that President
Trump would win the election. Silver et al. (2016) indicated that there was a 71.4% chance that
Clinton would win the election. During the campaign, the media led voters, including elected
members of the Republican Party, to believe that President Trump would not win the election
(Hohman, 2016). Regardless of the media, Hohman (2016) retroactively identified that there were
many voters who were not pro-Clinton leading up to the election. Stevenson (2016) interviewed
American University professor Dr. Allan Lichtman, who overtly stated that President Trump would
win the election based on historical voting in this country. Dr. Lichtman specified two exceptions
to this claim: candidate Johnson must receive at least five percent of the vote and President
Trump’s unpredictable behavior. Goldmacher and Schreckinger (2016) stated that President Trump
winning the election was the “biggest upset in U.S. history” (title). Many believed Clinton would win.
Problem Statement
Polling samples that represent the population will provide an accurate prediction of the election
winner. Polling results appeared to indicate that Clinton was going to win, but the election resulted in
President Trump being sworn in as the 45th president. Exploration of the polling and election results
may provide insight as to why the election winner was unexpected.
Note that the outline was not followed explicitly for the topic/introduction.
Do not forget to cite and reference sources of information.
Use evidence to support any assertions that are not common knowledge.
Example: Sampling bias was an issue in all polls. That statement implies this is a fact when it is not, and it would be
impossible to prove this statement!
You must have a citation and reference for assertions.
From the outline:
The 2016 election was tumultuous
Distinct perception Trump would not win
Bias may have played a part
Polling samples
shy voters
The research includes analysis of the polls’ results and how the results relate to the outcome of the election
Why did this quote end with the word title in parentheses? It is cited correctly. The statement began with the source
authors and date. A quote requires three parts in the citation: author, date, and the page number. The reference is a
website, so there are no page or paragraph numbers. It must identify where the quote was found, in this case, the
title.
The problem statement is verbatim from the outline, unless it was insufficient.
Method
Research Question
Considering the 2016 presidential campaign, using the polling data consolidated by Silver et al. (2016)
and the election results consolidated by Ballotpedia (n.d.), what relationships exist between the
polling and the 2016 election results that indicate that President Trump would win the election?
Sample
This research employed two secondary data sources for the analysis. Consolidated polling data
collected by Silver et al. (2016) is the first data source. Each observed poll includes the percentage
of votes by location, ending date, and sample size for Clinton and President Trump. Ballotpedia (n.d.)
election data is also necessary for this analysis and includes the percentage of votes by location
for Clinton and President Trump. Available electoral votes for each location is another attribute in
the election data. Locations between the two secondary data sources differed.
The polls’ locations include the entire nation, each state, Washington, DC, and specific districts
within Nebraska and Maine. The district polls within Nebraska and Maine were representative of the
method of electoral vote distribution. Splitting the electoral vote is possible in Nebraska and Maine
(Coleman, 2020). In the other 48 states and Washington, DC, using winner-take-all, the popular vote
winner for the state receives all the electoral votes. The election data simplified the locations: each
state and Washington, DC.
The research question(s) are verbatim from the outline unless the question was insufficient.
From the outline:
The secondary sample data from Silver et al. (2016) includes polling data that represents
fifty states, national polls, and Washington DC
November 2015 to November 2016, the ending date for each poll
the sample size of each poll
provides a raw percentage of votes for each poll for President Trump and Clinton
The secondary sample data used from Ballotpedia (n.d.) represents:
fifty states and Washington DC
electoral votes available in each state
2016 election vote percentage of each state for President Trump and Clinton
Analysis Method and Limitations
The method of analysis must be capable of meeting the objective of this research. Identification
of statistical assumptions, if they exist, is necessary, and identification of any limitations is
essential, along with mitigation where possible. Visual analysis is suitable for extracting
relationships that may exist in the data. This method is also appropriate for confirming the
information derived from the analysis. There are no formal statistical assumptions. There are three
limitations identified for visual analysis.
High dimensionality, inadequate assessment, and false discoveries are risks associated
with visual analysis. The scope of this research does not include numerous variables, mitigating
the threats associated with high dimensionality. The potential for inadequate assessment and
false discoveries requires mitigation. Visualizations of data provide a perspective of the
information without context. To mitigate these risks, it is compulsory to assess all key findings
from multiple perspectives. This process ensured that there was an adequate assessment of
the perceived information. Focusing on the research question and using two sources of secondary
data, the analysis generated results.
From the outline:
Analysis Method and Limitations
assessed via visual analysis
not parametric, therefore no statistical assumptions
limitations of visual analysis
high dimensionality is challenging to assess
possibility of inadequate assessment leading to incorrect conclusions
the more comparisons, the higher likelihood of false discoveries (Zhao et al., 2017)
mitigation for inadequate assessment
explore interesting findings via multiple facets, to ensure adequate assessment
mitigation for false discoveries
Attempt to view any key finding from multiple perspectives, to validate the finding
Results
Consolidation of the visual analysis highlighted key findings through four visualizations
of data. Manipulating the data with various summarization techniques generated meaningful
graphics. The sample included nearly a year’s worth of polling data, but limiting the data to polls
closest to the election generated the key findings in this research. The term polling vote
represents polls ending in November 2016, consolidated by state and candidate, using the median
value. Geospatial visualization indicates that in 45 of the 50 states the winning candidate in the
polling vote and the election were the same (see Figure 1). In five states, Clinton led in the
polling vote, but President Trump won in the
election. For simplification, the term flipped states
refers to the five states identified in Figure 1.
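The definitions of polling vote and flipped states above can be sketched in code. This is a minimal illustration in Python, not the analysis behind the figures; the state names, poll values, and election winners are made up, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import median

# Made-up poll records: (state, candidate, percent). Not real polling data.
polls = [
    ("State A", "Clinton", 46.0), ("State A", "Trump", 44.0),
    ("State A", "Clinton", 47.0), ("State A", "Trump", 45.0),
    ("State A", "Clinton", 45.0), ("State A", "Trump", 46.0),
    ("State B", "Clinton", 40.0), ("State B", "Trump", 50.0),
    ("State B", "Clinton", 42.0), ("State B", "Trump", 49.0),
]
election_winner = {"State A": "Trump", "State B": "Trump"}  # made up


def polling_vote(records):
    """Median poll percentage for each (state, candidate) pair."""
    grouped = defaultdict(list)
    for state, candidate, pct in records:
        grouped[(state, candidate)].append(pct)
    return {key: median(values) for key, values in grouped.items()}


def polling_leader(votes):
    """Candidate with the highest polling vote in each state."""
    best = {}
    for (state, candidate), pct in votes.items():
        if state not in best or pct > best[state][1]:
            best[state] = (candidate, pct)
    return {state: candidate for state, (candidate, _) in best.items()}


votes = polling_vote(polls)
leaders = polling_leader(votes)
# A state is "flipped" when the polling leader differs from the election winner.
flipped = sorted(s for s, w in election_winner.items() if leaders[s] != w)
print(flipped)
```

The median is used as the summary statistic because the paper names it as the measure of centrality for the polling vote.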
Due to the non-uniformity of the data, the measure of centrality in this analysis is the
median. Summarizing data can cause misrepresentation of the data. Comparing the polling vote
identified 12 states with five percent or less difference between candidates. Visualizing the 12
states showed how well the median represents the data (see Figure 2). The evidence
suggests that the median does not misrepresent the results. The 12 states include the five flipped
states identified in Figure 1. The close margins in the polling data of the flipped states
necessitated a deeper investigation into individual polls. Before documenting the remaining
results of this analysis, the visualization of the difference between candidates requires further
explanation.
Repeating the same information is ill-advised. Do not repeat the information in the caption of a
figure or table.
The candidates were compared by subtracting the polling votes for each state (see Figure
3 and Figure 4). The value’s direction is indicative of the winning candidate. Leads held by
Clinton are to the left of zero. Where President Trump led, the value is annotated to the right of
zero. The value is indicative of how much lead one candidate has over the other. For example, if
President Trump earned 40% of the vote and Clinton earned 41% of the vote, Clinton led that
vote by one percent. This Clinton lead would be visualized by placing the marker to the left of
zero on the axis marker representing a value of one percent.
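The subtraction described above can be written out as a short worked example. This Python sketch assumes the difference is taken as Trump minus Clinton, which is consistent with Clinton leads falling to the left of zero; the function names are hypothetical.

```python
def lead(trump_pct, clinton_pct):
    """Signed lead: negative is left of zero (Clinton leads), positive is right of zero (Trump leads)."""
    return trump_pct - clinton_pct


def describe(trump_pct, clinton_pct):
    """Plain-language reading of the signed lead."""
    diff = lead(trump_pct, clinton_pct)
    if diff < 0:
        return f"Clinton leads by {abs(diff):g} percent"
    if diff > 0:
        return f"Trump leads by {diff:g} percent"
    return "Tied"


# The example from the text: Trump 40%, Clinton 41%.
print(lead(40, 41))      # -1
print(describe(40, 41))  # Clinton leads by 1 percent
```

The sign carries the winning candidate and the magnitude carries the size of the lead, so a single axis can show both.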
APA use of figures & tables is specific. Each figure or table must include enough information to be self-explanatory.
Do not explain the figure in the document. **You must refer to each figure or table in the document, though!**
Results require EVIDENCE. In visual analysis, the evidence is visual!
After identifying that the flipped states’ polling vote by candidate differed by five percent or
less, each poll within the flipped states ending in November 2016 was analyzed (see Figure 3). The
majority of the individual polls also varied by less than five percent between the candidates.
Clinton held the lead in nearly all polls in these states. In Florida, there were no polls that
exceeded the five percent margin between candidates. Trump did not lead in any polls in Wisconsin
from this data.
The polling vote and election vote were compared by
candidate for all election locations in the data. While five states
flipped, there were other states with close margins. Additionally, the
compa