Title
AY: 2019/20
CIS7031 – Programming for Data Analysis
20 Credit Hours
Semester 2
Module Leader: Imtiaz Khan
Assessment Brief
Assessment Title:
Employment in Wales
WRIT1 100 %
HAND-OUT DATE:
HAND-IN DATE: 24 May 2020
Contents
Learning Outcomes
EDGE
Assessment Requirements / Tasks (include all guidance notes)
Assessment Criteria
Submission Details
Feedback
Marking Criteria
Additional Information
Referencing Requirements (Harvard)
Mitigating Circumstances
Unfair Practice
Learning Outcomes
This assessment is designed to demonstrate a student's completion of the following Learning Outcomes:
Critically analyse and evaluate various statistical and computational techniques for analysing datasets and determine the most appropriate technique for a business problem;
Critically evaluate, develop and implement solutions for processing datasets and solving complex problems in various environments using relevant programming paradigms;
Evaluate and apply key steps and issues involved in data preparation, cleaning, exploring, creating, optimizing and evaluating models;
Evaluate and apply aspects of data science applications and their use.
EDGE
The Cardiff Met EDGE supports students in graduating with the knowledge, skills, and attributes that allow them to contribute positively and effectively to the communities in which they live and work.
This module assessment provides opportunities for students to demonstrate development of the following EDGE Competencies:
ETHICAL
Students will be required to consider the ethical implications of their analysis and follow the necessary ethical approval processes while addressing problems associated with the assessment.
DIGITAL
Students will be required to demonstrate digital skills in the collation of data and analysis for their project.
GLOBAL
Students will demonstrate an awareness of the global context and apply this to their assessment.
ENTREPRENEURIAL
Students will also demonstrate their entrepreneurial skills by working on their own initiative, formulating and presenting recommendations in order to solve an authentic and complex problem associated with the module.
Assessment Requirements / Tasks (include all guidance notes)
This assignment will use employment data for Wales from the StatsWales data source. This dataset provides workplace employment estimates, or estimates of total jobs, for Wales and its NUTS2 areas, along with comparable UK data disaggregated by industry section.
For this assignment, students will apply data analysis and machine learning techniques to reveal the workplace employment landscape of Wales.
1. Data processing
1.1. Download the dataset for the period 2009-2018 and create a dataframe that concatenates only the Wales (total) employment values.
1.2. Check for any null values or outliers. If any are found, replace them with the mean value.
1.3. Change the names of the industries as below.
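The final dataframe should look like the following:
Industry             | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018
Agriculture          |      |      |      |      |      |      |      |      |      |
Production           |      |      |      |      |      |      |      |      |      |
Construction         |      |      |      |      |      |      |      |      |      |
Retail               |      |      |      |      |      |      |      |      |      |
ICT                  |      |      |      |      |      |      |      |      |      |
Finance              |      |      |      |      |      |      |      |      |      |
Real_Estate          |      |      |      |      |      |      |      |      |      |
Professional_Service |      |      |      |      |      |      |      |      |      |
Public_Adminstration |      |      |      |      |      |      |      |      |      |
Other_Service        |      |      |      |      |      |      |      |      |      |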
Targeted
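Steps 1.2 and 1.3 could be approached along the following lines. This is a minimal sketch under stated assumptions, not a model answer: the three-row frame and its values are synthetic, the source labels in `rename_map` are hypothetical stand-ins for the StatsWales industry names, and the 1.5 x IQR rule is only one possible definition of an outlier. The helper `replace_with_row_mean` is an illustrative name, and it assumes the mean is taken per industry over the remaining (non-outlier) years.

```python
import numpy as np
import pandas as pd

years = [str(y) for y in range(2009, 2019)]

# Synthetic placeholder values (NOT StatsWales figures), in thousands of jobs.
raw = pd.DataFrame(
    [
        [30.0, 31.0, np.nan, 32.0, 33.0, 33.5, 34.0, 34.5, 35.0, 36.0],
        [90.0, 91.0, 92.0, 93.0, 94.0, 950.0, 96.0, 97.0, 98.0, 99.0],
        [170.0, 168.0, 169.0, 171.0, 172.0, 173.0, 174.0, 175.0, 176.0, 177.0],
    ],
    columns=years,
)
# Hypothetical source labels; the real StatsWales labels will differ.
raw.insert(0, "Industry", ["Agriculture, forestry and fishing",
                           "Construction (F)",
                           "Production (B-E)"])


def replace_with_row_mean(df, year_cols, k=1.5):
    """Replace nulls and IQR outliers in each row with that row's mean.

    Assumption: a value is an outlier if it lies more than k * IQR
    outside the row's quartiles; the mean excludes flagged values.
    """
    out = df.copy()
    values = out[year_cols].astype(float)
    q1 = values.quantile(0.25, axis=1)
    q3 = values.quantile(0.75, axis=1)
    iqr = q3 - q1
    is_outlier = (values.lt(q1 - k * iqr, axis=0)
                  | values.gt(q3 + k * iqr, axis=0))
    masked = values.mask(is_outlier)      # outliers become NaN
    row_mean = masked.mean(axis=1)        # NaN values are skipped
    out[year_cols] = masked.apply(lambda col: col.fillna(row_mean))
    return out


clean = replace_with_row_mean(raw, years)

# Step 1.3: map the (hypothetical) source labels to the target names.
rename_map = {
    "Agriculture, forestry and fishing": "Agriculture",
    "Construction (F)": "Construction",
    "Production (B-E)": "Production",
}
clean["Industry"] = clean["Industry"].replace(rename_map)
print(clean)
```

Whichever outlier rule is chosen, it is worth reporting how many values were replaced so that the cleaning step can be justified in the write-up.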
2. Data analysis
For each question, provide a graph or chart along with your own interpretation (~50 words).
2.1. Which industries employed the highest and lowest numbers of workers over the period?
2.2. Which industries had the highest and lowest overall growth over the period?
2.3. Which years were the best and worst performing in terms of total employment (highest and lowest employment)?
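The quantities behind questions 2.1 to 2.3 can be computed with a few pandas reductions. The sketch below is illustrative only: the values are synthetic, and it assumes one reading of each question (mean employment across years for 2.1, percentage change from 2009 to 2018 for 2.2, and the sum over industries per year for 2.3). Other defensible definitions exist and should be stated in the report.

```python
import pandas as pd

years = [str(y) for y in range(2009, 2019)]

# Synthetic placeholder values (NOT StatsWales figures), in thousands of jobs:
# each industry starts at `start` and changes by `step` every year.
start = {"Agriculture": 30.0, "Construction": 90.0, "Retail": 340.0, "ICT": 25.0}
step = {"Agriculture": 0.5, "Construction": -1.0, "Retail": 2.0, "ICT": 1.5}
df = pd.DataFrame(
    {y: [start[i] + n * step[i] for i in start] for n, y in enumerate(years)},
    index=pd.Index(list(start), name="Industry"),
)

# 2.1: mean employment per industry over the period (assumed definition).
mean_jobs = df.mean(axis=1)
highest_industry = mean_jobs.idxmax()
lowest_industry = mean_jobs.idxmin()

# 2.2: overall growth as percentage change between first and last year.
growth_pct = (df["2018"] - df["2009"]) / df["2009"] * 100
fastest_growing = growth_pct.idxmax()
slowest_growing = growth_pct.idxmin()

# 2.3: total employment per year, summed over industries.
total_by_year = df.sum(axis=0)
best_year = total_by_year.idxmax()
worst_year = total_by_year.idxmin()

print(highest_industry, lowest_industry)
print(fastest_growing, slowest_growing)
print(best_year, worst_year)
```

Each of the three series (`mean_jobs`, `growth_pct`, `total_by_year`) can then be charted, for example with `Series.plot.bar()`, to accompany the written interpretation.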
3. Visual analysis
Create a dynamic scatter/bubble plot showing the change in workforce numbers over the period using Plotly Express.
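Plotly Express animates a figure from long-format data, so the wide dataframe from section 1 has to be reshaped first. The following is a minimal sketch, assuming a wide frame with an `Industry` column and one column per year; the values are synthetic, and `make_bubble_figure` is a hypothetical helper name. The choice of axes and bubble size is only one option.

```python
import pandas as pd

years = [str(y) for y in range(2009, 2019)]

# Synthetic placeholder values (NOT StatsWales figures), in thousands of jobs.
start = {"Agriculture": 30.0, "Construction": 90.0, "Retail": 340.0, "ICT": 25.0}
step = {"Agriculture": 0.5, "Construction": -1.0, "Retail": 2.0, "ICT": 1.5}
wide = pd.DataFrame(
    {y: [start[i] + n * step[i] for i in start] for n, y in enumerate(years)},
    index=pd.Index(list(start), name="Industry"),
).reset_index()

# Reshape to one row per industry per year.
long_df = wide.melt(id_vars="Industry", var_name="Year", value_name="Employment")
long_df["Year"] = long_df["Year"].astype(int)
long_df = long_df.sort_values(["Year", "Industry"]).reset_index(drop=True)


def make_bubble_figure(data):
    """Build an animated bubble chart with one frame per year."""
    import plotly.express as px  # imported here; requires the plotly package

    return px.scatter(
        data,
        x="Industry",
        y="Employment",
        size="Employment",            # bubble area follows employment
        color="Industry",
        animation_frame="Year",       # adds the play button and year slider
        animation_group="Industry",   # keeps each bubble tied to its industry
        range_y=[0, data["Employment"].max() * 1.1],  # fixed axis across frames
        title="Workplace employment by industry (synthetic data)",
    )
```

In a notebook, `make_bubble_figure(long_df).show()` renders the figure. Fixing `range_y` keeps the axis stable between frames, which makes year-on-year changes easier to read.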
4. PCA/Correlation
4.1. Undertake a PCA with two principal components (the resulting columns should be PC1, PC2, Industry) and produce a scatter plot. Write your interpretation of the plot in relation to the analysis in sections 2 and 3 (for example, which industries are correlated over the years as well as in the PCA).
4.2. Compute a year-wise correlation for each industry. Are the aforementioned industries also correlated over the years? Explain your answer.
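One way to set up both parts of section 4 is sketched below using scikit-learn and pandas. It rests on several assumptions: the data are synthetic, each industry is treated as an observation with the ten years as features, the features are standardised before PCA, and "year-wise correlation" is read as the Pearson correlation between industries computed across years.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

years = [str(y) for y in range(2009, 2019)]
industries = ["Agriculture", "Construction", "Retail", "ICT", "Finance"]

# Synthetic placeholder values (NOT StatsWales figures): base + trend + noise.
rng = np.random.default_rng(0)
base = np.array([30.0, 90.0, 340.0, 25.0, 30.0])[:, None]
trend = np.array([0.5, -1.0, 2.0, 1.5, -0.3])[:, None] * np.arange(10)
values = base + trend + rng.normal(0.0, 1.0, size=(5, 10))
df = pd.DataFrame(values, index=pd.Index(industries, name="Industry"),
                  columns=years)

# 4.1: PCA with two components; rows are industries, columns are years.
scaled = StandardScaler().fit_transform(df.to_numpy())
pca = PCA(n_components=2)
scores = pca.fit_transform(scaled)
pca_df = pd.DataFrame(scores, columns=["PC1", "PC2"])
pca_df["Industry"] = df.index.to_list()

# 4.2: correlation between industries over the years.
# Transposing puts years in rows, so corr() compares industry columns.
industry_corr = df.T.corr()

print(pca_df)
print(pca.explained_variance_ratio_)
print(industry_corr.round(2))
```

`pca_df` has the column layout requested in 4.1 and can be passed directly to a scatter plot. Comparing which industries sit close together in the PC1/PC2 plane with the high entries of `industry_corr` is one way to address the question in 4.2.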
5. Clustering (k-means & hierarchical)
5.1. Using the employment data from the best and worst performing year columns (2.3), undertake a k-means clustering analysis (K=2 and K=3) and identify which industries cluster together. Write your own interpretation (~100 words).
5.2. Using the same dataset (best and worst performing years), create a hierarchical cluster. Compare the resulting clusters with the k-means clusters.
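The two clustering tasks could be prototyped as follows with scikit-learn and SciPy. This is a sketch under assumptions rather than a prescribed method: the values are synthetic, 2018 and 2009 are assumed to be the best and worst years, the two columns are standardised first, and Ward linkage is just one of several linkage choices.

```python
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder values (NOT StatsWales figures), in thousands of jobs.
# Assumption: 2009 is the worst year and 2018 the best year from question 2.3.
df = pd.DataFrame(
    {
        "2009": [30.0, 90.0, 340.0, 25.0, 30.0, 420.0],
        "2018": [34.5, 81.0, 358.0, 38.5, 27.3, 440.0],
    },
    index=pd.Index(
        ["Agriculture", "Construction", "Retail", "ICT", "Finance",
         "Public_Adminstration"],
        name="Industry",
    ),
)
X = StandardScaler().fit_transform(df)

# 5.1: k-means for K=2 and K=3; random_state fixes the initialisation.
labels = pd.DataFrame(index=df.index)
for k in (2, 3):
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    labels[f"kmeans_k{k}"] = model.fit_predict(X)

# 5.2: agglomerative (hierarchical) clustering with Ward linkage,
# then cut the tree into two flat clusters for comparison.
Z = linkage(X, method="ward")
labels["hier_k2"] = fcluster(Z, t=2, criterion="maxclust")

# Cross-tabulate the two K=2 solutions to see where they agree.
comparison = pd.crosstab(labels["kmeans_k2"], labels["hier_k2"])
print(labels)
print(comparison)
```

The linkage matrix `Z` can also be drawn with `scipy.cluster.hierarchy.dendrogram(Z, labels=df.index.to_list())`, which needs matplotlib. Cluster label numbers are arbitrary, so the comparison should look at which industries share a cluster, not at the label values themselves.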
6. Discussion
Provide a brief discussion (~300 words) of the employment landscape of Wales based on the results of the employment data analysis.
Assessment Criteria
Task                 | Marks
1.1 Data preparation | 05
1.2 Data preparation | 05
1.3 Data preparation | 05
2.1 Data analysis    | 05
2.2 Data analysis    | 05
2.3 Data analysis    | 05
3 Visual analysis    | 20
4.1 PCA              | 10
4.2 Correlation      | 10
5.1 Clustering       | 10
5.2 Clustering       | 10
6 Discussion         | 10
Submission Details
Please see Moodle for confirmation of the Assessment submission date.
The presentation will be at 4:00 PM on the submission date.
Any assessments submitted after the deadline will not be marked and will be recorded as a Non-Attempt.
The assessment must be submitted as a zip file, PDF or Word document through the Turnitin submission point in Moodle.
Your assessment should be titled with your Student ID number, module code and assessment ID, e.g. st12345678 CIS4000 WRIT1.
Feedback
Feedback for the assessment will be provided electronically via Moodle, and will normally be available 4 working weeks after initial submission. The feedback return date will be confirmed on Moodle.
Feedback will be provided in the form of a rubric, supported with comments on your strengths and the areas in which you can improve.
All marks are preliminary and are subject to quality assurance processes and confirmation at the Examination Board.
Further information on the Academic and Feedback Policy is available in the Academic Handbook (Vol 1, Section 4.0).
Marking Criteria
70-100% (1st): Addressed all sections and provided correct answers with elegant presentation of results. Applied correct data analysis approaches and provided excellent interpretation of each section.
60-69% (2:1): Addressed all sections and provided correct answers with good presentation of results. Applied mostly correct data analysis approaches and provided very good interpretation of each section.
50-59% (2:2): Addressed most of the sections and provided mostly correct answers with average presentation of results. Applied some correct data analysis approaches and provided an average interpretation of each section.
40-49% (3rd): Addressed few sections, with few correct answers, with or without any presentation of results. Applied mostly incorrect data analysis approaches and provided poor interpretation of each section.
35-39% (Narrow Fail): Addressed few sections and provided mostly incorrect answers with poor presentation of results. Applied incorrect data analysis approaches and provided poor interpretation.
<35% (Fail): Very poor report missing one or more required parts.
Additional Information
Referencing Requirements (Harvard)
The Harvard (or author-date) format should be used for all references (including images). Further information on referencing can be found on Cardiff Met's Academic Skills website.
Mitigating Circumstances
If you have experienced changes or events which have adversely affected your academic performance on the assessment, you may be eligible for Mitigating Circumstances (MCs). You should contact your Module Leader, Personal Tutor or Year Tutor in the first instance.
An application for MCs, along with appropriate supporting evidence, can be submitted via the MCs Dashboard.
Applications for MCs should ideally be submitted as soon as possible after the circumstances occur and at the time of the assessment. Applications must be submitted before the relevant Examination Board.
Further information on the Mitigating Circumstances procedure is available in the Academic Handbook (Volume 1, Section 5).
Unfair Practice
Cardiff Metropolitan University takes issues of unfair practice extremely seriously. The University has distinct procedures and penalties for dealing with unfair practice in examination or non-examination conditions. These are explained in full in the University's Unfair Practice Procedure (Academic Handbook: Vol 1, Section 8).
Types of Unfair Practice include:
Plagiarism, which can be defined as using another person's words or ideas without acknowledgement and submitting them for assessment as though they were one's own work, for instance by copying, translating from one language to another or unacknowledged paraphrasing. Further examples include:
Use of any quotation(s) from the published or unpublished work of other persons, whether published in textbooks, articles, the Web, or in any other format, where the quotations have not been clearly identified as such by being placed in quotation marks and acknowledged.
Use of another person's words or ideas that have been slightly changed or paraphrased to make them look different from the original.
Summarising another person's ideas, judgments, diagrams, figures, or computer programmes without reference to that person in the text and the source in a bibliography or reference list.
Use of services of essay banks and/or any other agencies.
Use of unacknowledged material downloaded from the Internet.
Re-use of one's own material except as authorised by the department.
Collusion, which can be defined as when work that has been undertaken with others is submitted and passed off as solely the work of one person. An example of this would be where several students work together on an assessment and individually submit work which contains sections which are the same. Assessment briefs will clearly identify where joint preparation and joint submission are specifically permitted; in all other cases they are not.
Fabrication of data, making false claims to have carried out experiments, observations, interviews or other forms of data collection and analysis, or acting dishonestly in any other way.