Data Management and Visualisation Week-2 Assignment
Hello everyone. I am writing this blog post as part of the week-2 assignment for the Coursera course Data Management and Visualisation. Each week's assignment is a blog post presenting the research work done during that week.
The week-2 assignment is about running a first program in Python (using the Spyder IDE).
The task is to load the dataset and display the variables I chose to work on in week 1. For each variable, the program shows which values it takes and how many times each value occurs.
1) Step 1: My program
Below is the code I implemented:
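The listing itself does not appear in this text, so what follows is only a minimal sketch of a program that would produce output in the same shape as Step 2; it is not the original code. The small `data` frame and the helper name `frequency_tables` are assumptions made for illustration, and the real program would load the course CSV with `pd.read_csv` instead of building a frame by hand.

```python
import pandas as pd

# Tiny stand-in for the crater dataset (hypothetical values).
# The real program would read the course CSV with pd.read_csv instead.
data = pd.DataFrame({
    "LATITUDE_CIRCLE_IMAGE": [-86.700, -86.560, 84.969, 84.969],
    "LONGITUDE_CIRCLE_IMAGE": [-179.997, -179.992, -179.992, 179.997],
    "MORPHOLOGY_EJECTA_1": [" ", "DLEPC", " ", "SLERS/Rd"],
})


def frequency_tables(df, column):
    """Return (counts, percentages) for one column, ordered by value."""
    # groupby(...).size() counts the rows for each distinct value.
    counts = df.groupby(column).size()
    # Convert the counts to percentages of all observations.
    percentages = counts * 100 / len(df)
    return counts, percentages


print("The number of observations are:", len(data))
print("The number of variables in the dataset are:", len(data.columns))

for name in data.columns:
    counts, percentages = frequency_tables(data, name)
    print(f"The count of {name} variable is as below:")
    print(counts)
    print(f"The percentage of {name} variable is as below:")
    print(percentages)
```

Using `groupby(...).size()` keeps the values in sorted order, which matches the way the tables in Step 2 run from the smallest to the largest value.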
2) Step 2: The output that displays three of your variables as frequency tables
Here is the output of the code:
The number of observations are: 384343
The number of variables in the dataset are: 10
The count of LATITUDE_CIRCLE_IMAGE variable is as below:
LATITUDE_CIRCLE_IMAGE
-86.700 1
-86.560 1
-85.988 1
-85.973 1
-85.560 1
..
84.969 1
85.008 1
85.085 1
85.097 1
85.702 1
Length: 129197, dtype: int64
The percentage of LATITUDE_CIRCLE_IMAGE variable is as below:
LATITUDE_CIRCLE_IMAGE
-86.700 0.00026
-86.560 0.00026
-85.988 0.00026
-85.973 0.00026
-85.560 0.00026
84.969 0.00026
85.008 0.00026
85.085 0.00026
85.097 0.00026
85.702 0.00026
Length: 129197, dtype: float64
The count of LONGITUDE_CIRCLE_IMAGE variable is as below:
LONGITUDE_CIRCLE_IMAGE
-179.997 1
-179.993 1
-179.992 2
-179.991 2
-179.990 1
..
179.992 1
179.993 1
179.994 1
179.996 1
179.997 1
Length: 231245, dtype: int64
The percentage of LONGITUDE_CIRCLE_IMAGE variable is as below:
LONGITUDE_CIRCLE_IMAGE
-179.997 0.00026
-179.993 0.00026
-179.992 0.00052
-179.991 0.00052
-179.990 0.00026
179.992 0.00026
179.993 0.00026
179.994 0.00026
179.996 0.00026
179.997 0.00026
Length: 231245, dtype: float64
The count of MORPHOLOGY_EJECTA_1 variable is shown as below:
MORPHOLOGY_EJECTA_1
339718
DLEPC 232
DLEPC/DLEPCPd 4
DLEPC/DLEPS 145
DLEPC/DLEPS/Rd 2
SLERS/Rd 281
SLERS/Rd/SLERS 1
SLERSPd 16
SLERSRd 4
SLErS 1
Length: 156, dtype: int64
The percentage of MORPHOLOGY_EJECTA_1 variable is as below:
MORPHOLOGY_EJECTA_1
88.389277
DLEPC 0.060363
DLEPC/DLEPCPd 0.001041
DLEPC/DLEPS 0.037727
DLEPC/DLEPS/Rd 0.000520
SLERS/Rd 0.073112
SLERS/Rd/SLERS 0.000260
SLERSPd 0.004163
SLERSRd 0.001041
SLErS 0.000260
Length: 156, dtype: float64
3) Step 3: A few sentences describing your frequency distributions in terms of the values the variables take, how often they take them, the presence of missing data, etc.
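Latitude and longitude are continuous variables, so almost every value occurs only once or twice: the 384,343 observations contain 129,197 distinct latitudes and 231,245 distinct longitudes, and a single occurrence accounts for only about 0.00026% of the data. From these percentages it can be concluded that some latitudes and longitudes have very few craters recorded on the surface. MORPHOLOGY_EJECTA_1 is categorical with 156 distinct values, and by far its most frequent value is a blank label, which covers 339,718 craters (about 88.39%).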
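Because nearly every exact latitude occurs only once, the per-value table says little about where craters are sparse. One possible way to make that visible is to group latitudes into bands before counting. The sketch below assumes 30-degree bands and uses a handful of made-up latitudes; the band width and the sample values are illustrative choices, not part of the original analysis.

```python
import pandas as pd

# Hypothetical latitudes standing in for LATITUDE_CIRCLE_IMAGE.
latitudes = pd.Series(
    [-86.7, -85.56, -40.2, -12.5, 3.1, 22.8, 47.9, 85.702],
    name="LATITUDE_CIRCLE_IMAGE",
)

# Band edges every 30 degrees from -90 to 90 (an assumed width).
edges = list(range(-90, 91, 30))

# Assign each latitude to a band, then count the craters per band.
bands = pd.cut(latitudes, bins=edges)
band_counts = bands.value_counts(sort=False).sort_index()
band_percentages = band_counts * 100 / len(latitudes)

print(band_counts)
print(band_percentages)
```

With the real data, bands with small counts would point to the latitude ranges where few craters are recorded.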
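The null check reports no missing or null values in any column. However, the blank MORPHOLOGY_EJECTA_1 entries seen in the frequency table (339,718 rows) are not counted as null, so the ejecta morphology is effectively unrecorded for most craters.
Output of the null check: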
CRATER_ID 0
CRATER_NAME 0
LATITUDE_CIRCLE_IMAGE 0
LONGITUDE_CIRCLE_IMAGE 0
DIAM_CIRCLE_IMAGE 0
DEPTH_RIMFLOOR_TOPOG 0
MORPHOLOGY_EJECTA_1 0
MORPHOLOGY_EJECTA_2 0
MORPHOLOGY_EJECTA_3 0
NUMBER_LAYERS 0
dtype: int64
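A per-column table of zeros like the one above is what `isnull().sum()` prints in pandas when no cell is NaN. Blank strings are not NaN, which would explain why the blank MORPHOLOGY_EJECTA_1 label does not show up here. The sketch below, on a small made-up frame, shows the difference and one way to count blanks as missing; the sample values and the choice to treat whitespace-only strings as missing are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: two blank morphology entries, none of them NaN.
data = pd.DataFrame({
    "MORPHOLOGY_EJECTA_1": [" ", "DLEPC", "", "SLERS/Rd"],
    "NUMBER_LAYERS": [0, 2, 0, 1],
})

# Blank strings are ordinary values, so the null count stays at zero.
null_counts = data.isnull().sum()
print(null_counts)

# Treat empty or whitespace-only strings as missing, then count again.
cleaned = data.replace(r"^\s*$", np.nan, regex=True)
blank_counts = cleaned.isnull().sum()
print(blank_counts)
```

After the replacement, the blank entries appear in the null count, which gives a more realistic picture of how much of the column is actually filled in.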