Data Management and Visualisation Week-2 Assignment

Aanshi Patwari
Aug 25, 2020

Hello everyone, I am writing this blog post as part of the week-2 assignment for the Coursera course Data Management and Visualisation. Each week's assignment is to write a blog post presenting the research work done during that week.

In week 2, the assignment is about running the first program in Python (using the Spyder IDE).

The task is to load the dataset and display the variables I decided to work on in week 1. The job is to show which values each variable takes and how many times each value occurs.

1) Step 1: My program

So, below is the code I have implemented:
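A minimal sketch of this kind of program, assuming pandas is installed. The function name frequency_tables and the tiny inline table are illustrative assumptions; with the real data, the table would instead be loaded from the course's CSV file with pandas.read_csv.

```python
import pandas as pd


def frequency_tables(data, column):
    """Return (counts, percentages) for one column, ordered by value."""
    # Number of rows that hold each distinct value of the column
    counts = data.groupby(column).size()
    # Same table expressed as a percentage of all observations
    percentages = counts * 100 / len(data)
    return counts, percentages


# Hypothetical stand-in rows; the real dataset has 384343 rows and 10 columns
data = pd.DataFrame({
    "LATITUDE_CIRCLE_IMAGE": [-86.700, -86.560, 85.702, 85.702],
    "MORPHOLOGY_EJECTA_1": [" ", "DLEPC", " ", " "],
})

print("The number of observations are:", len(data))
print("The number of variables in the dataset are:", len(data.columns))

for column in ["LATITUDE_CIRCLE_IMAGE", "MORPHOLOGY_EJECTA_1"]:
    counts, percentages = frequency_tables(data, column)
    print("The count of", column, "variable is as below:")
    print(counts)
    print("The percentage of", column, "variable is as below:")
    print(percentages)
```

Grouping by the column and taking the group sizes gives the counts sorted by value, and dividing by the number of rows turns them into percentages.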

2) Step 2: The output that displays three of your variables as frequency tables

Here is the output of the code:

The number of observations are: 384343

The number of variables in the dataset are: 10

The count of LATITUDE_CIRCLE_IMAGE variable is as below:

LATITUDE_CIRCLE_IMAGE

-86.700 1

-86.560 1

-85.988 1

-85.973 1

-85.560 1

..

84.969 1

85.008 1

85.085 1

85.097 1

85.702 1

Length: 129197, dtype: int64

The percentage of LATITUDE_CIRCLE_IMAGE variable is as below:

LATITUDE_CIRCLE_IMAGE

-86.700 0.00026

-86.560 0.00026

-85.988 0.00026

-85.973 0.00026

-85.560 0.00026

84.969 0.00026

85.008 0.00026

85.085 0.00026

85.097 0.00026

85.702 0.00026

Length: 129197, dtype: float64

The count of LONGITUDE_CIRCLE_IMAGE variable is as below:

LONGITUDE_CIRCLE_IMAGE

-179.997 1

-179.993 1

-179.992 2

-179.991 2

-179.990 1

..

179.992 1

179.993 1

179.994 1

179.996 1

179.997 1

Length: 231245, dtype: int64

The percentage of LONGITUDE_CIRCLE_IMAGE variable is as below:

LONGITUDE_CIRCLE_IMAGE

-179.997 0.00026

-179.993 0.00026

-179.992 0.00052

-179.991 0.00052

-179.990 0.00026

179.992 0.00026

179.993 0.00026

179.994 0.00026

179.996 0.00026

179.997 0.00026

Length: 231245, dtype: float64

The count of MORPHOLOGY_EJECTA_1 variable is shown as below:

MORPHOLOGY_EJECTA_1

339718

DLEPC 232

DLEPC/DLEPCPd 4

DLEPC/DLEPS 145

DLEPC/DLEPS/Rd 2

SLERS/Rd 281

SLERS/Rd/SLERS 1

SLERSPd 16

SLERSRd 4

SLErS 1

Length: 156, dtype: int64

The percentage of MORPHOLOGY_EJECTA_1 variable is as below:

MORPHOLOGY_EJECTA_1

88.389277

DLEPC 0.060363

DLEPC/DLEPCPd 0.001041

DLEPC/DLEPS 0.037727

DLEPC/DLEPS/Rd 0.000520

SLERS/Rd 0.073112

SLERS/Rd/SLERS 0.000260

SLERSPd 0.004163

SLERSRd 0.001041

SLErS 0.000260

Length: 156, dtype: float64

3) Step 3: A few sentences describing your frequency distributions in terms of the values the variables take, how often they take them, the presence of missing data, etc.

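The dataset records a latitude and a longitude for each crater. These are continuous variables: across the 384343 observations there are 129197 distinct latitudes and 231245 distinct longitudes, and almost every value occurs only once or twice (about 0.00026% to 0.00052% of the observations each). From these percentages it can be concluded that any single latitude or longitude holds very few craters. MORPHOLOGY_EJECTA_1 is categorical with 156 distinct values; the most frequent one is a blank entry (339718 craters, about 88.39%), and every named morphology shown, such as DLEPC or SLERS/Rd, accounts for well under 1%.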
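Since the latitude table above has 129197 distinct values, one way to make its distribution easier to read is to group latitudes into bands before counting. The following is only a sketch, assuming pandas; the 30-degree band edges and the sample latitudes are hypothetical choices, not part of the program above.

```python
import pandas as pd

# Hypothetical sample latitudes standing in for LATITUDE_CIRCLE_IMAGE
latitudes = pd.Series(
    [-86.7, -45.2, -10.0, 5.5, 12.3, 48.9, 85.7],
    name="LATITUDE_CIRCLE_IMAGE",
)

# Assign each latitude to a 30-degree band; each band includes its upper edge
bands = pd.cut(latitudes, bins=[-90, -60, -30, 0, 30, 60, 90])

# Count craters per band, ordered from south to north
band_counts = bands.value_counts().sort_index()
band_percentages = band_counts * 100 / len(latitudes)

print(band_counts)
print(band_percentages)
```

With six bands instead of thousands of distinct values, the counts show directly which latitude ranges hold more or fewer craters.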

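There are no null (NaN) values anywhere in the dataset, as the per-column counts below show. Note, however, that the blank entries of MORPHOLOGY_EJECTA_1 in the frequency table above are not counted as null, so they effectively represent missing ejecta morphology data.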

OUTPUT:

CRATER_ID 0

CRATER_NAME 0

LATITUDE_CIRCLE_IMAGE 0

LONGITUDE_CIRCLE_IMAGE 0

DIAM_CIRCLE_IMAGE 0

DEPTH_RIMFLOOR_TOPOG 0

MORPHOLOGY_EJECTA_1 0

MORPHOLOGY_EJECTA_2 0

MORPHOLOGY_EJECTA_3 0

NUMBER_LAYERS 0

dtype: int64
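Per-column null counts like the ones above can be produced with isnull().sum(). Because the MORPHOLOGY_EJECTA_1 table contains a blank category, it may also be worth counting blank strings as missing. A minimal sketch, assuming pandas and NumPy; the sample rows and CRATER_ID values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in rows with blank morphology entries
data = pd.DataFrame({
    "CRATER_ID": ["01-000000", "01-000001", "01-000002"],
    "MORPHOLOGY_EJECTA_1": [" ", "DLEPC", ""],
})

# Null (NaN) values per column; blank strings are not counted here
null_counts = data.isnull().sum()
print(null_counts)

# Treat empty or whitespace-only strings as missing, then count again
cleaned = data.replace(r"^\s*$", np.nan, regex=True)
blank_counts = cleaned.isnull().sum()
print(blank_counts)
```

The first count reports zero for every column, while the second one also picks up the blank morphology entries.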
