1. Preprocessing, combined

๐Ÿ† ๋จผ์ € ๋ฏธ๋ฆฌ ์ „์ฒ˜๋ฆฌํ•˜๊ณ  ๊ฐ€๊ณตํ•œ ๋ฐ์ดํ„ฐ density_2013_tot ์„ ๊ฐ€์ ธ์˜ค์ž.
๐Ÿ† ์ง€๋‚œ ํฌ์ŠคํŒ… (1),(2) ์—์„œ ๋งŒ๋“  ์ „์ฒ˜๋ฆฌ ์ฝ”๋“œ๋ฅผ ํ•œ๋ฒˆ์— ์ข…ํ•ฉํ–ˆ๋‹ค.

import pandas as pd

pop = pd.read_csv("Data/state-population.csv", encoding = 'utf-8-sig')
area = pd.read_csv("Data/state-areas.csv", encoding = 'utf-8-sig')
abb = pd.read_csv("Data/state-abbrevs.csv", encoding = 'utf-8-sig')

pop = pop.dropna()
cols = pop.columns.to_list()
cols[0] = 'abbreviation'
pop.columns = cols
pop_tot = pop[pop['ages'] == 'total']
pop_18 = pop[pop['ages'] == 'under18']

abb_area = pd.merge(area, abb, on = 'state', how = 'outer')
abb_area = abb_area.fillna("PR")

pop_age_tot_final = pd.merge(pop_tot, abb_area, on = 'abbreviation', how = 'outer').dropna()
pop_age_18_final = pd.merge(pop_18, abb_area, on = 'abbreviation', how = 'outer').dropna()

pop_age_tot_final['pop/area'] = pop_age_tot_final['population'] / pop_age_tot_final['area (sq. mi)']
pop_age_18_final['pop/area'] = pop_age_18_final['population'] / pop_age_18_final['area (sq. mi)']

density_2013_tot = pop_age_tot_final[pop_age_tot_final['year']== 2013]
density_2013_tot = density_2013_tot.sort_values('pop/area', ascending=False)

density_2013_tot = density_2013_tot[['abbreviation', 'pop/area']]
density_2013_tot = density_2013_tot.set_index('abbreviation')
density_2013_tot_top10 = density_2013_tot[:10] 

density_2013_tot_top10
>>
                pop/area
abbreviation	
DC	        9506.602941
PR	        1028.473969
NJ	        1020.332378
RI	        680.589644
CT	        648.643579
MA	        634.090384
MD	        477.860401
DE	        473.771238
NY	        360.736613
FL	        297.345722

The abbreviation column, which holds each region's abbreviation, is set as the index.
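The merge, fill, and top-N steps above can be sketched on a tiny made-up dataset. This is only an illustration under assumptions: the three small frames and their values are invented stand-ins for the real CSV files, and it swaps in two variations that the code above does not use, namely filling only the abbreviation column (rather than every missing cell) and `Series.nlargest` in place of sort-then-slice.

```python
import pandas as pd

# Toy stand-ins for the three CSV files; every value here is invented.
pop = pd.DataFrame({
    "abbreviation": ["AA", "BB", "PR"],
    "ages": ["total", "total", "total"],
    "year": [2013, 2013, 2013],
    "population": [1000.0, 300.0, 50.0],
})
area = pd.DataFrame({
    "state": ["Alpha", "Beta", "Puerto Rico"],
    "area (sq. mi)": [10.0, 100.0, 1.0],
})
abb = pd.DataFrame({"state": ["Alpha", "Beta"], "abbreviation": ["AA", "BB"]})

# The outer merge keeps Puerto Rico even though it has no abbreviation row.
abb_area = pd.merge(area, abb, on="state", how="outer")
# Fill only the abbreviation column, so other columns are never touched.
abb_area["abbreviation"] = abb_area["abbreviation"].fillna("PR")

merged = pd.merge(pop, abb_area, on="abbreviation", how="inner")
merged["pop/area"] = merged["population"] / merged["area (sq. mi)"]

# Keep the 2013 totals, index by abbreviation, and take the two densest rows.
top2 = (
    merged[(merged["year"] == 2013) & (merged["ages"] == "total")]
    .set_index("abbreviation")["pop/area"]
    .nlargest(2)
    .to_frame()
)
print(top2)
```

Filling a single column keeps a stray missing value elsewhere (for example in the area column) visible instead of silently turning it into the string "PR".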

2. Visualization

2.1. iplot


๐Ÿ† ๋จผ์ € iplot์œผ๋กœ ์‹œ๊ฐํ™”ํ•ด๋ณด์ž.
๐Ÿ† iplot ์„ ์ž„ํฌํŠธํ•ด์ค€๋‹ค.

import chart_studio.plotly as py
import cufflinks as cf
cf.go_offline(connected = True)

๐Ÿ† ์ด์ œ iplot์˜ layout์„ ์„ค์ •ํ•˜์ž.

layout = {
    'title' : {
        'text' : '<b>Population / Area about total ages in 2013</b>', 
        'font' : {
            'size' : 20
        },
        'x' : 0.5
    },
    
    'xaxis' : {
        'showticklabels' : True,
        'title': {
            'text' : 'Abbreviation',
            'font' : {'size' : 15}
        }
    },

    'yaxis' : {
        'showticklabels' : True,
        'dtick' : 1000,
        'title' : {
            'text' : 'pop/area',
            'font' : {'size' : 15}
        }
    }
}  

๐Ÿ† ๋‹จ์ˆœ ์ˆ˜์น˜ ๋น„๊ต์ด๋ฏ€๋กœ bar ๊ทธ๋ž˜ํ”„๋กœ ์‹œ๊ฐํ™”ํ•œ๋‹ค.

density_2013_tot_top10.iplot(kind = 'bar', layout = layout)

2.2. plotly


๐Ÿ† ์ด๋ฒˆ์—๋Š” plotly๋กœ ์‹œ๊ฐํ™”ํ•ด๋ณด์ž.
๐Ÿ† plotly๋ฅผ ์ž„ํฌํŠธํ•ด์ค€๋‹ค.

import plotly.graph_objects as go
import plotly.offline as pyo
pyo.init_notebook_mode()

๐Ÿ† ๊ทธ๋ž˜ํ”„์˜ ํ…œํ”Œ๋ฆฟ์„ ์ •ํ•ด์ฃผ๊ณ  ์‹ถ๋‹ค๋ฉด plotly์˜ ํ…œํ”Œ๋ฆฟ ๋ชฉ๋ก์„ ๋ถˆ๋Ÿฌ์™€์„œ ํ™•์ธํ•ด๋ณด์ž.

import plotly.io as pio
pio.templates
>>
Templates configuration
-----------------------
    Default template: 'plotly'
    Available templates:
        ['ggplot2', 'seaborn', 'simple_white', 'plotly',
         'plotly_white', 'plotly_dark', 'presentation', 'xgridoff',
         'ygridoff', 'gridon', 'none']

๐Ÿ† ์ธ๊ตฌ์ˆ˜๊ฐ€ ์ œ์ผ ๋งŽ์€ DC ์ง€์—ญ์˜ ๊ทธ๋ž˜ํ”„๋Š” ์ƒ‰๊น”์„ ๋‹ค๋ฅด๊ฒŒ ํ•ด์ฃผ๊ธฐ ์œ„ํ•ด colors ๋ฆฌ์ŠคํŠธ๋ฅผ ํ•˜๋‚˜ ๋งŒ๋“ค์–ด์ค€๋‹ค.

colors = ['#04BFAD',] * len(density_2013_tot_top10)
colors[0] = '#F25C5C'
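Setting `colors[0]` works because the frame is already sorted in descending order. As a hedged alternative, the highlight can be derived from the values themselves, so it would still land on the right bar if the order changed. The `density` dict below is an assumption for illustration: it holds only the first three rows of the top-10 output shown earlier, and the names `BASE`, `HIGHLIGHT`, and `peak` are made up here.

```python
# First three rows of the top-10 output above, copied by hand for illustration.
density = {"DC": 9506.602941, "PR": 1028.473969, "NJ": 1020.332378}

BASE, HIGHLIGHT = "#04BFAD", "#F25C5C"

# Find the key with the largest value, then color that bar differently.
peak = max(density, key=density.get)
colors = [HIGHLIGHT if key == peak else BASE for key in density]

print(peak, colors)
```

With the real data the same idea would use `density_2013_tot_top10['pop/area'].idxmax()` to find the label to highlight.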

๐Ÿ† layout๊ณผ annotation์„ ์„ค์ •ํ•ด์ฃผ๊ณ  ๊ทธ๋ž˜ํ”„๋ฅผ ์ถœ๋ ฅํ•˜์ž.

fig = go.Figure()

fig.add_trace(
    go.Bar(
    x = density_2013_tot_top10.index, y = density_2013_tot_top10['pop/area'],
    marker_color = colors
    )
)

fig.update_layout(
    {
        'title' : {
            'text' : '<b>Population / Area about total ages in 2013</b>',
            'font' : {'size' : 20},
            'x' : 0.5
            },
        
        'xaxis' : {'title' : {'text' : 'Abbreviation'}, 'showticklabels' : True},
        'yaxis' : {'title' : {'text' : 'pop/area'}, 'showticklabels' : True, 'dtick' : 1000},
        
        'template' : 'plotly_white'
    }
)

fig.add_annotation({
    'x' : "DC",
    'y' : 9550,
    
    'text' : 'pop / area in DC',
    'showarrow' : True,
    'font' : {'size' : 10, 'color' : 'white'},
    
    'align' : 'center',
    'arrowhead' : 2,
    'arrowsize' : 1,
    'arrowwidth' : 2,
    'arrowcolor' : '#04BFAD',
    
    'ax' : 20, 'ay' : -50,
    
    'bordercolor' : '#04BFAD',
    'borderwidth' : 2,
    'borderpad' : 7,
    'bgcolor' : '#F25C5C',
    
    'opacity' : 0.9
})
fig.show()


👉 The analysis shows that Washington DC's population per area is overwhelmingly higher than anywhere else. I suspect this is because many administrative districts are packed into a fairly small area of only about 177 square kilometers. The gap between Washington and the other regions is so large that comparing the remaining regions with one another on this chart loses much of its meaning. I think it is better to treat this data simply as a list of the few regions in the United States with the highest population per area.
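To put a number on that gap, here is a small sketch using the densities printed in the top-10 output above (only the first five rows, copied by hand). The figures are people per square mile, since the area column is `area (sq. mi)`. The conversion constant and the variable names are assumptions added for illustration.

```python
# Densities (people per square mile) copied from the top-10 output above.
density = {
    "DC": 9506.602941, "PR": 1028.473969, "NJ": 1020.332378,
    "RI": 680.589644, "CT": 648.643579,
}

SQ_KM_PER_SQ_MI = 2.589988  # 1 square mile is about 2.59 square kilometers

# Highest density among everything except DC
runner_up = max(value for key, value in density.items() if key != "DC")
ratio = density["DC"] / runner_up
print(f"DC is about {ratio:.1f}x the runner-up")

# The same figures expressed per square kilometer
per_sq_km = {key: value / SQ_KM_PER_SQ_MI for key, value in density.items()}
print(per_sq_km)
```

A ratio this large is why the other bars look nearly flat on a linear axis. Dropping DC or using a logarithmic y axis would be two ways to compare the remaining regions.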


๐Ÿ† ์ง์ ‘ ์•„๋ฌด ๋ฐ์ดํ„ฐ๋ฅผ ๊ฐ€์ง€๊ณ  ๋ถ„์„์„ ์ง„ํ–‰ํ•œ ๊ฒƒ์€ ์ฒ˜์Œ์ธ๋ฐ ํ™•์‹คํžˆ ์ง์ ‘ ํ•ด๋ณผ๋•Œ ๋ชธ์— ์ต๋Š”๋‹ค. ์šด์ข‹๊ฒŒ๋„ ๊ฒฐ์ธก์น˜๊ฐ€ ๋ณ„๋กœ ์—†๋Š” ๊นจ๋—ํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ๊ณจ๋ž์ง€๋งŒ ์ฒ˜์Œ์ด๋ผ ๊ทธ๋Ÿฐ์ง€ ์ด๊ฒƒ๋„ ์‹œ๊ฐ„์ด ์กฐ๊ธˆ ๋“ค์—ˆ๋‹ค. ๋‹ค์–‘ํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ๊ฒฝํ—˜ํ•ด๋ณด์ž!!
๐Ÿ† plotly ๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๋ฐ์— ์ข€ ๋” ์ต์ˆ™ํ•ด์งˆ ํ•„์š”๊ฐ€ ์žˆ์„ ๊ฒƒ ๊ฐ™๋‹ค.
๐Ÿ† Matplotlib๋„ ๊ณต๋ถ€ํ•˜๊ธฐ.


💡 The data comes from the GitHub repository of Jake VanderPlas, author of the Python Data Science Handbook (Wikibooks, 2020).
