🏆 It has been a while since I last worked with the planets data. Honestly, I didn't think it was important enough to revisit, but while I was laid up with a herniated disc I thought about what I had left undone, and realized that I had only visualized the data without ever properly analyzing what it means. That's why I'm bringing back a dataset I worked with about two months ago 😉.

🏆 This post shows how to use a for loop when building plotly visualizations.


1. Without a for loop

🏆 Let's start with the graph from the previous post. It is the main reason I decided the planets data needed a different kind of analysis.

🏆 As the bar chart above shows, the observation methods other than Radial Velocity and Transit do not have particularly notable counts (every single planet discovery is of course a huge achievement in the history of astronomy, but I set that aside for this dataset). The fact that only those two methods have large counts does not tell us much on its own, so a different analysis is needed. The approach I came up with is to look at how each method's count changes by year. That should make it easier to see which methods stand out over time and which ones are newly emerging. Let's get started 🏃‍♂️.


일단 데이터λ₯Ό 뢈러였자.

import pandas as pd
import seaborn as sns

planets = sns.load_dataset('planets')
planets.head()
>>
    method           number  orbital_period  mass   distance  year
0   Radial Velocity  1       269.300         7.10   77.40     2006
1   Radial Velocity  1       874.774         2.21   56.95     2008
2   Radial Velocity  1       763.000         2.60   19.84     2011
3   Radial Velocity  1       326.030         19.40  110.62    2007
4   Radial Velocity  1       516.220         10.50  119.47    2009

We will process the data in two different shapes, so the DataFrames are named with the suffixes _1 and _2.

planets_year_method_1 = planets.groupby(['method', 'year']).agg({'number':'sum'})
planets_year_method_1 = planets_year_method_1.reset_index()
planets_year_method_1
>> 
     method                     year  number
0    Astrometry                 2010  1
1    Astrometry                 2013  1
2    Eclipse Timing Variations  2008  4
3    Eclipse Timing Variations  2009  1
4    Eclipse Timing Variations  2010  4
...  ...                        ...   ...
64   Transit                    2014  93
65   Transit Timing Variations  2011  2
66   Transit Timing Variations  2012  2
67   Transit Timing Variations  2013  2
68   Transit Timing Variations  2014  3

69 rows × 3 columns

The planets_year_method_1 DataFrame, with the columns method, year, and number, is now ready 😉.
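To make the groupby step more concrete, here is a minimal sketch on a hand-made frame. The name `toy` and all of its values are invented for illustration and are not rows from the planets dataset; only the column names mirror the ones used above.

```python
import pandas as pd

# Toy data, invented for illustration; not taken from the real planets dataset.
toy = pd.DataFrame({
    "method": ["Transit", "Transit", "Transit", "Imaging"],
    "year":   [2010, 2010, 2011, 2010],
    "number": [1, 2, 1, 1],
})

# Sum `number` for each (method, year) pair; the pair becomes a MultiIndex.
grouped = toy.groupby(["method", "year"]).agg({"number": "sum"})

# reset_index() turns the two index levels back into ordinary columns.
flat = grouped.reset_index()
print(flat)
```

The two Transit rows for 2010 collapse into a single row whose number is their sum, which is the same thing that happens to the real data above.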

Let's visualize it as a bar chart first.

import plotly.graph_objects as go
import plotly.offline as pyo
pyo.init_notebook_mode()

fig = go.Figure()
fig.add_trace(
        go.Bar(
            x = planets_year_method_1[planets_year_method_1['method']=='Imaging']['year'],
            y = planets_year_method_1[planets_year_method_1['method']=='Imaging']['number'], name = 'Imaging'))

fig.add_trace(
        go.Bar(
            x = planets_year_method_1[planets_year_method_1['method']=='Radial Velocity']['year'],
            y = planets_year_method_1[planets_year_method_1['method']=='Radial Velocity']['number'], name = 'Radial Velocity'))

fig.add_trace(
        go.Bar(
            x = planets_year_method_1[planets_year_method_1['method']=='Eclipse Timing Variations']['year'],
            y = planets_year_method_1[planets_year_method_1['method']=='Eclipse Timing Variations']['number'], name = 'Eclipse Timing Variations'))

fig.add_trace(
        go.Bar(
            x = planets_year_method_1[planets_year_method_1['method']=='Transit']['year'],
            y = planets_year_method_1[planets_year_method_1['method']=='Transit']['number'], name = 'Transit'))

fig.add_trace(
        go.Bar(
            x = planets_year_method_1[planets_year_method_1['method']=='Microlensing']['year'],
            y = planets_year_method_1[planets_year_method_1['method']=='Microlensing']['number'], name = 'Microlensing'))
fig.show()

Since no for loop is used here, I plotted only the five methods with relatively many observations. The result is shown below.

The bar chart is too crowded to read at a glance. To see the overall trend, let's draw a line graph instead.

fig = go.Figure()
fig.add_trace(
        go.Scatter(
            x = planets_year_method_1[planets_year_method_1['method']=='Imaging']['year'],
            y = planets_year_method_1[planets_year_method_1['method']=='Imaging']['number'], name = 'Imaging', mode = 'lines'))

fig.add_trace(
        go.Scatter(
            x = planets_year_method_1[planets_year_method_1['method']=='Radial Velocity']['year'],
            y = planets_year_method_1[planets_year_method_1['method']=='Radial Velocity']['number'], name = 'Radial Velocity', mode = 'lines'))

fig.add_trace(
        go.Scatter(
            x = planets_year_method_1[planets_year_method_1['method']=='Eclipse Timing Variations']['year'],
            y = planets_year_method_1[planets_year_method_1['method']=='Eclipse Timing Variations']['number'], name = 'Eclipse Timing Variations', mode = 'lines'))

fig.add_trace(
        go.Scatter(
            x = planets_year_method_1[planets_year_method_1['method']=='Transit']['year'],
            y = planets_year_method_1[planets_year_method_1['method']=='Transit']['number'], name = 'Transit', mode = 'lines'))

fig.add_trace(
        go.Scatter(
            x = planets_year_method_1[planets_year_method_1['method']=='Microlensing']['year'],
            y = planets_year_method_1[planets_year_method_1['method']=='Microlensing']['number'], name = 'Microlensing', mode = 'lines'))
fig.show()

👉 Plotted this way, the overall shape is much clearer. Radial Velocity has stood out steadily since the 1990s, while Transit grew gradually from the early 2000s and becomes the dominant method in the 2010s. The other three methods seem to have appeared only recently. The graph suggests that the field is not relying on just one or two methods and that other techniques are developing as well.
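The impression that some methods only show up recently can also be checked numerically. The following is a minimal sketch, assuming a frame shaped like planets_year_method_1; the name `toy` and its values are invented for illustration and are not taken from the planets dataset.

```python
import pandas as pd

# Toy stand-in for planets_year_method_1 (values invented for illustration).
toy = pd.DataFrame({
    "method": ["Radial Velocity", "Radial Velocity", "Transit", "Imaging"],
    "year":   [1995, 2005, 2002, 2008],
    "number": [1, 4, 2, 1],
})

# Earliest year recorded for each method, sorted from oldest to newest.
first_seen = toy.groupby("method")["year"].min().sort_values()
print(first_seen)
```

Running the same two lines on the real frame would list the first year in which each method appears, which may be easier to read than eyeballing where each line starts.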

2. Using a for loop

🏆 This time, let's use a for loop. I had been frustrated because, no matter which books I checked or how much I googled, I could hardly find anything on using a for loop with plotly, and I think I now understand why people tend not to bother (or, far more likely, I just don't know how…😒).

Let's plot it right away.

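fig = go.Figure()
# Note: this iterates over every row of the 'method' column (69 rows),
# not over the distinct method names, so one trace is added per row.
for i in planets_year_method_1['method']:
    fig.add_trace(
        go.Scatter(
            x = planets_year_method_1[planets_year_method_1['method']==i]['year'],
            y = planets_year_method_1[planets_year_method_1['method']==i]['number'], name = i, mode = 'lines'))

fig.show()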

🔑 As shown above, the add_trace() call and the graph object (go.⚪) go inside the for loop, so one graph object is created per loop iteration. The result is as follows.

As the graph above shows, the same method appears in the legend more than once. There are even duplicated colors in the legend 😂. I assumed this happened because the loop pulls in every value in the column, and I expected that grouping with method as the index might solve it. The code and result are as follows.

planets_year_method_2 = planets.groupby(['method', 'year']).agg({'number':'sum'})
planets_year_method_2
>>
                                 number
method                     year
Astrometry                 2010  1
                           2013  1
Eclipse Timing Variations  2008  4
                           2009  1
                           2010  4
...                        ...   ...
Transit                    2014  93
Transit Timing Variations  2011  2
                           2012  2
                           2013  2
                           2014  3

69 rows × 1 columns

In the resulting DataFrame above, method and year are now part of the index and only number remains as a column. Now let's run a for loop in the same way as before.

fig = go.Figure()
# Each index entry is a (method, year) tuple, so every trace below holds a single point.
for i in range(len(planets_year_method_2.index)):
    fig.add_trace(
        go.Scatter(
            x = pd.Series(planets_year_method_2.index[i][1]),
            y = pd.Series(planets_year_method_2['number'].iloc[i]),
            name = planets_year_method_2.index[i][0], mode = 'lines+markers'))

fig.show()

Now it doesn't even draw the lines (…). So my guess is that, to use a for loop with plotly, you either have to avoid looping over a column that contains repeated values or not draw the legend at all. Of course, I'm a student who is only just taking my first steps, with far too much still to learn, but that is my tentative view for now.


🏆 Still, I'm happy that I at least learned how to run a for loop with plotly!! If I find a solution to this later, I'll gladly write another post about it.
