🌵 Every so often during data preprocessing, sort_values( ) or sort_index( ) does not sort the data the way you want. This mostly comes up after using apply( ) to run an arbitrary function over the data, typically one that turns the values into strings, so in this post let's look at a function that fixes it.

The pd.Categorical( ) function

🌵 If you apply the dt accessor to a column converted with pd.to_datetime( ), which I covered in an earlier post, you can create all sorts of time-series columns: year, month, day, day of the week, and so on. The day of the week comes out as a number from 0 (Monday) to 6 (Sunday), though, so we often convert it to a more familiar form while processing the data. First, let's take a look at the DataFrame.

birth_date['weekday'] = birth_date['date'].dt.weekday
birth_date.head()
>>
        year	month	day	births	date	        weekday
0	1969	1	1	8486	1969-01-01	2
1	1969	1	2	9002	1969-01-02	3
2	1969	1	3	9542	1969-01-03	4
3	1969	1	4	8960	1969-01-04	5
4	1969	1	5	8390	1969-01-05	6

I pulled this data from one of the pandas practice posts on this blog.

As the DataFrame above shows, the weekday column, which holds the day of the week, is stored as numbers. So let's first convert weekday into a more familiar form.

def weekday_func(row):
    if row['weekday'] == 0:
        row['weekday'] = 'Mon'
    elif row['weekday'] == 1:
        row['weekday'] = 'Tue'
    elif row['weekday'] == 2:
        row['weekday'] = 'Wed'
    elif row['weekday'] == 3:
        row['weekday'] = 'Thu'
    elif row['weekday'] == 4:
        row['weekday'] = 'Fri' 
    elif row['weekday'] == 5:
        row['weekday'] = 'Sat'
    elif row['weekday'] == 6:
        row['weekday'] = 'Sun'
    
    return row

birth_date = birth_date.apply(weekday_func, axis = 1)
birth_date.head()
>>
        year	month	day	births	date	        weekday
0	1969	1	1	8486	1969-01-01	Wed
1	1969	1	2	9002	1969-01-02	Thu
2	1969	1	3	9542	1969-01-03	Fri
3	1969	1	4	8960	1969-01-04	Sat
4	1969	1	5	8390	1969-01-05	Sun
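The row-wise apply above works, but it calls a Python function once per row. As a minimal sketch of the same conversion, assuming a small hypothetical frame named `df` in place of `birth_date`, the numbers can be mapped through a dictionary with `Series.map`:

```python
import pandas as pd

# Hypothetical stand-in for birth_date: three dates from January 1969.
df = pd.DataFrame({'date': pd.to_datetime(['1969-01-01', '1969-01-02', '1969-01-06'])})
df['weekday'] = df['date'].dt.weekday  # 0 = Monday ... 6 = Sunday

# Lookup table from weekday number to short name.
weekday_names = {0: 'Mon', 1: 'Tue', 2: 'Wed', 3: 'Thu', 4: 'Fri', 5: 'Sat', 6: 'Sun'}

# map() replaces every value of the column in a single call.
df['weekday'] = df['weekday'].map(weekday_names)
print(df)
```

The result is the same kind of string column as before, so it runs into the same sorting problem.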

Now let's sort this by weekday.

birth_date = birth_date.sort_values('weekday', ascending = False)
birth_date
>>
        year	month	day	births	date	        weekday
0	1969	1	1	8486	1969-01-01	Wed
6274	1985	7	3	11698	1985-07-03	Wed
6281	1985	7	10	11640	1985-07-10	Wed
2220	1974	10	16	9211	1974-10-16	Wed
2213	1974	10	9	9505	1974-10-09	Wed
...	...	...	...	...	...	        ...
1606	1973	3	9	9100	1973-03-09	Fri
6636	1986	6	27	11286	1986-06-27	Fri
5542	1983	7	22	11043	1983-07-22	Fri
1613	1973	3	16	8899	1973-03-16	Fri
4770	1981	7	3	9717	1981-07-03	Fri

7305 rows × 6 columns

I wanted the familiar Monday-to-Sunday order, but, hmm, it failed 😅. The weekday column now holds plain strings, so it is sorted alphabetically (here in descending order, from Wed down to Fri).
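To see why the sort went wrong, here is a small sketch using a hypothetical Series named `days` that holds only the seven labels. Plain strings are compared alphabetically, so neither direction gives calendar order:

```python
import pandas as pd

# Hypothetical Series holding just the seven weekday labels.
days = pd.Series(['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun'])

# Strings are compared character by character, i.e. alphabetically.
ascending = days.sort_values().tolist()
descending = days.sort_values(ascending=False).tolist()

print(ascending)   # ['Fri', 'Mon', 'Sat', 'Sun', 'Thu', 'Tue', 'Wed']
print(descending)  # ['Wed', 'Tue', 'Thu', 'Sun', 'Sat', 'Mon', 'Fri']
```

The descending list starts with Wed and ends with Fri, which matches the head and tail of the output above.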

This time, let's use the Categorical function.

birth_date['weekday'] = pd.Categorical(birth_date['weekday'], categories=['Mon','Tue','Wed','Thu','Fri','Sat','Sun'], ordered = True)
birth_date = birth_date.sort_values(['weekday', 'year'], ascending = True)
birth_date
>>
        year	month	day	births	date	        weekday
198	1969	7	7	10634	1969-07-07	Mon
219	1969	7	28	10548	1969-07-28	Mon
189	1969	6	30	10588	1969-06-30	Mon
241	1969	8	18	10650	1969-08-18	Mon
234	1969	8	11	10706	1969-08-11	Mon
...	...	...	...	...	...	        ...
7226	1988	1	31	8515	1988-01-31	Sun
7369	1988	6	19	9038	1988-06-19	Sun
7282	1988	3	27	8534	1988-03-27	Sun
7311	1988	4	24	8485	1988-04-24	Sun
7275	1988	3	20	8646	1988-03-20	Sun

7305 rows × 6 columns

🌵 While I was at it I also wanted the years in order, so I passed weekday and year to sort_values together, and as you can see it worked 👍. Inside the function you specify the Series to sort and its categories in the desired order, and set ordered to True. Let's go over the syntax and wrap up.
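The same steps can be tried on a tiny frame. This is only a sketch: `df` and its four rows are made up for illustration and are not the birth data used above.

```python
import pandas as pd

# Made-up rows, deliberately out of order.
df = pd.DataFrame({
    'year': [1970, 1969, 1969, 1970],
    'weekday': ['Sun', 'Wed', 'Mon', 'Mon'],
})

# The list order defines the sort order of the labels.
order = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
df['weekday'] = pd.Categorical(df['weekday'], categories=order, ordered=True)

# Sort by weekday first, then by year within each weekday.
result = df.sort_values(['weekday', 'year'])
print(result)
```

Both Mon rows come first (1969 before 1970), followed by Wed and then Sun.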

🔑 Function syntax


pd.Categorical( series, categories = [category list], ordered = True )

ex1 ) Series = pd.Categorical(Series, categories = [⚪, ⚪, ⚪], ordered = True)

ex2 ) birth_date['weekday'] = pd.Categorical(birth_date['weekday'], categories=['Mon','Tue','Wed','Thu','Fri','Sat','Sun'], ordered = True)
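As a rough illustration of what ordered = True changes, the sketch below builds a small Categorical from three made-up labels and inspects it. Each value is stored as an integer code giving its position in the category list, and comparisons follow that position rather than the alphabet.

```python
import pandas as pd

order = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
cat = pd.Categorical(['Wed', 'Mon', 'Sun'], categories=order, ordered=True)

# Integer position of each value within `order`.
codes = list(cat.codes)

# Element-wise comparison uses the category order: only 'Mon' comes before 'Wed'.
before_wed = list(cat < 'Wed')

print(codes)                 # [2, 0, 6]
print(before_wed)            # [False, True, False]
print(cat.min(), cat.max())  # Mon Sun
```

Sorting a categorical column works on these codes, which is why sort_values returns Mon through Sun.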

🌵 That was a quick look at the Categorical function. Having to type out the category list in order by hand is a bit of a drawback for large-scale data analysis, but I think it gives much cleaner results when combined with the datetime functions. When sorting doesn't give you the result you want, why not give this function a try 😊!!
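One possible way around typing the list by hand, offered only as a sketch: if the numeric key is kept in its own column (here a hypothetical `weekday_num`, which the example in this post overwrites), the category order can be derived from the data. The frame `df` and its dates are made up for illustration.

```python
import pandas as pd

# Made-up dates; weekday_num is a hypothetical helper column.
df = pd.DataFrame({'date': pd.to_datetime(
    ['1969-01-05', '1969-01-01', '1969-01-06', '1969-01-03'])})
df['weekday_num'] = df['date'].dt.weekday          # 0 = Monday ... 6 = Sunday
df['weekday'] = df['date'].dt.day_name().str[:3]   # 'Sun', 'Wed', 'Mon', 'Fri'

# Derive the label order from the numeric key instead of writing it out.
order = (df[['weekday_num', 'weekday']]
         .drop_duplicates()
         .sort_values('weekday_num')['weekday']
         .tolist())

df['weekday'] = pd.Categorical(df['weekday'], categories=order, ordered=True)
result = df.sort_values('weekday')
print(result)
```

Only labels that actually occur in the data end up in the category list, so this fits best when every category is known to be present.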
