🧩 In the last post we went through the concept of PCA in detail, and especially its characteristics. I tried to focus on the points I was curious about while studying and on the things I could not work out even after searching, though I am not sure how well that came across. In this post, let's take a brief look at how to implement a PCA analysis, using the data mining project I worked on last semester.


1. PCA Code Implementation

🧩 The data used for the project is the Cardiovascular Disease dataset from Kaggle. I have linked it, so if you are curious it may help to look at it while reading this post. The goal of the project was to understand the various factors that influence the onset of the disease, based on the presence or absence of cardiovascular disease, and ultimately to determine whether the disease is present. To select the features that best represent the presence of the disease, I experimented with several association analysis methods, and one of them is PCA, which we look at today.

🧩 For the PCA analysis I used the PCA class from the sklearn.decomposition module, and for the visualization I referred to the official plotly website.


🚩 Checking the Data


  • The data covers 70000 surveyed people and consists of 12 attributes plus an attribute named cardio that indicates the presence of cardiovascular disease. A cardio value of 0 means no disease and a value of 1 means the disease is present.

🚩 PCA Implementation and Visualization_Dimension 3

🧩 Let's start by looking at the code 🙄.

# Imports: the PCA class from sklearn.decomposition, plotly express for the plots
from sklearn.decomposition import PCA
import plotly.express as px

# Set up the target and feature data
# (cardio is assumed to be the DataFrame already loaded from the dataset)
# Objects with cardio == 0 are converted to 'N'
# Objects with cardio == 1 are converted to 'Y'

cardio_target = cardio['cardio'].map({0: 'N', 1: 'Y'})
cardio_feat = cardio.drop('cardio', axis=1)

# pca.fit_transform() : centers cardio_feat and projects it onto the principal components
#                       (PCA itself does not rescale the features)
# pca.explained_variance_ratio_ : share of the variance explained by each component

pca = PCA()
components = pca.fit_transform(cardio_feat)

# Visualize the PCA result

labels = {
    str(i): f"PC {i+1} ({var:.1f}%)"
    for i, var in enumerate(pca.explained_variance_ratio_ * 100)
}

fig = px.scatter_matrix(
    components,
    labels=labels,
    dimensions=range(3),
    color=cardio_target, opacity=0.5
)

fig.update_traces(diagonal_visible=False)
fig.show()
  • As mentioned in the previous post, how much of the original data's variance is explained depends on the number of Principal Components kept, so there is a variable that controls this dimension. In the code above, PCA() computes every component and dimensions=range(3) picks the first three of them for the plot. Visualizing the PCA result this way gives the figure below.


  • In the result, the cardio attribute set as the target is shown across each of the three dimensions, and PC1 and PC2 explain most of the variance 🙃. As anyone who has studied PCA will know, this is not a case where the points separate cleanly by target label. If the target label were clearly determined by the attributes, the points would form distinct groups rather than overlapping as they do here, and in such examples PC1 often explains more than 90% of the variance (the explained variance ratio itself is computed without looking at the labels). Since this comes down to differences between datasets, I decided it was not something to worry much about and carried on with the project.

  • Next, let's look at the case where the dimension is 4.
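A note on the Dimension 3 code before moving on: as far as I know, sklearn's PCA only mean-centers its input and does not scale it, so a column with a large numeric range can dominate PC 1. The sketch below illustrates this on made-up data. The frame `demo_feat` and its column values are hypothetical stand-ins, not the real dataset, and `StandardScaler` is my addition rather than something the project code used.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for cardio_feat: columns on very different scales.
rng = np.random.default_rng(0)
demo_feat = pd.DataFrame({
    "age_days": rng.normal(19000, 2500, 500),               # very wide range
    "ap_hi": rng.normal(125, 15, 500),                      # moderate range
    "cholesterol": rng.integers(1, 4, 500).astype(float),   # values 1..3
})

# PCA alone: only mean-centering, so the widest column dominates PC 1.
raw_ratio = PCA().fit(demo_feat).explained_variance_ratio_

# StandardScaler + PCA: every column is rescaled to unit variance first.
scaled_pca = make_pipeline(StandardScaler(), PCA()).fit(demo_feat)
scaled_ratio = scaled_pca.named_steps["pca"].explained_variance_ratio_

print("unscaled:", raw_ratio.round(3))
print("scaled:  ", scaled_ratio.round(3))
```

Whether to scale is a judgment call. Without scaling, a very high explained variance for the first few components may say more about units than about structure, so it seems worth checking both versions.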


🚩 PCA Implementation and Visualization_Dimension 4

# Define the number of principal components to use

n_components = 4


# Fit the PCA class from sklearn.decomposition with the chosen number of principal components
# pca.fit_transform() : centers the features and projects them onto the principal components
# pca.explained_variance_ratio_ : share of the variance explained by each component

pca = PCA(n_components=n_components)
components = pca.fit_transform(cardio_feat)  # features only; the cardio target is left out


# Visualize the PCA result

total_var = pca.explained_variance_ratio_.sum() * 100

labels = {str(i): f"PC {i+1}" for i in range(n_components)}

fig = px.scatter_matrix(
    components,
    color=cardio_target,
    opacity=0.5,
    dimensions=range(n_components),
    labels=labels,
    title=f'Total Explained Variance: {total_var:.2f}%',
)

fig.update_traces(diagonal_visible=False)
fig.show()
  • ์œ„์˜ ์ฝ”๋“œ๋Š” ์•ž์„  ์ฝ”๋“œ์™€ ๋‹ฌ๋ฆฌ n_components ๋ณ€์ˆ˜๋ฅผ ํ†ตํ•ด princiapl component์˜ ๊ฐœ์ˆ˜๋ฅผ ๋ฏธ๋ฆฌ ์„ค์ •ํ•ด๋‘๊ณ  PCA๋ฅผ ์ง„ํ–‰ํ•œ๋‹ค. ๊ทธ ๊ฒฐ๊ณผ๋Š” ์•„๋ž˜ ๊ทธ๋ฆผ๊ณผ ๊ฐ™๋‹ค.


  • Using components up to PC4 explained 99.58% of the variance of the original data, so I decided to set the dimension to 4 for the PCA analysis. The explained share naturally grows as the number of PCs increases, and I visualized this to check it.
# Visualize the cumulative sum of explained variance by number of dimensions
import numpy as np

pca = PCA()
pca.fit(cardio_feat)  # features only; the cardio target is left out
exp_var_cumul = np.cumsum(pca.explained_variance_ratio_)

fig = px.area(
    x=range(1, exp_var_cumul.shape[0] + 1),
    y=exp_var_cumul,
    labels={"x": "# Components", "y": "Explained Variance"}
)
fig.show()


  • Based on this result, I set the dimension to 4 and then look at the associations between the attributes.
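The choice of 4 above was read off the plot. As a rough sketch, the same decision can be made in code from the cumulative explained variance. The data `X_demo` and the 0.95 `threshold` are assumptions for illustration, not values from the project. Passing a float between 0 and 1 as `n_components` is an existing scikit-learn option that performs this selection internally.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical data: 6 independent features with decreasing spread.
X_demo = rng.normal(size=(300, 6)) * np.array([10.0, 5.0, 1.0, 0.5, 0.2, 0.1])

threshold = 0.95  # assumed target coverage; choose what the project needs

cumulative = np.cumsum(PCA().fit(X_demo).explained_variance_ratio_)
# Index of the first cumulative value that reaches the threshold, as a 1-based count.
n_needed = int(np.searchsorted(cumulative, threshold) + 1)

# scikit-learn makes the same selection when n_components is a float in (0, 1).
pca_auto = PCA(n_components=threshold, svd_solver="full").fit(X_demo)

print(n_needed, pca_auto.n_components_)
```

The manual version is handy when you also want to plot or report the curve, and the float shortcut is shorter when only the reduced data is needed.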

🚩 Identifying Attribute Associations

🧩 The PCA concepts explained so far do not make it obvious how associations are analyzed in practice. Scikit-learn and plotly provide several features for examining the direction of each attribute, so let's look at the code below.

# Find the direction of each original attribute vector used to explain the variance

X = cardio[cardio_feat.columns]

pca = PCA(n_components=4)
components = pca.fit_transform(X)

# One row per attribute, one column per principal component
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)

fig = px.scatter(components, x=0, y=1, color=cardio_target, opacity=0.5)

# Draw each attribute as a line from the origin in the PC1-PC2 plane, with a label at its tip
for i, feature in enumerate(X.columns):
    fig.add_shape(
        type='line',
        x0=0, y0=0,
        x1=loadings[i, 0],
        y1=loadings[i, 1],
        line=dict(color="springgreen", width=3.5)
    )

    fig.add_annotation(
        x=loadings[i, 0],
        y=loadings[i, 1],
        ax=0, ay=0,
        xanchor="center",
        yanchor="bottom",
        text=feature,
        font={'color': 'white'},
        bgcolor='grey'
    )

fig.show()
  • The loadings variable holds the direction of each attribute vector, and the for loop draws one vector per attribute on top of the scatter of all points. The result is shown below.


  • The vectors overlap too much to read the exact direction of each attribute from this result alone, but plotly figures can be zoomed in. Zoomed in, it looks like this.


🧩 From the PCA analysis, the attributes most strongly associated with each other were determined to be the following.

ap_hi, ap_lo, BMI, gluc, cholesterol
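Reading overlapping arrows by eye is fiddly, so here is a minimal sketch of one way to compare loading directions numerically: the cosine similarity between the loading vectors in the PC1-PC2 plane. The data and the column names `a`, `b`, `c` are made up for illustration, and standardizing first is my assumption, not a step from the project. Similar directions only hint at association to the extent that the two plotted components capture those attributes.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
base = rng.normal(size=400)
# Hypothetical features: "a" and "b" move together, "c" is unrelated.
demo = pd.DataFrame({
    "a": base + rng.normal(scale=0.3, size=400),
    "b": base + rng.normal(scale=0.3, size=400),
    "c": rng.normal(size=400),
})

pca = PCA(n_components=2).fit(StandardScaler().fit_transform(demo))
# Same loadings formula as in the post: shape (n_features, 2).
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)

# Cosine similarity between loading vectors: values near 1 mean the arrows
# point the same way, values near 0 mean they are roughly perpendicular.
unit = loadings / np.linalg.norm(loadings, axis=1, keepdims=True)
cosine = pd.DataFrame(unit @ unit.T, index=demo.columns, columns=demo.columns)

print(cosine.round(2))
```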

🚩 PCA DataFrame

🧩 Finally, with the attributes [ap_hi, ap_lo, BMI, gluc, cholesterol] selected above, let's build a PCA DataFrame 🙂.

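# Build and display the PCA DataFrame
import pandas as pd

pca = PCA(n_components=n_components)
# Fit on the features only; the cardio target is attached as its own column below.
# To restrict PCA to the selected attributes, pass only those columns here instead.
components = pca.fit_transform(cardio_feat)
PCA_df = pd.DataFrame({'PC1': components[:, 0],
                       'PC2': components[:, 1],
                       'PC3': components[:, 2],
                       'PC4': components[:, 3],
                       'cardio': cardio['cardio']})
PCA_df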


  • ์œ„์˜ ๋ฐ์ดํ„ฐํ”„๋ ˆ์ž„์„ ๋ณด๋ฉด ์ฒ˜์Œ์— 70000 object๊ฐ€ 64500 object๋กœ ๋ณ€ํ•œ ๊ฒƒ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋Š”๋ฐ, ์ด๋Š” outlier๋“ค์„ ์ œ๊ฑฐํ•˜๋Š” ๊ณผ์ •์—์„œ ์ˆ˜๊ฐ€ ์ค„์–ด๋“  ๊ฒƒ์ด๋‹ค.

PCA Summary

🧩 Putting the previous post and this one together, PCA can be summarized as follows.

  • The new attributes are then defined as the new axes, the principal components.
  • The existing data points get new coordinates on the new axes PC1, PC2, and so on.

  • Once the new axes are made, they are numbered in order of how well they explain the variance of the original data.
  • Often, most of the variance can be explained by PC1 and PC2 alone.
  • The variance only needs to be explained to the desired degree, and only the Principal Components up to that point are kept, so the selection step is where Dimensionality Reduction happens.

  • Attributes that are associated with each other can be identified from the direction of their attribute vectors.
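The points above can be checked numerically. The sketch below uses randomly generated data (`X_demo` is hypothetical) and verifies three things with scikit-learn's fitted attributes: the new coordinates are the centered data projected onto the principal axes, the axes are orthonormal, and the variance along them is sorted from largest to smallest.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical correlated data: random features mixed by a random matrix.
X_demo = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))

pca = PCA().fit(X_demo)
scores = pca.transform(X_demo)

# New coordinates = centered data projected onto the principal axes.
manual = (X_demo - pca.mean_) @ pca.components_.T
same_projection = np.allclose(scores, manual)

# The principal axes are orthonormal (rows of components_).
orthonormal = np.allclose(pca.components_ @ pca.components_.T, np.eye(4))

# Variance along the axes matches explained_variance_ and is sorted descending.
variance_matches = np.allclose(scores.var(axis=0, ddof=1), pca.explained_variance_)
sorted_descending = bool(np.all(np.diff(pca.explained_variance_) <= 0))

print(same_projection, orthonormal, variance_matches, sorted_descending)
```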

🧩 That covers PCA analysis, which could be called the most important concept in Preprocessing. The concept itself is not that easy, but many libraries support this analysis, so there is no need to be intimidated by it 👍.

🧩 I am sharing the site that helped me during the project and the plotly posts on my blog, so feel free to refer to them if you want to know more 🙃🙂.

๐Ÿ“ PCA Visualization in Python

๐Ÿ“ Plotly ์‹œ๊ฐํ™”

๐Ÿ“ Plotly for๋ฌธ

๐Ÿ“ PCA ๊ฐœ๋…

🧩 In the next post, let's look at Data Transformation, the last concept in Preprocessing 🏃‍♂️🏃‍♂️🏃‍♂️.


💡 This post is based on the lecture [Data Mining for Bioinformatics] by Professor Ko Yun-hee (고윤희) of the Division of Biomedical Engineering, Hankuk University of Foreign Studies.
