🧩 Up through the last post, we covered Numerosity Reduction, which reduces the number of objects in the data. Knowing that there are parametric and non-parametric methods is enough to wrap up that concept. Starting with this post, let's look at Dimensionality Reduction, which reduces the number of attributes in the data.


1. The Concept of Dimensionality Reduction

🧩 As with the topics covered so far, when data is overly complex it becomes hard to extract the information that is actually needed. This phenomenon is called the Curse of Dimensionality. It means that as the number of dimensions grows, the data fills up with empty space, such as missing or irrelevant values, which makes the desired information harder to obtain. In addition, when there are too many dimensions, the influence of the truly important attributes is diluted, so their importance can end up hidden. For these reasons the number of dimensions has to be reduced, and the process of reducing it is called Dimensionality Reduction.
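The "empty space" effect can be illustrated with a small experiment. The following is only a minimal sketch, not part of the lecture material: it scatters a fixed number of uniformly random points into a grid and measures what fraction of the grid cells contain at least one point. The function name `occupied_fraction` and its parameters (`n_points`, `n_dims`, `bins_per_dim`, `seed`) are hypothetical names chosen for this illustration.

```python
import random


def occupied_fraction(n_points, n_dims, bins_per_dim=2, seed=0):
    """Fraction of grid cells hit by at least one of n_points uniform samples.

    The unit hypercube is split into bins_per_dim ** n_dims equal cells.
    (Hypothetical helper, written only for this illustration.)
    """
    rng = random.Random(seed)
    cells = set()
    for _ in range(n_points):
        # rng.random() is in [0, 1), so each index is in 0 .. bins_per_dim - 1.
        cell = tuple(int(rng.random() * bins_per_dim) for _ in range(n_dims))
        cells.add(cell)
    return len(cells) / bins_per_dim ** n_dims


if __name__ == "__main__":
    # Same number of points, increasing number of dimensions.
    for d in (1, 2, 5, 10, 20):
        print(d, occupied_fraction(1000, d))
```

With the sample size held fixed, the number of cells grows exponentially with the number of dimensions, so the occupied fraction can only shrink. That is one way to picture why high-dimensional data looks mostly empty.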

⭐ Dimensionality Reduction

  • Reducing the number of random variables to obtain the variables that really matter

⭐ Advantages of Dimensionality Reduction

  • Mitigates the curse of dimensionality
  • Removes irrelevant attributes and noise
  • Reduces the time and effort required for data mining
  • Makes visualization easier

2. Dimensionality Reduction Methodology

🧩 Above, we looked at the concept and advantages of Dimensionality Reduction. The name is long and may look a little intimidating, but the reason for doing it is extremely simple. The process needed to achieve that simple goal, however, is not so simple: you have to look through most of the data and then focus only on the parts that are reasonably relevant. Now let's see what these more involved methods are 🙃🙃.

  • 1. Feature Selection
    • attribute๋กœ๋ถ€ํ„ฐ ๋ฐ์ดํ„ฐ๋ฅผ ์ž˜ ์„ค๋ช…ํ•ด์ฃผ๋Š” subset์„ ์ฐพ๋Š” ๊ฒƒ
    • ํ•„์š”ํ•œ attribute๋งŒ ๊ณจ๋ผ๋‚ด์„œ ์ƒˆ๋กœ์šด attribute์˜ ์ง‘ํ•ฉ์„ ๋งŒ๋“œ๋Š” ๊ฒƒ
    • ex) Feature Subset Selection / Feature Creation
  • 2. Feature Extraction
    • high-dimensional space data๋ฅผ fewer dimension์œผ๋กœ ๋ณ€ํ™˜
    • ์—ฌ๋Ÿฌ attribute๋ฅผ ๊ฐ€์ง€๊ณ  ์ƒˆ๋กœ์šด attribute๋ฅผ ์ƒ์„ฑํ•จ
    • ex) Principal Componenet Analysis (PCA)

🧩 That covers the core of the Dimensionality Reduction concept in brief. It can be understood as a data preprocessing step that makes data mining go more smoothly. Starting with the next post we will look at the methods of Dimensionality Reduction, beginning with Feature Selection 🏃‍♂️🏃‍♂️!!


💡 This post is based on the lecture [Data Mining for Bioinformatics] by Professor 고윤희 of the Division of Biomedical Engineering at Hankuk University of Foreign Studies.
