💊 In the last two posts we looked at the drug development process. It consists of finding a target and then modifying compounds until a new drug is found. Until now, drug development has meant comparing each target and compound one by one; to cut down on this inefficient process, we will use computers. The tool we use for that is QSAR.

💊 Going forward, we will take an arbitrary compound structure, work out its activity, and use that to find the target molecule. To build a model that predicts a compound's activity, we will use QSAR. So let's take a look at QSAR 🏃‍♂️🏃‍♂️.


1. What is QSAR?


  • Activity is closely tied to a molecule's toxicity, and drug toxicity is an important consideration in drug manufacturing.
  • A QSAR model can reveal which substructures of a molecular structure give rise to toxicity.
  • If a substructure known to be toxic is found in another compound, we can predict that this molecule will be toxic as well.
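
The substructure idea can be sketched roughly as follows. This is only a toy illustration: it assumes each molecule has already been broken down into named fragments, and the fragment names, toxicity labels, and the 0.8 threshold are all made up. A real workflow would do proper substructure matching with a cheminformatics toolkit instead of comparing string labels.

```python
from collections import defaultdict

# Hypothetical training data: (set of fragment names, is_toxic).
training = [
    ({"nitro", "benzene"}, True),
    ({"nitro", "amide"}, True),
    ({"benzene", "hydroxyl"}, False),
    ({"amide", "hydroxyl"}, False),
    ({"nitro", "hydroxyl"}, True),
]


def toxic_rate_per_fragment(data):
    """Return {fragment: fraction of molecules containing it that are toxic}."""
    seen = defaultdict(int)
    toxic = defaultdict(int)
    for fragments, is_toxic in data:
        for frag in fragments:
            seen[frag] += 1
            if is_toxic:
                toxic[frag] += 1
    return {frag: toxic[frag] / seen[frag] for frag in seen}


def find_alerts(rates, threshold=0.8):
    """Fragments whose toxic rate meets the (assumed) threshold."""
    return {frag for frag, rate in rates.items() if rate >= threshold}


def predict_toxic(fragments, alerts):
    """Flag a molecule as likely toxic if it contains any alert fragment."""
    return bool(set(fragments) & alerts)


rates = toxic_rate_per_fragment(training)
alerts = find_alerts(rates)
print(alerts)
print(predict_toxic({"nitro", "ester"}, alerts))
```

Here an unseen compound is flagged simply because it shares a fragment with known toxic molecules, which is the prediction described in the last bullet.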

3. QSAR-guided drug discovery

💊 That covers the basic concepts of QSAR, if only briefly. Now let's look at how QSAR is actually used to discover a drug.

💊 The broad steps are as shown above. Virtual Screening, step 5, may be new to some readers, so let's go through the details 🙃🙃.


🚩 3.1. QSAR-based virtual screening


  • 1. Chemical Library : $10^6$ ~ $10^9$ molecules
  • 2. virtual screening
  • 3. Potential Hits : the candidate compounds obtained from virtual screening go on to Experimental Validation.

💊 In other words, virtual screening is the process of finding candidate compounds for a new drug. Specific filters are applied at each stage, and compared with the original dataset the number of candidate compounds drops noticeably.
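
As a rough sketch of this funnel, the snippet below runs a tiny library through a sequence of filters and records how many compounds survive each stage. The `Compound` fields, the thresholds, and the precomputed `qsar_score` are assumptions for illustration, not values from the lecture.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float  # hypothetical descriptor
    logp: float        # hypothetical descriptor
    qsar_score: float  # hypothetical predicted activity in [0, 1]


# Each filter is (label, predicate); thresholds are illustrative assumptions.
FILTERS = [
    ("mol_weight <= 500", lambda c: c.mol_weight <= 500),
    ("logp <= 5", lambda c: c.logp <= 5),
    ("qsar_score >= 0.7", lambda c: c.qsar_score >= 0.7),
]


def virtual_screen(library, filters=FILTERS):
    """Apply filters in order; return survivors and the count after each stage."""
    survivors = list(library)
    counts = [("library", len(survivors))]
    for label, keep in filters:
        survivors = [c for c in survivors if keep(c)]
        counts.append((label, len(survivors)))
    return survivors, counts


library = [
    Compound("cpd-1", 320.0, 2.1, 0.91),
    Compound("cpd-2", 610.0, 3.0, 0.95),
    Compound("cpd-3", 450.0, 6.2, 0.80),
    Compound("cpd-4", 280.0, 1.5, 0.40),
    Compound("cpd-5", 410.0, 4.4, 0.75),
]

hits, counts = virtual_screen(library)
for label, n in counts:
    print(f"{label}: {n}")
print([c.name for c in hits])
```

Putting the cheap descriptor filters first and the model-based score last is a design choice in this sketch: the more expensive step only runs on what is left.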


🚩 3.2. Target prediction and optimization

  • The process of optimizing a drug so that it binds only to the desired target : Adaptive Drug Design

  • Target Prediction : the process of identifying which targets an arbitrary drug can act on
    • Typically, a few hundred targets are taken from the ChEMBL database and one QSAR model is built per target.
    • Using the output of each QSAR model (probability/score), identify the targets the drug is likely to bind, and check whether the target we want is among them.
    • If the output contains several targets, the drug's structure has to be changed so that it acts only on the desired target. This process is called Adaptive Drug Design.
  • Adaptive Drug Design : the process of changing a drug's structure so that it binds only to the desired target
    • Done by adding new substructures to the drug's structure or removing existing ones
    • Run Target Prediction again on the modified drug and check whether it now acts only on the desired target

💊 To sum up, the process goes roughly as follows. Pick one compound from the candidates obtained through virtual screening, treat it as the drug, run Target Prediction on it, and check whether it acts on the desired target. If it also acts on targets other than the desired one, use Adaptive Drug Design to change its structure so that it acts only on that specific target. Then run Target Prediction again on the modified drug to finish the check.
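
The predict, modify, re-predict loop might look something like the sketch below. Everything in it is hypothetical: the per-target "models" are just fragment weights standing in for trained QSAR models, the target and fragment names are invented, and the one-fragment-at-a-time search is a deliberately naive stand-in for real structure optimization.

```python
# Hypothetical per-target "QSAR models": fragment weights per target.
TARGET_MODELS = {
    "kinase_A": {"core": 0.5, "amide": 0.4, "phenyl": 0.1},
    "kinase_B": {"core": 0.3, "phenyl": 0.5},
    "gpcr_C": {"core": 0.1, "hydroxyl": 0.3},
}


def predict_targets(fragments, models=TARGET_MODELS, threshold=0.7):
    """Score the drug against every target model; return (hit targets, all scores)."""
    scores = {
        target: sum(w for frag, w in weights.items() if frag in fragments)
        for target, weights in models.items()
    }
    hits = {target for target, score in scores.items() if score >= threshold}
    return hits, scores


def candidate_edits(fragments, addable):
    """Yield structures that differ by one removed or one added fragment."""
    for frag in sorted(fragments):
        yield fragments - {frag}
    for frag in sorted(addable - fragments):
        yield fragments | {frag}


def adapt_drug(fragments, desired, addable, max_rounds=3):
    """Repeat predict -> modify until only the desired target is predicted."""
    current = frozenset(fragments)
    for _round in range(max_rounds):
        hits, _scores = predict_targets(current)
        if hits == {desired}:
            return current
        for edited in candidate_edits(current, addable):
            if predict_targets(edited)[0] == {desired}:
                current = edited
                break
        else:
            return None  # no single edit is selective; give up in this sketch
    return current if predict_targets(current)[0] == {desired} else None


start = frozenset({"core", "amide", "phenyl"})
print(predict_targets(start)[0])
result = adapt_drug(start, "kinase_A", addable=frozenset({"hydroxyl"}))
print(result)
```

The starting structure hits two targets, so the loop tries single-fragment edits and keeps the first one for which the re-run prediction returns only the desired target.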


4. Components

  • 1. Compound data
  • 2. Activity data
    • Y = Activity.
    • The observed activity of the molecule associated with each structure.
    • e.g. the target it binds to, toxicity
    • Any kind of experimental observation, not limited to biological activities
  • 3. A statistical modeling method for identifying the key relationships between molecular descriptors and activities.
    • Linear regression, SVM, Random forest, Deep learning
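
To make the three components concrete, here is a minimal sketch using the simplest method in the list, linear regression, written out by hand. The descriptor values `X` and activities `Y` are invented numbers, and a single descriptor is an assumption to keep the example short. Real QSAR models use many descriptors per compound.

```python
# Hypothetical data: one descriptor value per compound and its measured activity.
X = [1.0, 2.0, 3.0, 4.0, 5.0]    # component 1: compound data as a descriptor (assumed)
Y = [2.1, 3.9, 6.2, 7.8, 10.1]   # component 2: observed activity (assumed)


def fit_line(x, y):
    """Component 3: ordinary least squares for y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x_new):
    """Predict activity for a new descriptor value."""
    slope, intercept = model
    return slope * x_new + intercept


model = fit_line(X, Y)
print(model)
print(predict(model, 6.0))
```

Swapping `fit_line` for an SVM, random forest, or neural network changes the modeling method but not the overall shape: descriptors in, predicted activity out.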

5. PREPARATION

  • ‘Garbage-in, garbage-out’ principle : good results can only come from good data.

  • Data and/or Statistical method : choosing the right model for the type and amount of data is important.

  • Checking observations' consistency : for a single study target, the data should ideally come from a single experimental source. In practice, though, this is hard to achieve.

  • Evenly spread data points : the model makes no special allowance for outliers, so evenly distributed data points are preferable.

  • Is ‘not reported’ indeed negative? : molecules that were never tested may have been labeled inactive, so an inactive label in the data is not a reason to rule a molecule out.
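
Two of these checks, consistency across sources and keeping "untested" apart from "inactive", can be sketched as below. The records, the source names, and both cutoffs (`max_spread`, `active_cutoff`) are assumptions chosen for illustration only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (compound, source, measured activity or None if not tested).
records = [
    ("cpd-1", "lab-A", 6.1),
    ("cpd-1", "lab-B", 6.3),
    ("cpd-2", "lab-A", 5.0),
    ("cpd-2", "lab-B", 7.5),
    ("cpd-3", "lab-A", None),
    ("cpd-4", "lab-B", 4.2),
]


def curate(records, max_spread=1.0, active_cutoff=6.0):
    """Return {compound: 'active' | 'inactive' | 'inconsistent' | 'untested'}."""
    values = defaultdict(list)
    for compound, _source, value in records:
        measured = values[compound]  # creates an entry even if nothing was measured
        if value is not None:
            measured.append(value)

    labels = {}
    for compound, measured in values.items():
        if not measured:
            labels[compound] = "untested"      # not the same thing as inactive
        elif max(measured) - min(measured) > max_spread:
            labels[compound] = "inconsistent"  # sources disagree too much
        elif mean(measured) >= active_cutoff:
            labels[compound] = "active"
        else:
            labels[compound] = "inactive"
    return labels


labels = curate(records)
print(labels)
```

Keeping "untested" and "inconsistent" as separate labels lets you decide later whether to drop those compounds or re-measure them, instead of silently training on them as negatives.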


💊 In this post we looked at how QSAR is used to find the drug we want, and at the considerations involved. Most of it is very useful groundwork, but virtual screening, Target Prediction, and Adaptive Drug Design in particular happen one after another, so it is worth understanding them well before moving on 🙄🙄.

💊 In any case, a QSAR model builds a function that predicts the activity of a drug obtained this way, so we need a way to judge how accurate its predictions are. In the next post, let's look at the validation used for this 🏃‍♂️🏃‍♂️.


💡 This post is based on the lecture on the QSAR model development process by Professor Dongsup Kim of KAIST, uploaded to LAIDD.
