DNA-sequencing multi-omics tool 'VarWalker' - an extended version of the random walk


happip_jh 2023. 3. 5. 22:48

Multi-omics keeps gaining popularity.

As an extension of my DNA-sequencing work, I have always been interested in multi-omics software,

and I came across VarWalker, which uses the well-known random walk algorithm to find cancer driver genes.

It was published in 2014 in PLoS Comput Biol.

Paper

VarWalker: Personalized Mutation Network Analysis of Putative Cancer Genes from Next-Generation Sequencing Data

PubMed: https://pubmed.ncbi.nlm.nih.gov/24516372/

What is VarWalker?

VarWalker is a method to prioritize driver genes in large-scale cancer mutation data.

VarWalker fits generalized additive models for each sample based on sample-specific mutation profiles and builds on the joint frequency of both mutation genes and their close interactors.

=> On a protein-protein interaction network, it applies the Random Walk with Restart algorithm to refine the set of genes that interact closely with the mutated genes (the interactors).

How VarWalker was validated

The method was tested on data from 183 lung cancer samples and 121 melanoma samples.

Through this study, cancer-specific mutation networks, i.e. networks of genes involved in cancer or in its pathways, could be examined.

What are the advantages of VarWalker?

What VarWalker makes possible is

prioritizing mutations that single-gene-based approaches could not link to cancer:

mutations that occur at low frequency but interact actively with other mutations.

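Difficulties in studying cancer and genes
- Cancer is influenced by both genetic factors and environmental factors
- Biases creep in, e.g. ['hotspot' families] genes such as the olfactory genes that show a high mutation frequency even though they are unrelated to cancer

-> Here, a mutated gene (MutGene) is defined as a gene carrying at least one SNV or INDEL.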

Why VarWalker was developed

Problems

-> For various reasons, many genes are left out of biological networks / functional pathways

-> This gap can be filled with mRNA and methylation data, which strengthens the evidence for these driver genes

-> However, it is rare to have all of these data available

Solution:

VarWalker performs sample-specific filtering and implements Random Walk with Restart algorithm to search for frequently interrupted interactions between MutGenes and their interactors

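Two advantages of VarWalker:

sample-specific filter
Random Walk with Restart algorithm

=> It identifies both the mutated genes and their interactors

The important mutations here are those whose proteins are frequently linked across many samples.

=> The aim is to uncover not only genes that cause cancer but also genes that show up at low frequency yet are potentially associated with cancer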

VarWalker method:

step1. Patient-specific assessment of MutGenes

Purpose: a step that filters the list down to potential candidate genes only

Weights

-> A sample-specific PWV (probability weight vector) is built

The computed weights are used to build a weighted probability vector, which compensates for the fact that longer genes inevitably accumulate more mutations

-> The weighted resampling process is run 1,000 times

-> Patient-specific mutations are assessed sample by sample, and only those meeting the 0.05 frequency cutoff are kept
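The resampling filter in step 1 can be sketched roughly as follows. This is a minimal illustration, not VarWalker's code: the weights here are simply proportional to gene length (the paper derives them from a generalized additive model), the function name `resampling_frequency`, the gene names, and the lengths are all made up, and keeping genes whose chance frequency falls below 0.05 is an assumed reading of the cutoff.

```python
import math
import random

def resampling_frequency(gene_lengths, mutated_genes, n_iter=1000, seed=0):
    """Fraction of weighted resamplings in which each mutated gene is drawn by chance."""
    rng = random.Random(seed)
    genes = sorted(gene_lengths)
    total = sum(gene_lengths.values())
    # Probability weight vector (PWV): here simply proportional to gene length (assumption).
    pwv = {g: gene_lengths[g] / total for g in genes}
    k = len(mutated_genes)  # draw as many genes as the sample actually has mutated
    hits = dict.fromkeys(mutated_genes, 0)
    for _ in range(n_iter):
        # Weighted draw without replacement (Efraimidis-Spirakis):
        # give each gene the key log(u) / weight and keep the k largest keys.
        keys = {g: math.log(1.0 - rng.random()) / pwv[g] for g in genes}
        drawn = sorted(genes, key=keys.get, reverse=True)[:k]
        for g in drawn:
            if g in hits:
                hits[g] += 1
    return {g: n / n_iter for g, n in hits.items()}

# Toy data: 100 background genes, one very long gene and one short gene (made-up lengths).
lengths = {f"GENE{i}": 1000 for i in range(100)}
lengths.update({"LONG_GENE": 100_000, "SHORT_GENE": 600})
freq = resampling_frequency(lengths, ["LONG_GENE", "SHORT_GENE"])
kept = [g for g, f in freq.items() if f < 0.05]  # assumed direction of the 0.05 cutoff
```

In this toy setup the very long gene is drawn by chance in a large share of the resamplings, so its mutation is not surprising, while the short gene is rarely drawn and passes the filter.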

step2. Random Walk with Restart
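For reference, the Random Walk with Restart update is usually written in the form below, using the symbols r, p0, pt and p(t+1) from this step. The matrix W is not defined in this post; it is assumed here to be the column-normalized adjacency matrix of the protein-protein interaction network.

```latex
p_{t+1} = (1 - r)\, W\, p_t + r\, p_0
```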

r = restart probability

p0, pt, p(t+1) = vectors of size n

Each is a vector in which the i-th element holds the probability that the walker is at node i at time step 0, t, and t+1, respectively
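A small sketch of how such a walk can be iterated until it converges: at every step a fraction 1 - r of the probability moves along the edges and a fraction r returns to the starting vector p0. The function name, the toy path network, and r = 0.5 are illustrative assumptions, not values taken from VarWalker.

```python
def random_walk_with_restart(adj, seeds, r=0.5, tol=1e-10, max_iter=1000):
    """adj: dict node -> set of neighbours (undirected); seeds: nodes the walker restarts from."""
    nodes = sorted(adj)
    # p0 spreads the starting probability evenly over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart part: a fraction r of the probability returns to the seeds.
        nxt = {n: r * p0[n] for n in nodes}
        # Walk part: each node passes (1 - r) of its probability equally to its neighbours.
        # (A node without neighbours would lose this share; the toy graph has none.)
        for u in nodes:
            if adj[u]:
                share = (1.0 - r) * p[u] / len(adj[u])
                for v in adj[u]:
                    nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p

# Toy network: a path A - B - C - D with a single seed (mutated gene) at A.
toy_adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
scores = random_walk_with_restart(toy_adj, seeds={"A"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

Nodes closer to the seed end up with higher steady-state probability, which is what allows close interactors of a mutated gene to be ranked.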

step3. Randomization-based evaluation of the candidate interactors

-> To check that the step 2 network did not arise by chance, 100 random networks are generated and compared to see how well the original network is preserved.

-> In addition, the top 10 nodes are extracted when drawing the 100 random networks
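One common way to run this kind of randomization test is sketched below. It assumes that the random networks are made by degree-preserving edge swaps and that a candidate is judged by an empirical p-value over the 100 random scores; the post does not spell out either choice, and all function names and numbers are hypothetical.

```python
import random
from collections import Counter

def degree_preserving_shuffle(edges, n_swaps, rng):
    """Rewire an undirected edge list with double-edge swaps, keeping every node's degree."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        # Proposed swap: a-b, c-d  ->  a-d, c-b. Skip self-loops and duplicate edges.
        if len({a, b, c, d}) < 4:
            continue
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges

def degrees(edges):
    """Number of edges touching each node."""
    return Counter(node for edge in edges for node in edge)

def empirical_p_value(observed, random_scores):
    """Share of random networks scoring at least as high as the real one (+1 correction)."""
    hits = sum(1 for s in random_scores if s >= observed)
    return (hits + 1) / (len(random_scores) + 1)

# Toy network and 100 randomized copies of it.
rng = random.Random(0)
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "A")]
random_nets = [degree_preserving_shuffle(toy_edges, 10, rng) for _ in range(100)]

# Made-up scores: a candidate scoring 0.9 on the real network and 0.1 on every random one.
p_val = empirical_p_value(0.9, [0.1] * 100)
```

In a real run, the random scores would come from repeating the step 2 walk on each randomized network with the same seeds.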

step4. Construction of a consensus mutation subnetwork

Each

A linear regression model is used to assess whether high-frequency edges occurred more often than expected
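The counting part of this step can be sketched as below: tally how many samples contain each edge and keep the recurrent ones. The fixed `min_samples` threshold is only a stand-in for the regression-based expectation mentioned above, and the per-sample networks are invented for illustration.

```python
from collections import Counter

def edge_frequencies(sample_networks):
    """Count in how many samples each undirected edge appears."""
    counts = Counter()
    for edges in sample_networks:
        # frozenset makes ("X", "Y") and ("Y", "X") the same edge; the set removes repeats.
        counts.update({frozenset(e) for e in edges})
    return counts

def consensus_edges(sample_networks, min_samples):
    """Keep edges seen in at least min_samples samples (threshold is an assumption)."""
    counts = edge_frequencies(sample_networks)
    return {tuple(sorted(e)): n for e, n in counts.items() if n >= min_samples}

# Invented per-sample subnetworks with placeholder gene names.
per_sample = [
    [("GENE_A", "GENE_B"), ("GENE_B", "GENE_C")],
    [("GENE_B", "GENE_A"), ("GENE_C", "GENE_D")],
    [("GENE_A", "GENE_B")],
]
consensus = consensus_edges(per_sample, min_samples=3)
```

Only the edge between GENE_A and GENE_B appears in all three toy samples, so it is the only one kept in the consensus subnetwork.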
