[2019] 2. Su Jin Jeong, Hyo Jung Lee, Soong Deok Lee, Seung Hwan Lee, Su Jeong Park, Jong Sik Kim, Jae Won Lee |
Date : 19-09-04 15:30
Publication; issue : 2019, Vol. 43, No. 3, p. 97
Classification of Common Relationships
Based on Short Tandem Repeat Profiles Using Data Mining
Korean J Leg Med 2019;43:97-105
Department of Statistics, Korea University, Seoul, Korea
Product Development HQ, Dong-A ST, Seoul, Korea
Department of Forensic Medicine, Seoul National University College of Medicine, Seoul, Korea
Forensic Science Division 2, Supreme Prosecutor’s Office, Seoul, Korea
E-mail: jael@korea.ac.kr
We reviewed past studies on the
identification of familial relationships using 22 short tandem repeat markers.
As a result, high discrimination power and a relatively accurate cut-off value
can be obtained for parent-child and full sibling relationships. However,
for uncle-nephew and first cousin pairs, the likelihood ratio (LR) method
has limited discrimination power. Therefore, we compared
the LR ranking method with data mining techniques (e.g., logistic regression,
linear discriminant analysis, diagonal linear discriminant analysis, diagonal
quadratic discriminant analysis, K-nearest neighbor, classification and
regression trees, support vector machines, random forest [RF], and penalized
multivariate analysis) that can be applied to identify familial relationships,
and we provide a guideline for choosing the most appropriate model in a given
situation. RF was found to be more accurate
than the other methods. The accuracy of RF was 99.99% for parent-child, 99.44% for
full siblings, 90.34% for uncle-nephew, and 79.69% for first cousins.
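For readers unfamiliar with the LR method mentioned above, the following is a minimal sketch of a per-locus kinship likelihood ratio against the hypothesis that the pair is unrelated. It is not the authors' implementation. It assumes Hardy-Weinberg equilibrium, no mutation, and independent loci, and it uses the standard identity-by-descent coefficients for each relationship. The function names, marker names, and allele frequencies are hypothetical and chosen only for illustration.

```python
from math import prod

# IBD coefficients (k0, k1, k2): probability that a pair shares 0, 1, or 2
# alleles identical by descent at a locus (standard textbook values).
IBD = {
    "parent-child": (0.0, 1.0, 0.0),
    "full siblings": (0.25, 0.5, 0.25),
    "uncle-nephew": (0.5, 0.5, 0.0),
    "first cousins": (0.75, 0.25, 0.0),
}


def genotype_prob(g, freq):
    """Population probability of an unordered genotype under Hardy-Weinberg."""
    c, d = g
    return freq[c] ** 2 if c == d else 2 * freq[c] * freq[d]


def prob_given_one_ibd(g1, g2, freq):
    """P(g2 | g1, exactly one allele IBD): one allele of g2 is copied from g1
    (either allele with probability 1/2), the other is drawn from the population."""
    c, d = g2
    total = 0.0
    for s in g1:  # s is the allele inherited from g1
        if c == d:
            total += 0.5 * (freq[c] if s == c else 0.0)
        else:
            total += 0.5 * ((freq[d] if s == c else 0.0)
                            + (freq[c] if s == d else 0.0))
    return total


def locus_lr(g1, g2, freq, relationship):
    """LR of the stated relationship versus 'unrelated' at a single locus."""
    k0, k1, k2 = IBD[relationship]
    p2 = genotype_prob(g2, freq)
    same = 1.0 if sorted(g1) == sorted(g2) else 0.0
    return k0 + k1 * prob_given_one_ibd(g1, g2, freq) / p2 + k2 * same / p2


def combined_lr(profile1, profile2, freqs, relationship):
    """Product of per-locus LRs (assumes the loci are independent)."""
    return prod(
        locus_lr(profile1[m], profile2[m], freqs[m], relationship)
        for m in profile1
    )


if __name__ == "__main__":
    # Hypothetical allele frequencies and profiles, for illustration only.
    freq = {"a": 0.1, "b": 0.2, "c": 0.3, "d": 0.4}
    freqs = {"M1": freq, "M2": freq}
    p1 = {"M1": ("a", "a"), "M2": ("a", "b")}
    p2 = {"M1": ("a", "a"), "M2": ("a", "b")}
    for rel in IBD:
        print(rel, combined_lr(p1, p2, freqs, rel))
```

Because uncle-nephew and first cousin pairs carry large k0 weights, their per-locus LRs stay closer to 1 than those of parent-child or full sibling pairs, which is consistent with the limited discrimination power noted in the abstract.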
Key Words:
Short tandem repeats; Kinship testing; Relationships; Likelihood ratio;
Data mining