K-means Clustering Example (1)

2017. 11. 14. 16:38

K-means Clustering test material

# Requires TensorFlow 1.x: tf.contrib was removed in TensorFlow 2.0
import tensorflow as tf
from tensorflow.contrib.factorization import KMeans
from tensorflow.python.framework import ops


k = 3  # number of clusters
num_features = 3  # number of features per data point (each line of the sample file has 3 values)

# Load the data
Data_X = []
with open("C:/Users/N3815/Desktop/sample_kmeans_data.txt", 'r') as f:
    for line in f:
        tokens = line.split()
        if not tokens:  # skip blank lines
            continue
        # Tokens after the first have the form "index:value"; keep only the value
        Data_X.append([float(token.split(":")[1]) for token in tokens[1:num_features + 1]])

print(Data_X)

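# Input placeholder: any number of rows, num_features columns
X = tf.placeholder(tf.float32, shape=[None, num_features])

kmeans = KMeans(inputs=X, num_clusters=k, distance_metric='squared_euclidean', use_mini_batch=True)

# Build the K-means graph
(all_scores, cluster_idx, scores, cluster_centers_initialized, init_op, train_op) = kmeans.training_graph()
cluster_idx = cluster_idx[0]  # cluster_idx is returned as a tuple; take its first element
avg_distance = tf.reduce_mean(scores)  # mean distance from each point to its assigned center

# Initialize the variables, then the cluster centers (init_op needs the input data)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
sess.run(init_op, feed_dict={X: Data_X})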

# Training (range(1, 100) runs 99 steps)
for i in range(1, 100):
    _, d, idx = sess.run([train_op, avg_distance, cluster_idx], feed_dict={X: Data_X})

# Check the results
print(idx, d)
for i in range(k):
    result = []
    for j in range(idx.size):
        if idx[j] == i:
            result.append(Data_X[j])
    # Print once per cluster, after all points have been checked
    print('Data belonging to cluster', i, ':', result)
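
For readers who cannot run the listing above (it depends on tf.contrib, which only exists in TensorFlow 1.x), the following is a minimal NumPy sketch of the same idea. It is an assumption-laden illustration, not the TensorFlow implementation: it uses plain full-batch Lloyd's algorithm rather than the mini-batch variant selected by use_mini_batch=True, the helper name kmeans_numpy and its parameters are hypothetical, and the points are hard-coded from the sample data in this post.

```python
import numpy as np

# The 11 sample points from this post (3 features each), hard-coded for the sketch.
data = np.array([
    [0.0, 0.0, 0.0], [0.1, 0.1, 0.1], [0.2, 0.2, 0.2],
    [9.0, 9.0, 9.0], [9.1, 9.1, 9.1], [9.2, 9.2, 9.2],
    [5.5, 2.5, 5.7], [5.2, 2.5, 5.3], [5.4, 5.9, 5.9],
    [0.1, 9.0, 9.1], [9.1, 9.2, 9.3],
], dtype=np.float32)


def squared_distances(points, centers):
    """Squared Euclidean distance from every point to every center, shape (n, k)."""
    return ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)


def kmeans_numpy(points, k, num_iters=100, seed=0):
    """Full-batch Lloyd's algorithm (hypothetical helper, not a TensorFlow API)."""
    rng = np.random.default_rng(seed)
    # Use k distinct data points as the initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(num_iters):
        labels = squared_distances(points, centers).argmin(axis=1)
        new_centers = centers.copy()
        for c in range(k):
            members = points[labels == c]
            if len(members) > 0:  # an empty cluster keeps its previous center
                new_centers[c] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break  # centers stopped moving
        centers = new_centers
    # Final assignment and mean distance to the assigned center.
    dists = squared_distances(points, centers)
    labels = dists.argmin(axis=1)
    avg_distance = dists.min(axis=1).mean()
    return labels, centers, avg_distance


labels, centers, avg_distance = kmeans_numpy(data, k=3)
print(labels, avg_distance)
for c in range(3):
    print('Data belonging to cluster', c, ':', data[labels == c].tolist())
```

As with the TensorFlow version, the grouping depends on how the centers are initialized, so a different seed may give a different (locally optimal) result.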

Contents of sample_kmeans_data.txt:

0 1:0.0 2:0.0 3:0.0
1 1:0.1 2:0.1 3:0.1
2 1:0.2 2:0.2 3:0.2
3 1:9.0 2:9.0 3:9.0
4 1:9.1 2:9.1 3:9.1
5 1:9.2 2:9.2 3:9.2
6 1:5.5 2:2.5 3:5.7
7 1:5.2 2:2.5 3:5.3
8 1:5.4 2:5.9 3:5.9
9 1:0.1 2:9.0 3:9.1
10 1:9.1 2:9.2 3:9.3
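
Each data line consists of a leading number followed by "index:value" pairs. The sketch below shows one way such a line could be parsed. It assumes the leading number is a row label (the listing in this post simply ignores it) and that the indices are 1-based; the helper name parse_sparse_line is hypothetical.

```python
def parse_sparse_line(line, num_features=3):
    """Parse '<label> 1:v1 2:v2 3:v3' into (label, [v1, v2, v3]).

    Hypothetical helper; assumes 1-based feature indices.
    """
    tokens = line.split()
    label = int(tokens[0])
    features = [0.0] * num_features
    for token in tokens[1:]:
        index, value = token.split(":")
        features[int(index) - 1] = float(value)  # convert 1-based index to 0-based
    return label, features


sample = """0 1:0.0 2:0.0 3:0.0

6 1:5.5 2:2.5 3:5.7
"""

# Blank lines are skipped before parsing.
rows = [parse_sparse_line(line) for line in sample.splitlines() if line.strip()]
print(rows)  # [(0, [0.0, 0.0, 0.0]), (6, [5.5, 2.5, 5.7])]
```

Placing each value by its index, rather than by position in the line, means a line that omits a feature still yields a vector of the right length, with 0.0 in the missing slot.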

Reference: http://iamksu.tistory.com/84

License: Attribution-NonCommercial-NoDerivatives


Ailyn's Tech Blog
