CS855 HW2

Problem 2:

Use Cha’s Dichotomy Model in a biometric authentication problem with two categories: within-class (differences between feature vectors of the same person) and between-class (differences between feature vectors of different people).

Given: feature vector samples (from problems 2 and 3 of the Midterm Examination). Transform these samples from the given feature vector space into the Dichotomy Model’s feature-vector-difference space and plot the resulting difference vectors. Use x’s for within-class samples and o’s for between-class samples. Where a point occurs more than once, write the number of duplicates next to its marker (there are many duplicates because the samples are distributed similarly within each class).

Calculate:
- the total number of samples in the feature difference space
- the number of within-class difference vector samples
- the number of between-class difference vector samples
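
The transformation described above can be sketched with the standard library alone. This is a minimal illustration rather than the notebook's own implementation: the names `classes`, `abs_diff`, `within`, and `between` are assumptions introduced here, and the sample coordinates are copied from `get_data()` further down.

```python
from itertools import combinations, product

# Class samples copied from get_data() in this notebook.
classes = {
    "red":   [(0, 4), (-4, 0), (4, 0), (0, -4)],
    "green": [(-6, 17), (-10, 13), (-2, 13), (-6, 9)],
    "blue":  [(10, 14), (6, 10), (14, 10), (10, 6)],
}

def abs_diff(p, q):
    """Component-wise absolute difference between two feature vectors."""
    return tuple(abs(a - b) for a, b in zip(p, q))

# Within-class: every unordered pair of samples taken from the same class.
within = [abs_diff(p, q)
          for pts in classes.values()
          for p, q in combinations(pts, 2)]

# Between-class: every pair of samples drawn from two different classes.
between = [abs_diff(p, q)
           for a, b in combinations(classes.values(), 2)
           for p, q in product(a, b)]

print(len(within), len(between), len(within) + len(between))
```

Taking the absolute value makes each difference independent of the order of the two samples, which is why unordered pairs are enough.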

In [26]:
# Plots of the original sample space and the difference sample space
--------- Plots after taking the differences ---------

In [12]:
# Sample counts in the difference space
The total number of samples in the feature difference space:
66
The number of within-class difference vector samples:
6 per class, 18 in total for 3 classes
The number of between-class difference vector samples:
16 per pair of classes, 48 in total for 3 pairs
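
These counts can be cross-checked combinatorially. The sketch below assumes 3 classes with 4 samples each, as in this data set; the variable names are hypothetical and `math.comb` requires Python 3.8 or later.

```python
from math import comb

n_classes = 3      # red, green, blue
n_per_class = 4    # samples in each class

# Every unordered pair of the 12 samples yields one difference vector.
total = comb(n_classes * n_per_class, 2)

# Pairs inside one class, summed over the classes.
within = n_classes * comb(n_per_class, 2)

# For each pair of classes, every sample of one is paired with every sample of the other.
between = comb(n_classes, 2) * n_per_class ** 2

print(total, within, between)
```

The within-class and between-class counts add up to the total, since every pair of samples falls into exactly one of the two categories.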

In [9]:
# points differences calculations
within-class sample differences:

RED class:
[[0, 4], [-4, 0], [4, 0], [0, -4]]
p1:[0 4]	p2:[-4  0]	DIST: [4, 4]
p1:[0 4]	p2:[4 0]	DIST: [4, 4]
p1:[0 4]	p2:[ 0 -4]	DIST: [0, 8]
p1:[-4  0]	p2:[4 0]	DIST: [8, 0]
p1:[-4  0]	p2:[ 0 -4]	DIST: [4, 4]
p1:[4 0]	p2:[ 0 -4]	DIST: [4, 4]
GREEN class (c2):
[[-6, 17], [-10, 13], [-2, 13], [-6, 9]]
p1:[-6 17]	p2:[-10  13]	DIST: [4, 4]
p1:[-6 17]	p2:[-2 13]	DIST: [4, 4]
p1:[-6 17]	p2:[-6  9]	DIST: [0, 8]
p1:[-10  13]	p2:[-2 13]	DIST: [8, 0]
p1:[-10  13]	p2:[-6  9]	DIST: [4, 4]
p1:[-2 13]	p2:[-6  9]	DIST: [4, 4]
BLUE class (c3):
[[10, 14], [6, 10], [14, 10], [10, 6]]
p1:[10 14]	p2:[ 6 10]	DIST: [4, 4]
p1:[10 14]	p2:[14 10]	DIST: [4, 4]
p1:[10 14]	p2:[10  6]	DIST: [0, 8]
p1:[ 6 10]	p2:[14 10]	DIST: [8, 0]
p1:[ 6 10]	p2:[10  6]	DIST: [4, 4]
p1:[14 10]	p2:[10  6]	DIST: [4, 4]

between-class sample differences:

RED-to-GREEN (c1-to-c2):
[[0, 4], [-4, 0], [4, 0], [0, -4]] 
[[-6, 17], [-10, 13], [-2, 13], [-6, 9]]
p1:[0 4]	p2:[-6 17]	DIST:[6, 13]
p1:[0 4]	p2:[-10  13]	DIST:[10, 9]
p1:[0 4]	p2:[-2 13]	DIST:[2, 9]
p1:[0 4]	p2:[-6  9]	DIST:[6, 5]
p1:[-4  0]	p2:[-6 17]	DIST:[2, 17]
p1:[-4  0]	p2:[-10  13]	DIST:[6, 13]
p1:[-4  0]	p2:[-2 13]	DIST:[2, 13]
p1:[-4  0]	p2:[-6  9]	DIST:[2, 9]
p1:[4 0]	p2:[-6 17]	DIST:[10, 17]
p1:[4 0]	p2:[-10  13]	DIST:[14, 13]
p1:[4 0]	p2:[-2 13]	DIST:[6, 13]
p1:[4 0]	p2:[-6  9]	DIST:[10, 9]
p1:[ 0 -4]	p2:[-6 17]	DIST:[6, 21]
p1:[ 0 -4]	p2:[-10  13]	DIST:[10, 17]
p1:[ 0 -4]	p2:[-2 13]	DIST:[2, 17]
p1:[ 0 -4]	p2:[-6  9]	DIST:[6, 13]
GREEN-to-BLUE (c2-to-c3):
[[-6, 17], [-10, 13], [-2, 13], [-6, 9]] 
[[10, 14], [6, 10], [14, 10], [10, 6]]
p1:[-6 17]	p2:[10 14]	DIST:[16, 3]
p1:[-6 17]	p2:[ 6 10]	DIST:[12, 7]
p1:[-6 17]	p2:[14 10]	DIST:[20, 7]
p1:[-6 17]	p2:[10  6]	DIST:[16, 11]
p1:[-10  13]	p2:[10 14]	DIST:[20, 1]
p1:[-10  13]	p2:[ 6 10]	DIST:[16, 3]
p1:[-10  13]	p2:[14 10]	DIST:[24, 3]
p1:[-10  13]	p2:[10  6]	DIST:[20, 7]
p1:[-2 13]	p2:[10 14]	DIST:[12, 1]
p1:[-2 13]	p2:[ 6 10]	DIST:[8, 3]
p1:[-2 13]	p2:[14 10]	DIST:[16, 3]
p1:[-2 13]	p2:[10  6]	DIST:[12, 7]
p1:[-6  9]	p2:[10 14]	DIST:[16, 5]
p1:[-6  9]	p2:[ 6 10]	DIST:[12, 1]
p1:[-6  9]	p2:[14 10]	DIST:[20, 1]
p1:[-6  9]	p2:[10  6]	DIST:[16, 3]
RED-to-BLUE (c1-to-c3):
[[0, 4], [-4, 0], [4, 0], [0, -4]] 
[[10, 14], [6, 10], [14, 10], [10, 6]]
p1:[0 4]	p2:[10 14]	DIST:[10, 10]
p1:[0 4]	p2:[ 6 10]	DIST:[6, 6]
p1:[0 4]	p2:[14 10]	DIST:[14, 6]
p1:[0 4]	p2:[10  6]	DIST:[10, 2]
p1:[-4  0]	p2:[10 14]	DIST:[14, 14]
p1:[-4  0]	p2:[ 6 10]	DIST:[10, 10]
p1:[-4  0]	p2:[14 10]	DIST:[18, 10]
p1:[-4  0]	p2:[10  6]	DIST:[14, 6]
p1:[4 0]	p2:[10 14]	DIST:[6, 14]
p1:[4 0]	p2:[ 6 10]	DIST:[2, 10]
p1:[4 0]	p2:[14 10]	DIST:[10, 10]
p1:[4 0]	p2:[10  6]	DIST:[6, 6]
p1:[ 0 -4]	p2:[10 14]	DIST:[10, 18]
p1:[ 0 -4]	p2:[ 6 10]	DIST:[6, 14]
p1:[ 0 -4]	p2:[14 10]	DIST:[14, 14]
p1:[ 0 -4]	p2:[10  6]	DIST:[10, 10]

In [14]:
# Difference Feature vectors
Differences as feature vectors:

c1 (red):
 [[4 4 0 8 4 4]
 [4 4 8 0 4 4]]
c2 (green):
 [[4 4 0 8 4 4]
 [4 4 8 0 4 4]]
c3 (blue):
 [[4 4 0 8 4 4]
 [4 4 8 0 4 4]]
between c1 and c2 (red-green):
[[ 6 10  2  6  2  6  2  2 10 14  6 10  6 10  2  6]
 [13  9  9  5 17 13 13  9 17 13 13  9 21 17 17 13]]
between c2 and c3 (green-blue):
[[16 12 20 16 20 16 24 20 12  8 16 12 16 12 20 16]
 [ 3  7  7 11  1  3  3  7  1  3  3  7  5  1  1  3]]
between c1 and c3 (red-blue):
[[10  6 14 10 14 10 18 14  6  2 10  6 10  6 14 10]
 [10  6  6  2 14 10 10  6 14 10 10  6 18 14 14 10]]
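
The problem statement asks for the number of duplicates next to each marker. One way to obtain those counts is sketched below for the within-class differences. It assumes the six within-class difference vectors listed above are the same for all three classes, as the output shows; `within`, `counts`, and `labels` are names introduced here for illustration.

```python
from collections import Counter

# Within-class difference vectors, copied from the c1/c2/c3 output above
# (first row = first component, second row = second component).
xs = [4, 4, 0, 8, 4, 4]
ys = [4, 4, 8, 0, 4, 4]
within = list(zip(xs, ys)) * 3   # the same six differences occur in each of the 3 classes

# Count how often each distinct difference vector occurs.
counts = Counter(within)

# Text to place next to each marker: the count when a point is duplicated, else nothing.
labels = {pt: (str(n) if n > 1 else "") for pt, n in counts.items()}

for pt, n in sorted(counts.items()):
    print(pt, n)
```

Each non-empty label could then be drawn beside its point, for example with `plt.annotate(label, xy=pt)`. The same counting applies unchanged to the between-class difference vectors.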

In [1]:
__author__ = "A.Aziz Altowayan"
__email__ = "aa10212w@pace.edu"

Wed Nov 5 23:10:23 EST 2014

In [2]:
import matplotlib.pyplot as plt
import numpy as np
In [3]:
# data
def get_data():
    m1 = np.array([[0,4],[-4,0],[4,0],[0,-4]])
    m2 = np.array([[-6,17],[-10,13],[-2,13],[-6,9]])
    m3 = np.array([[10,14],[6,10],[14,10],[10,6]])
    return m1, m2, m3
In [5]:
# component-wise absolute difference between 2 points
# (a 2-element difference vector, not a scalar distance)
def dist_point(p0, p1):
    return abs(p0[0] - p1[0]), abs(p0[1] - p1[1])
In [19]:
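# get differences between every point of c1 and every point of c2
def get_out_distances(c1,c2):
    if len(c1) < 1 or len(c2) < 1:
        return np.array([])  # empty result, so callers can still use .T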
    distances = []
    def rec(c1):
        if len(c1) < 1:
            return
        for i in range(len(c2)):
            d = list(dist_point(c1[0], c2[i]))
            print("p1:{}\tp2:{}\tDIST:{}".format(c1[0],c2[i],d))
            distances.append(d)
        return rec(c1[1:])
    rec(c1)
    return np.array(distances)

# get distances of inside-class points
def get_in_distances(c):
    distances = []
    def rec(points):
        if len(points) < 1:
            return
        for i in range(len(points)-1):
            d = list(dist_point(points[0], points[i+1]))
            print("p1:{}\tp2:{}\tDIST: {}".format(points[0],points[i+1], d))
            distances.append(d)
        return rec(points[1:])
    rec(c)
    return np.array(distances)
In [25]:
c1, c2, c3 = get_data()

# original points
w1, w2, w3 = c1.T, c2.T, c3.T

plt.scatter(w1[0],w1[1], c='r', label="c1")
plt.scatter(w2[0],w2[1], c='g', label="c2")
plt.scatter(w3[0],w3[1], c='b', label="c3")
plt.grid()
plt.title("Original 3 classes samples")
plt.legend(bbox_to_anchor = (1.5, 1))
plt.show()

# in-class distances
d1 = get_in_distances(c1).T
d2 = get_in_distances(c2).T
d3 = get_in_distances(c3).T

# outside-class distances
d12 = get_out_distances(c1,c2).T
d23 = get_out_distances(c2,c3).T
d13 = get_out_distances(c1,c3).T

print("--------- Plots after taking the differences ---------")
# plotting
plt.scatter(d1[0],d1[1], c='r', marker='x', label="dist: in-class c1")
plt.scatter(d2[0],d2[1], c='g', marker='x', label="dist: in-class c2")
plt.scatter(d3[0],d3[1], c='b', marker='x', label="dist: in-class c3")
plt.grid()
plt.title("in-class for all 3 classes")
plt.legend(bbox_to_anchor = (1.5, 1))
plt.show()

plt.scatter(d12[0],d12[1], c='c', label="dist: c1-c2")
plt.grid()
plt.title("Between c1 and c2")
plt.legend(bbox_to_anchor = (1.5, 1))
plt.show()

plt.scatter(d23[0],d23[1], c='y', label="dist: c2-c3")
plt.grid()
plt.title("Between c2 and c3")
plt.legend(bbox_to_anchor = (1.5, 1))
plt.show()

plt.scatter(d13[0],d13[1], c='r', label="dist: c1-c3")
plt.grid()
plt.title("Between c1 and c3")
plt.legend(bbox_to_anchor = (1.5, 1))
plt.show()

plt.scatter(d12[0],d12[1], c='c', label="dist: c1-c2")
plt.scatter(d23[0],d23[1], c='y', label="dist: c2-c3")
plt.scatter(d13[0],d13[1], c='r', label="dist: c1-c3")

plt.scatter(d1[0],d1[1], c='r', marker='x', label="dist: in-class c1")
plt.scatter(d2[0],d2[1], c='g', marker='x', label="dist: in-class c2")
plt.scatter(d3[0],d3[1], c='b', marker='x', label="dist: in-class c3")
plt.grid()
plt.title("All together")
plt.legend(bbox_to_anchor = (1.5, 1))
plt.show()

# concatenate results in one matrix
inside = np.concatenate((d1,d2,d3),axis=1)
outside = np.concatenate((d12,d23,d13),axis=1)
In [11]:
print("The total number of samples in the feature difference space:\n{}".format(len(d1.T) + len(d2.T) + len(d3.T) + len(d12.T) + len(d13.T) + len(d23.T)))
print("The number of within-class difference vector samples:\n{} per class, {} in total for 3 classes".format(len(d1.T), len(d1.T) + len(d2.T) + len(d3.T)))
print("The number of between-class difference vector samples:\n{} per pair of classes, {} in total for 3 pairs".format(len(d12.T), len(d12.T) + len(d13.T) + len(d23.T)))
In [16]:
print("within-class sample differences:\n")
print("RED class:\n{}".format([list(x) for x in c1]))
get_in_distances(c1)
print("GREEN class (c2):\n{}".format([list(x) for x in c2]))
get_in_distances(c2)
print("BLUE class (c3):\n{}".format([list(x) for x in c3]))
get_in_distances(c3)

print("\nbetween-class sample differences:\n")
print("RED-to-GREEN (c1-to-c2):\n{} \n{}".format([list(x) for x in c1], [list(x) for x in c2]))
get_out_distances(c1, c2)
print("GREEN-to-BLUE (c2-to-c3):\n{} \n{}".format([list(x) for x in c2], [list(x) for x in c3]))
get_out_distances(c2, c3)
print("RED-to-BLUE (c1-to-c3):\n{} \n{}".format([list(x) for x in c1], [list(x) for x in c3]))
get_out_distances(c1, c3)
In [17]:
print("Differences as feature vectors:\n")
print("c1 (red):\n {}".format(d1))
print("c2 (green):\n {}".format(d2))
print("c3 (blue):\n {}".format(d3))
print("between c1 and c2 (red-green):\n{}".format(d12))
print("between c2 and c3 (green-blue):\n{}".format(d23))
print("between c1 and c3 (red-blue):\n{}".format(d13))