HTML

              
<!-- The library is included in this pen's javascript settings, but the following is also possible: -->
<!--<script src="https://rawgit.com/StefanieStoppel/d3ML/master/lib/d3ml.min.js"></script>-->
<h1 class="center">K-Nearest Neighbor (KNN)</h1>
<p class="center"><strong>A usage example of the <a href="https://github.com/StefanieStoppel/d3ML" target="_blank" title="d3ML repository on GitHub">d3ML</a> JavaScript library.</strong></p>
<p class="center">Click on the grey area to create new circles and see how their color changes based on their neighbors.</p>
<div id="knn"></div>
<div class="description">
<p>
  In the visualization you can see how the <strong>k-nearest neighbor (KNN)</strong> algorithm
  <strong>classifies</strong> new data points based on their <strong>k</strong> nearest neighbors, where the number of neighbors k is specified using the slider above. The new point is assigned the <strong>class</strong> held by the majority of its k nearest neighbors.
  <br><br>
  The initial circles are our "training" data set. The data set is divided into two classes: <strong>red and blue</strong>.
  When we add a new circle, we want to find out which class (aka color) we need to assign it to, depending on its neighbors.
  <br><br>
  What's going on behind the scenes after you add a circle:
</p>
<ol>
  <li>The algorithm goes through all other circles and calculates their distance from your new circle.</li>
  <li>It then sorts them by distance from your circle in ascending order, meaning the circles with the smallest distances to the new circle come first.</li>
  <li>It picks the first k entries from the result of step 2.</li>
  <li>It looks at the k nearest neighbors' classes. If the majority of them are blue, our new point will be blue as well. If most of them are red, then the new point will be red.</li>
</ol>
<h3>The weighted KNN</h3>
<p>
  If you check the <strong>"Weighted" checkbox</strong> above, the algorithm is adjusted so that the <strong>inverse of the distance</strong> of each of the k neighbors is taken into account. This matters during "ties", meaning that you chose an even number of neighbors k and half of them are class red while the other half are blue.
  If you don't use the weighted version of KNN in this case, the nearest neighbor will <strong>ALWAYS</strong> win, so that your circle will take on its color.
</p>
<h4>Let's look at an example:</h4>
<p>
  Imagine you chose <strong>k=4</strong> and the nearest neighbors are <strong>[(Blue,2), (Blue,500), (Red,3), (Red,4)]</strong>, where the numbers represent the distances from the new circle. In the unweighted KNN, the nearest neighbor's class - in this case Blue - would automatically win. But look at the distances: The two red circles are much closer to the new circle than the second blue circle. It might not be the right decision to just go on the closest circle and not take the others into account!
<br><br>
    When using the <strong>weighted KNN</strong>, the way of determining which class to assign to the new circle is different:<br>
  The weighted KNN sums up the <strong>inverted distance</strong> of all the nearest neighbors belonging to the <strong>same class</strong>. It then checks which of the classes' <strong>sums is the greatest</strong> and assigns the corresponding class to the new circle.<br><br>
  In our example the weights of the classes are:
</p>
<ul>
  <li>Weight of Blue = 1/2 + 1/500 = 0.502</li>
  <li>Weight of Red = 1/3 + 1/4 ≈ 0.583</li>
</ul>
<p>
Because the weight of class Red is greater than that of class Blue, the new circle will be <strong>red</strong>!
</p>
</div>
              
            

CSS

              
body {
  display: flex;
  flex-direction: column;
  align-items: center;
  font-family: Helvetica;
  background-color: #33353d;
  color: white;
}
a {
  color: yellow;
}
h1 {
  margin-top: 0.4em;
  margin-bottom: 0;
}
p {
  margin-top: 0.5em;
}
.center {
  text-align: center;
}
.d3ml__settings {
  display: flex;
  justify-content: space-around;
}
.d3ml__settings__group:first-of-type {
  display: flex;
  flex-direction: column;
}
.description {
  width: 80%;
}
              
            

JS

              
const displayWidth = window.innerWidth - 25;
const displayHeight = 450;
const dataSetSize = 250;
const options = {
  rootNode: '#knn',
  width: displayWidth,
  height: displayHeight,
  backgroundColor: 'black',
  circleFill: '#3fe4h2', // note: 'h' is not a hex digit, so this is not a valid CSS color
  circleStroke: 'white'
};
const types = ['A', 'B'];

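// For integer bounds, returns a random integer in [min, max] (inclusive).
function getRandomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}
// Picks a point uniformly inside an ellipse centred on (cx, cy), then adds a
// random offset of roughly half the width/height per axis to spread the cluster.
function createRandomEllipsoidCoordinates(width, height, cx, cy) {
  // The square root keeps points evenly spread over the area instead of
  // bunching them near the centre.
  const rho = Math.sqrt(Math.random());
  const phi = Math.random() * Math.PI * 2;
  const rands = {x: getRandomInt(-width/2, width/2), y: getRandomInt(-height/2, height/2)};
  const x = (rho * Math.cos(phi) * width/2) + cx + rands.x;
  const y = (rho * Math.sin(phi) * height/2) + cy + rands.y;
  return {x, y};
}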
function createRandomData() {
  const ellipsoidOptions = {
    'A': {
      width: displayWidth/3,
      height: displayWidth/3,
      cx: displayWidth/3,
      cy: displayHeight/3
    },
    'B': {
       width: displayWidth/2.5,
       height: displayWidth/2.5,
       cx: displayWidth*0.663, 
       cy: displayHeight*0.66
    }
  };
  // Build dataSetSize points, each randomly assigned to one of the two types.
  return Array.from({length: dataSetSize}, () => {
    const type = Math.random() > 0.5 ? types[0] : types[1];
    const {width, height, cx, cy} = ellipsoidOptions[type];
    const {x, y} = createRandomEllipsoidCoordinates(width, height, cx, cy);
    return {x, y, type};
  });
}
const data = createRandomData();
const k = 3;
const vis = new d3ml.KNNVisualization(data, options, types, k);
vis.draw(); 
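
The classification steps and the weighted vote described in the pen's text can be sketched in plain JavaScript as follows. This is an illustration under stated assumptions, not d3ML's actual implementation: the function names `nearestNeighbors`, `majorityVote` and `weightedVote`, and the `{type, distance}` neighbor shape, are hypothetical and were chosen only for this example. The tie rule in `majorityVote` follows the text's description that the nearest neighbor wins a tie.

```javascript
// Hypothetical helpers for illustration; not part of the d3ML API.

// Steps 1-3: compute each point's distance to the target, sort in
// ascending order, and keep the first k entries.
function nearestNeighbors(points, target, k) {
  return points
    .map(p => ({
      type: p.type,
      distance: Math.hypot(p.x - target.x, p.y - target.y)
    }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k);
}

// Step 4, unweighted: every neighbor casts one vote. On a tie, the class
// of the nearest neighbor among the tied classes wins.
function majorityVote(neighbors) {
  const sorted = [...neighbors].sort((a, b) => a.distance - b.distance);
  const votes = {};
  for (const n of sorted) {
    votes[n.type] = (votes[n.type] || 0) + 1;
  }
  const best = Math.max(...Object.values(votes));
  return sorted.find(n => votes[n.type] === best).type;
}

// Weighted: every neighbor adds 1 / distance to its class, and the class
// with the largest sum wins. A distance of 0 yields Infinity, so an exact
// overlap decides the vote outright.
function weightedVote(neighbors) {
  const weights = {};
  for (const n of neighbors) {
    weights[n.type] = (weights[n.type] || 0) + 1 / n.distance;
  }
  return Object.entries(weights).sort((a, b) => b[1] - a[1])[0][0];
}

// The worked example from the text: k = 4, two Blue and two Red neighbors.
const example = [
  { type: 'Blue', distance: 2 },
  { type: 'Blue', distance: 500 },
  { type: 'Red', distance: 3 },
  { type: 'Red', distance: 4 }
];
console.log(majorityVote(example)); // Blue (2-2 tie, nearest neighbor is Blue)
console.log(weightedVote(example)); // Red  (1/3 + 1/4 > 1/2 + 1/500)
```

Sorting all distances is the simplest way to mirror the steps in the text; for a few hundred points, as in this pen, it is cheap enough to run on every click.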
              
            