In this tutorial we are going to build complex user interaction events, such as drag, drop, and vertical and horizontal swipes, using RxJS.

If you implement these aggregated events from scratch without RxJS, you will probably need a lot of state variables. For example, you might maintain a boolean isMouseDown to decide when to handle mouse move events and when to ignore them because the button is not pressed. For more complex cases this approach quickly becomes long and hard to maintain, and it is easy to end up with a few bugs in your code.
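To make the problem concrete, here is a minimal sketch of the manual approach for a plain mouse drag. The helper createManualDragTracker and its handler names are made up for this illustration, and the browser wiring is shown only in comments.

```javascript
// Hypothetical helper: tracks a mouse drag by hand, without RxJS.
function createManualDragTracker(onDrag) {
  let isMouseDown = false; // state flag we have to keep in sync ourselves
  let startX = 0;
  let startY = 0;

  return {
    handleMouseDown(event) {
      isMouseDown = true;
      startX = event.clientX;
      startY = event.clientY;
    },
    handleMouseMove(event) {
      if (!isMouseDown) return; // ignore moves while the button is not pressed
      onDrag({ x: event.clientX - startX, y: event.clientY - startY });
    },
    handleMouseUp() {
      isMouseDown = false;
    }
  };
}

// Possible wiring in a browser (assumes domItem is an existing element):
// const tracker = createManualDragTracker(delta => console.log(delta));
// domItem.addEventListener('mousedown', tracker.handleMouseDown);
// window.addEventListener('mousemove', tracker.handleMouseMove);
// window.addEventListener('mouseup', tracker.handleMouseUp);
```

Even this small version needs three pieces of shared state. Touch support, drop events and swipe detection would each add more flags to keep consistent.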

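This tutorial requires ES6 knowledge, as all the examples use arrow function notation, but it does not require prior RxJS knowledge. To run the examples below, you need to include the RxJS library. Note that the examples use the chained operator style of older RxJS releases (Rx.Observable.fromEvent(...).map(...)), not the pipeable operators introduced in RxJS 6.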

What is RxJS?

RxJS is a reactive programming library for JavaScript for composing asynchronous and event-based programs by using observable sequences. An Observable is a representation of any set of values over any amount of time. Take events, for example: they happen over time and their payload contains some values. You can think of Observables as data streams. The good thing about them is that you can not only observe them, but also create new streams by altering, filtering or mixing them in different ways to build more complex Observables. You do all of this reactively, meaning you don't have to wait until all the data is in place (until every move event has already happened); you can react to each chunk of data exactly when it arrives. RxJS is not about telling the program what to do next, but rather about how to react once something happens and how to handle these data flows.

The other thing you will notice is that it uses the functional programming paradigm. I won't give a proper explanation of it here, but you will see that instead of separate statements one after another, there is a lot of function chaining, like thing.map(transposeFunc).merge(otherThing).subscribe(handleFunc). At first this approach might be hard to follow, but you will soon find it easy, and you will realize that you can do much more with less code. And not only because you put everything on one line.

Turning events into Observables

For starters, let's just define a few Observables. Nothing special happens here; we are just turning the mouse events into Observables:

function getObservables(domItem) {
  const mouseDowns = Rx.Observable.fromEvent(domItem, "mousedown");
  const mouseMoves = Rx.Observable.fromEvent(window, "mousemove");
  const mouseUps = Rx.Observable.fromEvent(window, "mouseup");

  return { mouseDowns, mouseMoves, mouseUps };
}

const domItem = document.getElementById('content');
const observables = getObservables(domItem);

observables.mouseDowns.forEach(event => {
  console.log('Mouse down');        
});

observables.mouseMoves.forEach(event => {
  console.log('Mouse move');        
});

observables.mouseUps.forEach(event => {
  console.log('Mouse up');        
});

For now we only transformed some events to observables, then handled them with a console.log statement to see that they work. Yes, you could have just used the addEventListener function to achieve the same result, but stay with me, it gets better.

Handling mouse and touch events the same way

In the real world you might need to handle touch events as well, and most probably you want to handle them more or less the same way as mouse events. The good thing about streams is that they can be merged. But once you start doing that, you realize that mouse events and touch events have different formats. Touch events can contain multiple touches, for instance. So what do we do?

function getObservables(domItem) {
  const mouseEventToCoordinate = mouseEvent => {
    mouseEvent.preventDefault();
    return {
      x: mouseEvent.clientX, 
      y: mouseEvent.clientY
    };
  };

  const touchEventToCoordinate = touchEvent => {
    touchEvent.preventDefault();
    return {
      x: touchEvent.changedTouches[0].clientX, 
      y: touchEvent.changedTouches[0].clientY
    };
  };

  const mouseDowns = Rx.Observable.fromEvent(domItem, "mousedown").map(mouseEventToCoordinate);
  const mouseMoves = Rx.Observable.fromEvent(window, "mousemove").map(mouseEventToCoordinate);
  const mouseUps = Rx.Observable.fromEvent(window, "mouseup").map(mouseEventToCoordinate);

  const touchStarts = Rx.Observable.fromEvent(domItem, "touchstart").map(touchEventToCoordinate);
  const touchMoves = Rx.Observable.fromEvent(domItem, "touchmove").map(touchEventToCoordinate);
  const touchEnds = Rx.Observable.fromEvent(window, "touchend").map(touchEventToCoordinate);

  const starts = mouseDowns.merge(touchStarts);
  const moves = mouseMoves.merge(touchMoves);
  const ends = mouseUps.merge(touchEnds);

  return { starts, moves, ends };
}

const domItem = document.getElementById('content');
const observables = getObservables(domItem);

observables.moves.forEach(coordinate => {
  console.log('Moved to', coordinate.x, coordinate.y);        
});

Just as creeks unite into a river, streams can be merged with the merge function. Each new stream will contain both the corresponding mouse events and the touch events.

But before we do that, notice that our original Observables are passed through the map function to transform them into the same format. This map function behaves like the Array.prototype.map function: it takes every item in the stream one by one and converts it to a new format. In RxJS, though, this does not happen once over the existing static content of an array; it happens reactively, every time a new item appears in the stream. This way we discard most of the event payload (we won't have multi-touch data anymore, for example), but in return every item has the same {x, y} payload from here on.
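The comparison with arrays can be tried out directly. The sketch below applies the same kind of mapper functions to plain arrays of hand-written stand-in objects. The fake events are hypothetical, and preventDefault is left out so that the code runs outside a browser.

```javascript
// Same shape as the tutorial's mappers, minus the preventDefault call.
const mouseToCoordinate = e => ({ x: e.clientX, y: e.clientY });
const touchToCoordinate = e => ({
  x: e.changedTouches[0].clientX,
  y: e.changedTouches[0].clientY
});

// Hypothetical, hand-written stand-ins for real browser events.
const fakeMouseEvents = [
  { type: 'mousemove', clientX: 10, clientY: 20, button: 0 },
  { type: 'mousemove', clientX: 12, clientY: 25, button: 0 }
];
const fakeTouchEvents = [
  {
    type: 'touchmove',
    // Two fingers on the screen; the mapper only keeps the first one.
    changedTouches: [{ clientX: 30, clientY: 40 }, { clientX: 90, clientY: 95 }]
  }
];

const mouseCoordinates = fakeMouseEvents.map(mouseToCoordinate);
const touchCoordinates = fakeTouchEvents.map(touchToCoordinate);

console.log(mouseCoordinates); // two {x, y} objects: (10, 20) and (12, 25)
console.log(touchCoordinates); // one {x, y} object: (30, 40)
```

After mapping, both arrays hold items of the same {x, y} shape, which is what makes merging the mouse and touch streams straightforward.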

You can also see that our mapper functions contain a preventDefault call. Most probably you want to avoid scrolling the page when you try to move an item. Also, touch devices assume that you do not handle touch events, so they try to help by generating mouse events once a touch event happens. You don't want this either, but fortunately the preventDefault function prevents that as well.

Basic drag & drop

Now that we have the basics, let's put RxJS to work by creating a drag and a drop event.

function getObservables(domItem) {

  // ...
  // To keep it short the first part of this method is removed from here
  // Check the previous example to see how it is
  // ...

  const starts = mouseDowns.merge(touchStarts);
  const moves = mouseMoves.merge(touchMoves);
  const ends = mouseUps.merge(touchEnds);

  const drags = starts.concatMap(dragStartEvent => 
    moves.takeUntil(ends).map(dragEvent => {
      const x = dragEvent.x - dragStartEvent.x;
      const y = dragEvent.y - dragStartEvent.y;
      return {x, y};
    })
  );

  const drops = starts.concatMap(dragStartEvent => 
    ends.first().map(dragEndEvent => {
      const x = dragEndEvent.x - dragStartEvent.x;
      const y = dragEndEvent.y - dragStartEvent.y;
      return {x, y};
    })
  );

  return { drags, drops };
}

Okay, this is where it starts to get complicated. But before we get into the code, let's define what a drag event means. Once a mouseDown / touchStart event occurs, the drag events are the sequence of move events that happen before the next mouseUp / touchEnd event.

In the code the same thing happens, but at first it can be a bit confusing. Even though the definition of the drags Observable begins with the starts Observable, the result is actually a sequence of move events. That's what the concatMap method does here. Once a start event occurs, it does not emit the start event itself; instead it subscribes to the inner sequence returned by the function passed to concatMap and emits that sequence's items, which in this case are move events. To be more precise, it does not emit every move event, only those that occur until a mouseUp or touchEnd event. That's what the moves.takeUntil(ends) part means.

While we drag things we also need to measure the movement itself. Previously we mapped each mouse and touch event so that they all carry an {x, y} coordinate. Now we can use this to calculate the distance moved. Luckily, because the move events are handled inside the function passed to the starts stream's concatMap, we have access both to the start event's payload and to the current move event's payload, so with a simple map function we can calculate the offset from the starting point and return it as the new payload of the drag event.

[Figure: drag event explanation]

The drop event is more or less the same, except that we do not care about the individual movement steps; we only want to know when the mouse button is released or when your finger leaves the screen. That's why the function passed to the drops stream's concatMap returns the first end event after a start event. Notice that the ends.first() part does not give back a whole sequence of end events once a start event occurs, only the first one.
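If the stream definitions feel abstract, it can help to replay a recorded gesture by hand. The sketch below is a plain JavaScript model of what the drags and drops streams are expected to emit. The replayDragAndDrop helper and the timeline format are hypothetical, and the model assumes that start and end events strictly alternate. It deliberately uses the kind of state variable that the RxJS version avoids, and it says nothing about how RxJS works internally.

```javascript
// Offset of an event from the start of the current drag.
const deltaFrom = (start, event) => ({ x: event.x - start.x, y: event.y - start.y });

// Hypothetical helper: replays normalized events of the form
// { type: 'start' | 'move' | 'end', x, y } and collects the expected emissions.
function replayDragAndDrop(timeline) {
  const drags = [];
  const drops = [];
  let start = null; // null while no drag is in progress

  for (const event of timeline) {
    if (event.type === 'start' && start === null) {
      start = event;
    } else if (event.type === 'move' && start !== null) {
      drags.push(deltaFrom(start, event)); // like moves.takeUntil(ends)
    } else if (event.type === 'end' && start !== null) {
      drops.push(deltaFrom(start, event)); // like ends.first()
      start = null;
    }
    // Moves and ends outside of a drag are ignored.
  }
  return { drags, drops };
}

const result = replayDragAndDrop([
  { type: 'move', x: 1, y: 1 },     // before any start: ignored
  { type: 'start', x: 10, y: 10 },
  { type: 'move', x: 13, y: 14 },
  { type: 'move', x: 20, y: 5 },
  { type: 'end', x: 20, y: 5 },
  { type: 'move', x: 99, y: 99 }    // after the end: ignored
]);

console.log(result.drags); // offsets (3, 4) and (10, -5)
console.log(result.drops); // offset (10, -5)
```

Comparing the output of the real streams against a model like this is one way to check that a gesture is being interpreted as intended.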

You can find a working example using the code above here:

Vertical and horizontal swipes

Vertical and horizontal swipes are a special kind of drag and drop event: they occur in one direction. How exactly you define them is up to you (they might also involve velocity, for example), but for the sake of simplicity we define them here as drags and drops that initially appear to move in a vertical or horizontal direction.

function getObservables(domItem) {

  // ...
  // To keep it short the first part of this method is removed from here
  // Check the second example to see how it is
  // ...

  const starts = mouseDowns.merge(touchStarts);
  const moves = mouseMoves.merge(touchMoves);
  const ends = mouseUps.merge(touchEnds);

  // Move starts with direction: pair each start event with the move event at index 3
  // (elementAt is zero-based, so this is the fourth move after the start),
  // but only if no end event happens in between
  let moveStartsWithDirection = starts.concatMap(dragStartEvent => 
    moves
      .takeUntil(ends)
      .elementAt(3)
      // elementAt errors if the moves end before index 3 is reached; treat that as "no swipe"
      .catch(err => Rx.Observable.empty())
      .map(dragEvent => {
        const initialDeltaX = dragEvent.x - dragStartEvent.x;
        const initialDeltaY = dragEvent.y - dragStartEvent.y;
        return {x: dragStartEvent.x, y: dragStartEvent.y, initialDeltaX, initialDeltaY};
      })
  );

  // Vertical move starts: keep only those move start events
  // where the sampled move event is more vertical than horizontal
  let verticalMoveStarts = moveStartsWithDirection.filter(dragStartEvent => 
    Math.abs(dragStartEvent.initialDeltaX) < Math.abs(dragStartEvent.initialDeltaY)
  );

  // Horizontal move starts: keep only those move start events
  // where the sampled move event is at least as horizontal as it is vertical
  let horizontalMoveStarts = moveStartsWithDirection.filter(dragStartEvent => 
    Math.abs(dragStartEvent.initialDeltaX) >= Math.abs(dragStartEvent.initialDeltaY)
  );

  // Take the moves until an end occurs
  const movesUntilEnds = dragStartEvent => 
    moves.takeUntil(ends).map(dragEvent => {
      const x = dragEvent.x - dragStartEvent.x;
      const y = dragEvent.y - dragStartEvent.y;
      return {x, y};
    });

  let verticalMoves = verticalMoveStarts.concatMap(movesUntilEnds);
  let horizontalMoves = horizontalMoveStarts.concatMap(movesUntilEnds);

  const lastMovesAtEnds = dragStartEvent => 
    ends.first().map(dragEndEvent => {
      const x = dragEndEvent.x - dragStartEvent.x;
      const y = dragEndEvent.y - dragStartEvent.y;
      return {x, y};
    });

  let verticalMoveEnds = verticalMoveStarts.concatMap(lastMovesAtEnds);
  let horizontalMoveEnds = horizontalMoveStarts.concatMap(lastMovesAtEnds);

  return { 
    verticalMoves, verticalMoveEnds,
    horizontalMoves, horizontalMoveEnds
  };
}

This example is more complex, but once you have fully understood the previous one, it shouldn't be that hard to follow.

Here we have a moveStartsWithDirection stream, which is similar to the starts stream in that it also fires after a mouseDown or touchStart event, just with a slight delay. It gives back not only the starting coordinates but also the initial direction of the move.

The initial direction is calculated by sampling one of the first move events after the start event (if it exists). If you look at the code, you can see that it follows almost the same pattern as the earlier drags stream definition, starts.concatMap(() => moves). But make no mistake: in this case the result is not a series of move events. The moves.elementAt(3) part gives back a single move event, so each start event produces at most one move event. Because elementAt counts from zero, index 3 is strictly speaking the fourth move event rather than the third. We skip the first few move events because on a touch screen users might not move their finger that accurately, and even with a mouse you might unwittingly make a wrong move at first. This introduces a tiny lag, because nothing happens between the start event and the sampled move event, but it gives us more accurate direction data, and most probably you won't even notice it.

[Figure: swipe event explanation]

Of course, we only want to treat a movement sequence as a swipe if the sampled move event happens before an end event, but this gives us a problem. What if it does not? In that case moves.takeUntil(ends) completes before the requested index is reached, and elementAt signals an error, which would terminate the whole stream if left unhandled. That is why the moves.takeUntil(ends).elementAt(3) part is followed by an error handler (catch) that gives back an empty Observable. The result behaves as if nothing happened, which is what we want, because no swipe actually took place.
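The sampling rule can be modelled with a plain array, which also makes the zero-based indexing visible. The pickDirectionSample helper below is hypothetical and not part of RxJS. It takes the moves recorded between a start event and the next end event, and returns null in the situation where the stream version falls back to an empty Observable.

```javascript
// Hypothetical helper: models moves.takeUntil(ends).elementAt(index) followed by
// the catch fallback, using a plain array of the moves seen before the end event.
function pickDirectionSample(movesBeforeEnd, index) {
  // Arrays, like elementAt, count from zero: index 3 is the fourth item.
  if (index >= movesBeforeEnd.length) {
    return null; // the gesture ended too early; stands in for the empty Observable
  }
  return movesBeforeEnd[index];
}

const longGesture = [
  { x: 1, y: 0 }, { x: 2, y: 1 }, { x: 4, y: 1 }, { x: 7, y: 2 }, { x: 9, y: 2 }
];
const shortGesture = [{ x: 1, y: 0 }, { x: 2, y: 1 }];

console.log(pickDirectionSample(longGesture, 3));  // the fourth move: x 7, y 2
console.log(pickDirectionSample(shortGesture, 3)); // null
```

A gesture with only two moves never reaches index 3, so it yields no direction sample and therefore no swipe.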

Once we have the moveStartsWithDirection stream, we can quite simply split it into two separate streams, verticalMoveStarts and horizontalMoveStarts. This is done by applying the filter method with the appropriate condition for each case.
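The two filter conditions are complementary, so every move start ends up in exactly one of the two streams. A small sketch of the same decision as a single function follows. The name classifyInitialDirection is made up for this illustration, and ties count as horizontal because the horizontal filter uses >=.

```javascript
// Hypothetical helper: the same comparison the two filters perform,
// expressed as one function that names the winning direction.
function classifyInitialDirection(initialDeltaX, initialDeltaY) {
  return Math.abs(initialDeltaX) < Math.abs(initialDeltaY)
    ? 'vertical'
    : 'horizontal'; // includes the tie case, matching the >= in the horizontal filter
}

console.log(classifyInitialDirection(2, -10)); // 'vertical'
console.log(classifyInitialDirection(-8, 3));  // 'horizontal'
console.log(classifyInitialDirection(5, 5));   // 'horizontal' (tie)
```

Only the absolute values are compared, so a swipe up and a swipe down are both vertical; the sign of the later drag offsets tells them apart.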

Once we have these, the rest is the same, except that we no longer build the moves and ends streams from the starts stream but from our new verticalMoveStarts and horizontalMoveStarts streams. As a result, we now have two separate moves streams and two separate ends streams.

A more advanced example

With this knowledge you can build even more complicated things. What if you need not only vertical and horizontal swipes, but also simple click events and click-then-hold-then-drag events on the very same DOM item? And of course you want to clearly separate the (short) click event from the long click (hold) event and from the swipe events.

If you want to know more, check out this pen, where you can see an example of all the above-mentioned user interactions in one place: